robust video-on-demand streaming in peer-to-peer environments



ARTICLE IN PRESS

www.elsevier.com/locate/comcom

Computer Communications xxx (2007) xxx–xxx

Robust video-on-demand streaming in peer-to-peer environments

Tai T. Do a,*, Kien A. Hua a, Mounir A. Tantaoui b

a School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA

b School of Science and Engineering, Al Akhawayn University in Ifrane, B.P. 1890, Avenue Hassan II, Ifrane 53000, Morocco

Abstract

This paper presents a new video-on-demand streaming technique for peer-to-peer (P2P) environments. While a number of P2P live video streaming techniques have been proposed in the past, we argue that the two types of video streaming, live and on-demand, have some subtle differences. Most notably, a P2P video-on-demand streaming technique has to handle the asynchronous arrival of peers efficiently, and provide robust recovery under rather frequent peer failures. Our answer to the challenge is an application multicast tree, called P2VoD (Peer-To-peer for Video-On-Demand streaming). P2VoD proposes a number of ideas, including a caching scheme, a generation concept, and a distributed directory service. Through analytical analysis, we show that P2VoD is sound and efficient. We also compare P2VoD against a recently proposed P2Cast system by Guo et al. [Y. Guo, K. Suh, J.F. Kurose, D.F. Towsley, P2cast: peer-to-peer patching scheme for vod service, in: WWW, 2003, pp. 301–309] using both analytical analysis and simulation. The results show that P2VoD performs better than P2Cast in a number of important performance metrics.

Published by Elsevier B.V.

Keywords: Video-on-demand; Peer-to-peer; Failure recovery

0140-3664/$ - see front matter. Published by Elsevier B.V.

doi:10.1016/j.comcom.2007.08.024

* Corresponding author. Tel.: +1 3219450060. E-mail addresses: [email protected] (T.T. Do), [email protected] (K.A. Hua), [email protected] (M.A. Tantaoui).

Please cite this article in press as: T.T. Do et al., Robust video-on-demand streaming in peer-to-peer environments, Comput. Commun. (2007), doi:10.1016/j.comcom.2007.08.024

1. Introduction

We are interested in the problem of Video-on-Demand (VoD) streaming using a peer-to-peer (P2P) approach. A P2P approach can address several serious problems in existing VoD systems, including (1) the infeasibility of IP Multicast; (2) the network bottleneck at the video server; and (3) the high maintenance and deployment cost of dedicated overlay routers. Recently, there have been several research projects on live streaming using a P2P approach [1–4]. However, applying these techniques to VoD streaming is not a trivial task because of the following subtle differences between the two types of streaming. First, end-to-end delay is more important to live streaming than to VoD streaming. In live streaming, the shorter the end-to-end delay is, the more lively the stream is perceived to be by the users (defined as liveness in [1]). In VoD streaming, liveness is simply irrelevant because the video stream is pre-recorded. This fact implies that while a short tree rooted at the video server and spanning the peers is desirable in live streaming, it is not a necessary condition for VoD streaming. Second, a peer joining an ongoing live streaming session is only interested in the stream starting from his/her joining time, while in the VoD streaming case the whole video must be delivered to the new peer. As such, a good VoD system must find an efficient way to provide the initial missing part of the video to latecomers. Moreover, the correlations between various variables are different for the two types of streaming. For example, a peer will likely stop watching a VoD stream when its QoS degrades, but the peer may not do the same for a live stream because he/she does not have the option of watching it again in the future [5]. Therefore, if the QoS of the video stream degrades, many more peers are expected to leave the system in the VoD streaming case than in the live streaming case. This last observation stresses the importance of a robust failure recovery protocol in a VoD streaming system.

Based on the above discussion, we consider the following to be the dominant problems for P2P VoD streaming:


• Robust failure recovery: Failures are expected to occur more often in a P2P system. The failure recovery protocol should reconnect the abandoned peers smoothly and quickly, so that there is no loss of frames (jitter) and no long delay (glitch) during the client's playback. In addition, the effect of a failure should be localized so that only a small number of peers is affected.

• Quick join: The system must allow a new peer to join quickly. The shorter the joining time, the shorter the startup delay for a peer.

• Effective handling of clients' asynchronous requests: The joining requests of peers arrive at the system at different times. The system must deliver the full-length video to every peer without making the server a bottleneck.

• Small control overhead: Control messages, i.e., non-data messages, may be used in the system to facilitate system management. The control overhead must be kept small to make the system scalable.

We propose the P2VoD technique to address the above challenges. P2VoD only assumes IP unicast at the network layer. Each peer in P2VoD has a FIFO buffer (First-In First-Out buffer) that caches the most recent content of the video stream it receives, in order to cope with asynchronous peer arrival. Specifically, the peer buffer allows early peers to assist the server in responding to later peers; existing peers in P2VoD can forward the video stream to a new peer as long as they have enough outbound bandwidth and still hold the first block of the video file in the buffer. A big part of P2VoD involves various management tasks; for the sake of clarity, we group these management tasks into a protocol, called the control protocol. The control protocol employs two new ideas, a distributed directory service and a concept of generation. Peers arriving at the system within a threshold are grouped into a generation. Each generation has a directory, consisting of one or more peers, that keeps track of the current buffer content of each peer in that generation. Directory nodes in different generations coordinate with each other to form a distributed directory service. The distributed directory service is maintained by control messages. The control overhead, which is the aggregate of control messages, is kept small in P2VoD because the exchange of control messages is localized.

Admission to a generation is temporally constrained, so that a peer in a given generation only needs to look in the directories of two generations to find a suitable forwarding peer. The failure recovery and join algorithms exploit this temporal constraint on generations to ensure robust failure recovery and quick join.

In summary, the main contributions of our paper are:

• A caching scheme that utilizes peer resources to handle the asynchronous pattern of peer arrival times.

• A control protocol that uses the distributed directory service and generations to provide robust failure recovery and quick join, while requiring only a small control overhead.

• A detailed analysis showing that P2VoD is sound and efficient. A performance evaluation is also presented to compare P2VoD with a recently proposed system called P2Cast.

This paper is a significant extension of the preliminary results reported in our conference paper [6]. First, this paper introduces the distributed directory service to facilitate the quantification of control overhead in P2VoD. Second, we provide an analytical analysis of the P2VoD system, which is lacking in the conference version [6], to show that the proposed P2VoD technique is sound and efficient. Additionally, more details are provided to clearly explain critical algorithms in P2VoD such as the failure recovery and join algorithms. Finally, we include more recent work on P2P live streaming and VoD streaming in the related work section of the paper.

The rest of the paper is organized as follows. Section 2 presents the P2VoD technique. Analytical analysis is presented in Section 3, in which we analyze P2VoD and compare it against P2Cast. Section 4 evaluates P2VoD against P2Cast using a simulation-based study. Section 5 discusses the related work. Finally, we conclude the paper in Section 6.

2. The proposed P2VoD

We discuss the control protocol of P2VoD in Section 2.2, and methods to select a forwarding peer for a requesting peer in Section 2.3. Algorithms to perform a join request and recover disrupted peers from failures are presented in Sections 2.4 and 2.5, respectively.

2.1. Preliminary

In P2VoD, a streaming connection is assumed to have a constant bit-rate, which equals the playback rate of the video player. We define a retrieval block (R-block) as a data unit of the video, which is also equivalent to one unit of playback time. R-blocks of a video are numbered from 1 to the length of the video file according to their temporal position in the video [7]. There is a server, or a group of servers, hosting the video. A client who wants to watch the video is called a peer; the server is treated as a special peer. A peer p has the attributes (pid, t, b), where pid, t, and b are the unique identification number (such as an IP address), the arrival time, and the buffer size of p. A peer always wants to watch the video from the beginning. For bootstrapping purposes, every peer knows the IP address of the server. Peers are collaborative; while enforcing collaboration in peer-to-peer networks is an important research area, it is beyond the scope of this paper.

Fig. 1 shows a snapshot of two video sessions, S1 and S2, in the P2VoD system at time 60. The two sessions start at time units 0 and 39, respectively. Each peer in the figure is represented by a circle, where the peer's id is written inside the circle and the peer's arrival time is written next to the circle. Session S1 has three generations, while session S2 has two generations. Peers arriving at the system within a threshold T are grouped into a generation. In the example, T is equal to 10. A peer can forward the stream to another peer if the latter arrives at the system no later than T time units after the former. Every peer has the same buffer size, which is equal to T.

Fig. 1. A snapshot of the P2VoD system at time 30.

2.2. Control protocol

As briefly discussed in Section 1, the control protocol in P2VoD handles all the peer management tasks. In the subsequent sections, we first discuss in detail the two main components of the control protocol, namely the generation concept and the distributed directory service. We then present the data structures used by the various participants of P2VoD and the control messages that ensure the desired functionality of the control protocol.

2.2.1. A generation of peers

We observe that when a new peer p arrives at the system, an existing peer p′ can only serve p if the first R-block of the requested video is still in the buffer of p′. For ease of later reference, we define the eligibility test as follows.

Definition 1. Given two peers p and p′, p′ is eligible to serve p if and only if $0 \le (p.t - p'.t) \le p'.b$, in which case p′ is said to pass the eligibility test of p.

In other words, there is a time window dictating the usefulness of an existing peer to a newly arriving peer. Two peers whose arrival times are far apart do not need to be aware of each other's presence in the system, and any effort to coordinate them is wasted. This localized relationship between peers along the time axis inspires us to group peers together for better peer management.
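The eligibility test of Definition 1 can be written as a single predicate. The following is a minimal sketch for illustration only; the `Peer` class and the function name `is_eligible` are assumptions introduced here and are not part of P2VoD itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    pid: str   # unique identifier, e.g. an IP address
    t: float   # arrival time
    b: float   # buffer size, in units of playback time


def is_eligible(candidate: Peer, requester: Peer) -> bool:
    """Return True if `candidate` passes the eligibility test of `requester`."""
    # The candidate must have arrived no later than the requester, and
    # recently enough that the first R-block is still in its FIFO buffer.
    return 0 <= requester.t - candidate.t <= candidate.b
```

With a buffer of 10 time units, a peer that arrived at time 0 is eligible for a requester arriving at time 10, but not for one arriving at time 11.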

Specifically, we introduce the generation as a logical concept for grouping peers that arrive at the system close together in time.

Definition 2. A generation $G_i$ in a video session $SS$ is a collection of peers $p_j$:

$$G_i = \{\, p_j \mid T \cdot (i-1) \le (p_j.t - SS.sStartTime) < T \cdot i \,\}, \quad i \ge 0,$$

where T is the system-defined threshold, SS.sStartTime is the starting time of the session, and $G_0$ contains only the server.

The threshold T is related to the buffer sizes of peers. Without loss of generality, we assume that every peer uses the same buffer size to cache the incoming video stream, and that the buffer size is equal to T. Since every peer has the same buffer size, we omit the attribute b in a peer's representation from now on.
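Definition 2 assigns each peer to exactly one generation based on its arrival offset from the session start. The sketch below computes that index; the function name and argument names are illustrative assumptions, and the server (the sole member of $G_0$) is treated as a special case outside this function.

```python
import math


def generation_index(arrival_time: float, session_start: float, threshold: float) -> int:
    """Return i such that T*(i-1) <= arrival_time - session_start < T*i."""
    offset = arrival_time - session_start
    if offset < 0:
        raise ValueError("a peer cannot arrive before the session starts")
    # Offsets in [0, T) map to generation 1, [T, 2T) to generation 2, and so on.
    return math.floor(offset / threshold) + 1
```

For a session starting at time 39 with T = 10, a peer arriving at time 45 falls in generation 1 and a peer arriving at time 49 falls in generation 2.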

For every $i > 0$, $G_{i-1}$ is called the parent generation of $G_i$, and $G_i$ is called the children generation of $G_{i-1}$. The generation with the highest index number is called the youngest generation of the video session. A generation with a higher index number is called younger than a generation with a lower index number. A video session is a group of generations, which can be defined recursively as follows:

Definition 3. Let $SS$ be a video session in P2VoD:

$$SS = \{\, G_i \mid G_i \leftarrow G_{i-1},\ G_{i-1} \in SS \,\}, \quad i \ge 0.$$

$G_i \leftarrow G_{i-1}$ means that peers in $G_i$ receive the video stream from either $G_i$ or $G_{i-1}$ of the video session $SS$.

2.2.2. Distributed directory service

When a new peer is joining or a disrupted peer is reconnecting to the system, it is important for this requesting peer to identify quickly which existing peer can be the forwarding peer. P2VoD uses a distributed directory service to index existing peers for fast look-up of forwarding peers. Designing a distributed directory service for P2VoD presents two challenges: (1) what is the appropriate tradeoff between conflicting factors such as update cost, look-up cost, and storage cost, and (2) how do we distribute the workload evenly among peers?

The first challenge is an issue for every index service, not just P2VoD's. This tradeoff means an index service usually provides fast look-up at the expense of higher update and storage costs. For our specific problem, update cost means the non-data messages exchanged in the system, which we discuss in more detail in Section 2.2.4, and storage cost means the amount of memory used to store the control data structures shown in Section 2.2.3. To address this first challenge, the design principle we use is to focus on fast look-up. In Section 3, we analyze the cost of these conflicting factors.

The second challenge is specific to distributed indices. Distributing the workload evenly among peers in an efficient manner is a difficult task, if not an impossible one. We utilize a concept similar to the cluster head concept in ZIGZAG [1]. Each generation has a directory head, responsible for indexing peers in that generation. The server and the directory heads of all generations in the system form the backbone of the distributed directory service.

In the rest of this section, we describe the component of the directory service in each generation, followed by a discussion of how the directory service components of the generations are connected to each other to form the distributed directory service.

Each generation has a directory to index peers in that generation. The directory is made of one or more peers, called directory peers, which are also members of the corresponding generation. The index content is replicated in every directory peer. One directory peer is designated as the directory head of the directory. When the very first peer joins the generation, that peer is also made the directory head. Over time, the directory head may recruit other peers to act as directory peers. When a peer p joins the generation, it reports its attributes, namely (pid, t), to the directory head. The directory head periodically sends the updated index content to the other directory peers. The directory head role can be rotated among directory peers.

Theorem 4. Given a peer p in a generation $G_n$ ($n > 0$) in a session SS, all eligible peers of p in this video session are either in generation $G_{n-1}$ or $G_n$.

Proof. Without loss of generality, assume SS.sStartTime = 0. Since p is a member of $G_n$, by the definition of a generation, $T(n-1) \le p.t < Tn$. There are two cases:

• Case 1: If $n > 1$, a peer p′ passes the eligibility test of p if $0 \le (p.t - p'.t) \le T$; hence $T(n-1) - T \le p'.t < Tn$, or $T(n-2) \le p'.t < Tn$. By the definition of a generation, p′ is in either generation $G_{n-1}$ or $G_n$.

• Case 2: If $n = 1$, p is in generation $G_1$; hence $0 \le p.t < T$. A peer p′ passes the eligibility test of p if $0 \le (p.t - p'.t) \le T$, that is, $-T \le p'.t < T$. Therefore, p′ is either the server itself (the sole member of $G_0$) or a peer in $G_1$. □

From Theorem 4, when a peer in generation $G_n$ searches for a forwarding peer, it only needs to look into the directories of $G_n$ and $G_{n-1}$. For a generation $G_n$ ($n \ge 1$), its directory only needs to be connected to the directories of $G_{n+1}$ and $G_{n-1}$.
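Theorem 4 can be checked empirically on random arrival times. The following sketch assumes a session starting at time 0, a common buffer size T, and ordinary (non-server) peers only; all function names are illustrative and not taken from the paper.

```python
import math
import random


def generation_index(t, T):
    """Generation of a peer arriving at offset t >= 0 (session start is 0)."""
    return math.floor(t / T) + 1


def eligible(candidate_t, requester_t, T):
    """Eligibility test of Definition 1 with a common buffer size T."""
    return 0 <= requester_t - candidate_t <= T


def check_theorem4(arrivals, T):
    """Return True if every eligible peer of a peer in G_n lies in G_n or G_(n-1)."""
    for p_t in arrivals:
        n = generation_index(p_t, T)
        for q_t in arrivals:
            if eligible(q_t, p_t, T) and generation_index(q_t, T) not in (n - 1, n):
                return False
    return True


random.seed(0)
arrivals = [random.uniform(0, 100) for _ in range(200)]
print(check_theorem4(arrivals, T=10))
```

A check of this kind is not a proof, but it illustrates why a look-up can be restricted to two directories instead of scanning the whole session.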

2.2.3. Data structures

This section presents the main data structures used by the server and peers to support the control protocol.

The server has a Server Session Table (SST) and a Server Directory Table (SDT). The SST stores the session information and has the table schema (sid, sStartTime), where sid and sStartTime are the unique identification number and the starting time of a session. The SDT stores the directory service information of each session and has the table schema (gid, sid, pid, gStartTime), where sid is the session id, gid is the generation id in that session, pid is the id of the directory head of the generation's directory, and gStartTime is the starting time of the generation.

Each peer has a Peer Directory Table (PDT) to keep the information of the directory peers of the same generation. The PDT has the schema (pid, isDirectoryHead), where pid is the id of a directory peer, and the boolean variable isDirectoryHead is set to true if that directory peer is also the directory head.

If a peer is also a directory peer, it has two additional tables: the Peer Attribute Table (PAT) and the Peer Neighbor Table (PNT). The PAT stores the attributes of peers in the same generation and has the schema (pid, t), where pid and t are the id and the arrival time of a peer. The PNT stores the directory information of the neighbor generations, namely the parent generation and the children generation of the current generation, and has the schema (pid, isParent, isChildren), where pid is the id of the directory head of the neighbor generation, isParent is 1 if the neighbor generation is the parent, and isChildren is 1 if the neighbor generation is the children generation.

Similar to a peer's attributes, we also define attributes for a generation G and a video session SS for easy reference in later discussions. A generation G has the attributes (sid, gid, dirHead, gStartTime), where sid is the session id, gid is the generation id, dirHead is the directory head of the generation, and gStartTime is the starting time of the generation. Note that the gid of the generation G is often indicated in the subscript, as in $G_{gid}$. A video session has the attributes (sid, sStartTime), where sid is the session id and sStartTime is the starting time of the session.
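The table schemas above map naturally onto small record types. The sketch below is one possible in-memory representation, given as an assumption for illustration: the class names, the snake_case field names, and the `youngest_generation` helper are not defined in the paper, which specifies only the schemas.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SSTRow:
    """Server Session Table entry: (sid, sStartTime)."""
    sid: int
    s_start_time: float


@dataclass
class SDTRow:
    """Server Directory Table entry: (gid, sid, pid, gStartTime)."""
    gid: int
    sid: int
    pid: str             # id of the directory head of this generation
    g_start_time: float


@dataclass
class PDTRow:
    """Peer Directory Table entry: (pid, isDirectoryHead)."""
    pid: str
    is_directory_head: bool


@dataclass
class PATRow:
    """Peer Attribute Table entry: (pid, t)."""
    pid: str
    t: float


@dataclass
class PNTRow:
    """Peer Neighbor Table entry: (pid, isParent, isChildren)."""
    pid: str
    is_parent: bool
    is_children: bool


@dataclass
class Server:
    sst: List[SSTRow] = field(default_factory=list)
    sdt: List[SDTRow] = field(default_factory=list)

    def youngest_generation(self, sid: int) -> Optional[SDTRow]:
        """Return the SDT row with the highest gid in session `sid`, if any."""
        rows = [row for row in self.sdt if row.sid == sid]
        return max(rows, key=lambda row: row.gid) if rows else None
```

Keeping only the directory head's pid per generation in the SDT means the server stores one row per generation rather than one row per peer.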

2.2.4. Control messages

Control messages are exchanged to maintain proper functionality of the control protocol. In this section, we describe the exchange of control messages between peers in P2VoD when no peer is joining or leaving P2VoD. In Sections 2.4 and 2.5, we show how the control messages are used when a peer is joining or leaving P2VoD.

In a directory, the role of the directory head can be rotated among the current directory peers. The purpose of the rotation is to balance the workload among directory peers, since the directory head handles more communication with other entities than the other directory peers do. A simple rotation scheme is Round Robin, while more elaborate schemes are also possible. We do not discuss the selection of the directory head further, since the existing literature on load balancing already offers many selection methods. When a new directory head is selected, this directory head sends the server a UPD_SDT_MSG including (gid, sid, pid), which are the generation id, the session id, and the peer id of the new directory head. The new directory head also broadcasts a UPD_PDT_MSG to all peers in the generation, and sends a UPD_PNT_MSG to the directory heads of the neighbor generations found in the directory head's PNT. Upon receipt of the UPD_SDT_MSG, the server uses (gid, sid) in the message to update the pid of the matching entry in the SDT accordingly. When a peer receives the broadcast message UPD_PDT_MSG, it updates its PDT by setting the flag isDirectoryHead properly for the old and new directory heads. Upon receipt of the message UPD_PNT_MSG, the directory heads of the parent and children generations update their PNT accordingly.

Within a directory, the directory head also periodically sends the other directory peers the update messages SYN_PAT_MSG and SYN_PNT_MSG to keep their index contents in the PAT and PNT in sync. SYN_PAT_MSG contains the difference between the directory head's current PAT and its PAT at the last sync attempt. Upon receipt of the message, a directory peer updates its PAT accordingly. The process for updating the PNT is similar.
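The incremental synchronization carried by SYN_PAT_MSG can be sketched as a dictionary diff. The paper does not specify the message payload; the `upserts`/`removed` layout and both function names below are assumptions made only for illustration, with the PAT modeled as a mapping from pid to arrival time.

```python
def pat_diff(previous: dict, current: dict) -> dict:
    """Build a hypothetical SYN_PAT_MSG payload from two PAT snapshots."""
    # Entries that are new or whose arrival time changed since the last sync.
    upserts = {pid: t for pid, t in current.items() if previous.get(pid) != t}
    # Entries that were present at the last sync but are gone now.
    removed = [pid for pid in previous if pid not in current]
    return {"upserts": upserts, "removed": removed}


def apply_pat_diff(pat: dict, diff: dict) -> dict:
    """Return the PAT of a directory peer after applying a received diff."""
    updated = dict(pat)
    updated.update(diff["upserts"])
    for pid in diff["removed"]:
        updated.pop(pid, None)
    return updated
```

Sending only the difference keeps each sync message proportional to the number of joins and departures since the last sync, rather than to the size of the generation.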

When a peer no longer wants to serve as a directory peer, it can request the directory head to release it from the responsibility. The directory head in turn looks for a replacement among the non-directory peers in the generation. If a replacement is found, the directory head transfers the contents of the PAT and PNT to the new directory peer, and notifies the requesting directory peer that it can leave.

2.3. Selecting forwarding peer

If multiple peers pass the eligibility test of a peer p (Definition 1), how does P2VoD select a peer p′ among these eligible peers as the forwarding peer of p? This section discusses a number of methods for selecting p′. We refer to receiving peers as receivers, and to forwarding peers as forwarders.

We assume that the requesting peer has enough inbound bandwidth to receive the video stream. VoD service has a stringent bandwidth requirement but is relatively insensitive to delay [8]. Therefore, we prefer to select a peer with abundant unused bandwidth as the forwarding peer of the requesting peer. The stopping condition for the peer selection process can be one of the following: (1) an eligible peer with enough unused bandwidth has been found, or (2) all eligible peers have been considered, and the requesting peer either selects the best eligible peer or does not select any peer at all.

The naive method to select a forwarding peer is to let the requesting peer probe all eligible peers until the stopping condition is met. This probing technique has several drawbacks [9]: (1) when many large-scale services use their own probing techniques, this step becomes redundant and prevents new applications from leveraging information already gathered by other applications, and (2) implementing this probing technique is often quite subtle, because large-scale probing of end-hosts can raise intrusion alarms in edge networks, as the traffic can resemble a DDoS attack.

Several research efforts have attempted to provide a common measurement infrastructure for distributed applications. In this paper, we adopt the iPlane (Information Plane) technique in [9]. Unlike other existing methods, iPlane can provide pair-wise bandwidth measurements between end-hosts. iPlane provides a user interface which, given the ids of two end-hosts, outputs the estimated available bandwidth of the path between them. Depending on the stopping condition, iPlane can (1) compute the available bandwidth between the requesting peer and each eligible peer sequentially until a satisfactory path is found, or (2) compute the available bandwidth between the requesting peer and every eligible peer.

Algorithm 1. FindForwardPeer(C, p)

Input: The requesting peer p, the set of eligible peers C, and the stopping condition

Output: A peer p′ in C that meets the stopping condition, or NULL if no such peer exists

1: p sends iPlane the list of eligible peers C

2: iPlane returns p′ or NULL
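The two stopping conditions of the selection process can be sketched as a first-fit and a best-fit search. This sketch does not call iPlane; the `estimate_bandwidth` callable is a hypothetical stand-in for a pair-wise bandwidth estimator, and the function name, the `required` parameter, and the `first_fit` flag are assumptions introduced here.

```python
from typing import Callable, Hashable, Iterable, Optional


def find_forward_peer(
    requester: Hashable,
    eligible_peers: Iterable[Hashable],
    estimate_bandwidth: Callable[[Hashable, Hashable], float],
    required: float,
    first_fit: bool = True,
) -> Optional[Hashable]:
    """Pick a forwarding peer among the eligible peers, or return None."""
    best, best_bw = None, 0.0
    for candidate in eligible_peers:
        bw = estimate_bandwidth(candidate, requester)
        # Stopping condition (1): stop at the first peer with enough bandwidth.
        if first_fit and bw >= required:
            return candidate
        if bw > best_bw:
            best, best_bw = candidate, bw
    # Stopping condition (2): all peers considered; take the best one only
    # if it can actually sustain the stream.
    return best if best_bw >= required else None
```

First-fit needs fewer bandwidth estimates per join, while best-fit spends more estimates to pick the forwarder with the most spare capacity.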

2.4. Join algorithm

When a peer wishes to join P2VoD, the following procedure is carried out.

Algorithm 2. Joining Algorithm in P2VoD

Input: A peer p wants to join P2VoD

Output: Connect p to P2VoD or reject the joining request from p

{the server S carries out the following steps when p contacts S}

1: initialize connectFlag to FALSE
2: while there are more entries in SST AND connectFlag is FALSE do
3:   let Sm be the video session in the current entry in SST
4:   let Gn be the youngest generation of Sm (i.e., by looking up in SDT)
5:   if p.t < (Gn.gStartTime + T) then
6:     connectFlag ← JoinExistingGeneration(Gn.dirHead, p)
7:   else if (Gn.gStartTime + T) ≤ p.t < (Gn.gStartTime + 2T) then
8:     connectFlag ← CreateNewGeneration(Gn.dirHead, p)
9:   end if
10: end while
11: if connectFlag is FALSE then
12:   connectFlag ← CreateNewSession(p)
13: end if
14: if connectFlag is FALSE then
15:   the server rejects the joining request from p
16: end if

Theorem 5. A session cannot accept a new peer at time t if $gStartTime + 2T < t$, where gStartTime is the start time of the youngest generation of the session. Such a session is called a closed session. A session is called an open session if the above inequality does not hold for its youngest generation.

Proof. From Definition 2, if a peer p′ is in the youngest generation, its arrival time must satisfy the condition $p'.t \le gStartTime + T$. Since the buffer size of p′ is T, p′ passes the eligibility test of a new peer p if $p.t \le p'.t + T$. Therefore, if a new peer can find an eligible peer in the youngest generation, the following inequality must hold: $p.t \le p'.t + T \le gStartTime + 2T$. In other words, a new peer p cannot find any eligible peer in the youngest generation of the session if $gStartTime + 2T < p.t$. Since the youngest generation of a session contains the latest arriving peers of the session, no peer in the whole session can pass the eligibility test of p if $gStartTime + 2T < p.t$; this implies that the session cannot accept a new peer at time t if $gStartTime + 2T < t$. □
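The open/closed distinction of Theorem 5 reduces to a single comparison on the start time of the youngest generation. A minimal sketch, with an assumed function name and argument names:

```python
def is_open_session(youngest_g_start_time: float, threshold: float, now: float) -> bool:
    """A session is closed once gStartTime + 2T < now, and open otherwise."""
    return not (youngest_g_start_time + 2 * threshold < now)
```

For example, with T = 10 and a youngest generation that started at time 20, the session is still open at time 40 and closed at any later time.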

The server S first looks for an open session that the new peer p can possibly join (steps 2–10). If none of the existing sessions can accommodate p, S tries to create a new session for p to join, as shown in steps 11–13. Finally, if p is still not connected to P2VoD, S has to reject p, since neither the server nor the existing peers can accommodate p at the moment.

There are two cases in which peer p can join an existing session:

• Case 1: p joins the youngest generation of the session (step 6) according to Algorithm 3, if p satisfies the membership requirement of the generation (step 5).

• Case 2: p tries to join a newly created generation of the session (step 8) according to Algorithm 4, if it is still possible that a peer in the youngest generation of the session passes the eligibility test of p (step 7).
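The two cases above, together with the rejection path, can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `classify_join` and its string labels are hypothetical, and the only parts taken from Algorithm 2 are the comparisons against gStartTime + T and gStartTime + 2T in steps 5 and 7.

```python
def classify_join(arrival_time, g_start_time, T):
    """Classify a join request against the youngest generation of one session.

    Hypothetical helper mirroring the tests in steps 5 and 7 of Algorithm 2.
    """
    if arrival_time < g_start_time + T:
        return "join_existing_generation"   # Case 1 (step 6)
    if arrival_time < g_start_time + 2 * T:
        return "create_new_generation"      # Case 2 (step 8)
    return "no_join"                        # neither branch applies to this session


# Example with T = 5 and a youngest generation that started at time 10.
for t in (12, 17, 21):
    print(t, classify_join(t, g_start_time=10, T=5))
```

When every session yields `no_join`, the server falls through to CreateNewSession (steps 11–13).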

In Algorithm 3, the directory head dirHead of the generation Gn attempts to take p into this generation, given that p satisfies the membership requirement of Gn. The directory head first looks for existing peers in either Gn or the parent generation of Gn that can pass the eligibility test of p (steps 4–8). Among the eligible candidates, p looks for the forwarding peer by calling Algorithm 1. If p is successfully connected to P2VoD, dirHead updates its index table PAT to include the information about p (steps 12–14).

In Algorithm 4, the directory head attempts to create a new generation for the joining peer p, given that p cannot be in the same generation Gn as dirHead. First, dirHead checks whether any peer in Gn passes the eligibility test of p (steps 3–7). Among the eligible candidates, p looks for the forwarding peer by calling Algorithm 1. If p finds a forwarding peer, p becomes the directory head of the new generation, Gn+1, and proceeds to update its data structures and the server’s data structures (steps 11–15).

Algorithm 3. JoinExistingGeneration(dirHead, p)

Input: The directory head dirHead and joining peer p
Output: Return 1 (or 0) if p is (or is not) admitted to the current generation {the following steps are executed at the directory head dirHead}

1: initialize isConnectedFlag to 0
2: C ← ∅
3: let pDirHead be the directory head of the parent generation (i.e., using dirHead.PNT)
4: for each peer pi from dirHead.PAT and pDirHead.PAT do
5:   if p.t < pi.t + T then
6:     C ← C ∪ {pi}
7:   end if
8: end for
9: if C ≠ ∅ then
10:   isConnectedFlag ← FindForwardPeer(C, p)
11: end if
12: if isConnectedFlag is TRUE then
13:   dirHead updates its PAT to include p in the index
14: end if
15: return isConnectedFlag
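As a rough, runnable illustration of Algorithm 3, the sketch below collects the eligible candidates and then delegates to a forwarder-selection routine. Several names are assumptions made for the example only: the `Peer` and `DirHead` classes, the `free_slots` field, and the body of `find_forward_peer`, which is a simple stand-in for Algorithm 1 rather than a reproduction of it.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    name: str
    t: float                 # arrival time
    free_slots: int = 1      # hypothetical: unused outbound stream slots


@dataclass
class DirHead:
    pat: list = field(default_factory=list)   # PAT: peers indexed by this directory head


def find_forward_peer(candidates, p):
    """Hypothetical stand-in for Algorithm 1: take the first candidate with spare capacity."""
    for c in candidates:
        if c.free_slots > 0:
            c.free_slots -= 1
            return True
    return False


def join_existing_generation(dir_head, parent_dir_head, p, T):
    # Steps 2-8: collect peers that pass the eligibility test p.t < pi.t + T.
    candidates = [pi for pi in dir_head.pat + parent_dir_head.pat if p.t < pi.t + T]
    # Steps 9-11: try to attach p to one of the eligible candidates.
    connected = bool(candidates) and find_forward_peer(candidates, p)
    # Steps 12-14: on success, the directory head indexes p in its PAT.
    if connected:
        dir_head.pat.append(p)
    return connected


head = DirHead(pat=[Peer("p2", 2), Peer("p3", 4)])
parent = DirHead(pat=[Peer("p1", 0)])
print(join_existing_generation(head, parent, Peer("p4", 6), T=5))
```

Algorithm 4 follows the same pattern, except that only dirHead.PAT is scanned and, on success, p becomes the directory head of a new generation.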

Algorithm 4. CreateNewGeneration(dirHead, p)

Input: The directory head dirHead and joining peer p
Output: Create a new generation, whose first member is p. Return 1 (or 0) if the generation is (or is not) created {the following steps are executed by the directory head dirHead}

1: initialize isConnectedFlag to 0
2: C ← ∅
3: for each peer pi from dirHead.PAT do
4:   if p.t < pi.t + T then
5:     C ← C ∪ {pi}
6:   end if
7: end for
8: if C ≠ ∅ then
9:   isConnectedFlag ← FindForwardPeer(C, p)
10: end if
11: if isConnectedFlag is TRUE then
12:   p becomes the directory head of the new generation
13:   p and dirHead update their data structures (PNT for dirHead; PDT, PAT and PNT for p)
14:   S updates its server-side data structures (SST) based on messages sent by dirHead
15: end if
16: return isConnectedFlag


Algorithm 5. CreateNewSession(p)

Input: A joining peer p
Output: Create a new video session for p. Return 1 (or 0) if p is allowed (or rejected) to join P2VoD {the following steps are executed by the server S}

1: if server S is still capable of serving p then
2:   p is connected to S
3:   p becomes the directory head of the new generation G1 of the new session
4:   S updates its data structures, SST and SDT
5:   p updates its data structures, PDT, PAT and PNT
6:   return TRUE
7: end if
8: return FALSE

In Algorithm 5, when none of the existing sessions can accommodate p, the server attempts to create a new session and lets p join it. Steps 1–7 show the case when the available bandwidth at S is still enough to support p. In this case, p is connected to S and becomes the directory head of the first generation of the newly created session. S and p also update their data structures accordingly.

2.5. Failure recovery algorithm

In a peer-to-peer environment like P2VoD, failures are expected to happen often due to the unpredictable behavior of users as well as congested traffic in the underlying network. P2VoD uses a two-phase failure recovery protocol. In the first phase, a peer detects a failure between itself and its forwarding peer. The abandoned peer then enters the second phase to recover from the failure by finding a new forwarding peer. We discuss these phases in the following sections.

2.5.1. Detecting failures

Failures in P2VoD are categorized into two kinds: graceful and unexpected. Graceful failure refers to the case when a peer intentionally chooses to leave the system. On the other hand, unexpected failure accounts for network failure and peer crash (e.g., buffer overflow). These two kinds of failures require different detection mechanisms.

• Graceful failures: When a peer decides to leave the system, it informs its receivers by sending a LEAVE message. These abandoned peers then invoke the failure recovery algorithm in Section 2.5.2 to find a new forwarder. When a failure happens, P2VoD also needs to update its control data structures depending on the role of the leaving peer in its generation:

Case 1: If the leaving peer is a non-directory peer, the leaving peer informs the directory head of its intention, and the directory head updates its PAT accordingly.

Case 2: If the leaving peer is a directory peer, but not a directory head, the directory head has to (1) update its PAT and PDT, and (2) promote a peer to directory peer using a process similar to the one shown in Section 2.2.4 for a directory peer that wants to become a non-directory peer.

Case 3: If the leaving peer is a directory head, the leaving directory head selects a directory peer to become the new directory head. In addition to the steps in Case 2, the new directory head informs the server and the neighbor generations of the change.

• Unexpected failures: Since failures in this category happen without any explicit warning, peers in P2VoD are required to monitor their incoming traffic constantly in order to detect failures. Three messages are used: the TEST_ERROR message, the NETWORK_RECOVER message, and the WAIT message. If the quality of the received video stream falls below a threshold, a peer sends a TEST_ERROR message to its forwarder. If a peer does not receive any response from its forwarder for three consecutive TEST_ERROR messages, the connection is deemed to have failed. If the forwarder does receive the TEST_ERROR message, there are two possibilities. If the forwarder has already used up all of its available outbound bandwidth, it sends a NETWORK_RECOVER message suggesting that the receiver initiate the failure recovery procedure in Section 2.5.2 to connect to a better forwarder. If instead the forwarder is itself involved in another recovery process, it sends a WAIT message asking the receiver to wait until that recovery is finished. Note that when the connection is deemed to have failed, the receiver informs the directory head, or one of the directory peers if the directory head is the forwarder, so that the control data structures are updated as in the case of graceful failures.
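The detection logic for unexpected failures can be sketched from the receiver’s point of view. This is a minimal illustration, not the authors’ implementation: `probe_forwarder`, the `send_test_error` callable, and the returned action labels are hypothetical names; only the three message types and the limit of three unanswered TEST_ERROR messages come from the text.

```python
def probe_forwarder(send_test_error, max_attempts=3):
    """Return the receiver's next action after probing its forwarder.

    `send_test_error` is a hypothetical callable that sends one TEST_ERROR
    message and returns the reply ("NETWORK_RECOVER" or "WAIT"), or None
    if no response arrives.
    """
    for _ in range(max_attempts):
        reply = send_test_error()
        if reply == "NETWORK_RECOVER":
            return "recover"    # forwarder is saturated: look for a better forwarder
        if reply == "WAIT":
            return "wait"       # forwarder is itself in the middle of a recovery
    return "failure"            # three consecutive probes went unanswered


# A forwarder that answers only the third probe.
replies = iter([None, None, "NETWORK_RECOVER"])
print(probe_forwarder(lambda: next(replies)))
```

On `"failure"`, the receiver would both start the recovery procedure of Section 2.5.2 and notify the directory head, as described above.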

2.5.2. Recovering from failures

When there is a failure at a peer p of a generation G, the whole sub-tree under p is affected. p initiates the recovery process, while the rest of the sub-tree is informed to wait through the use of the WAIT message. If p succeeds, the whole sub-tree is recovered. If p fails to find a new parent, then p is rejected, and the receivers of p invoke the recovery process to try to recover their own sub-trees. The process repeats recursively through the whole sub-tree. A disrupted peer p uses the following steps to find a new forwarder for itself.

• Step 1: Using its PDT, p contacts the directory head of its generation, dirHead. The directory head attempts to reconnect p by running a procedure similar to Algorithm 3. If p is reconnected, the recovery stops; otherwise p proceeds to Step 2.


• Step 2: p contacts the server S. S runs a procedure similar to Algorithm 5 to reconnect p. If p is reconnected, the recovery stops; otherwise p is ejected from the system. Note that the special sessions created in this step, unlike regular sessions, do not allow newly arriving peers to join.
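The recursive behavior of the recovery process over a sub-tree can be sketched as follows. The names are assumptions for illustration: `recover`, the `children` mapping, and the `try_reconnect` callable, which stands in for Steps 1 and 2 combined (directory head first, then the server).

```python
def recover(peer, children, try_reconnect):
    """Recover the sub-tree rooted at `peer`; return the set of ejected peers.

    `children` maps a peer to the list of its receivers. `try_reconnect(peer)`
    is a hypothetical callable that returns True if the peer finds a new
    forwarder through its directory head or, failing that, the server.
    """
    if try_reconnect(peer):
        return set()                       # the whole sub-tree stays attached to peer
    ejected = {peer}
    for child in children.get(peer, []):   # each receiver now recovers its own sub-tree
        ejected |= recover(child, children, try_reconnect)
    return ejected


# Peer "a" cannot reconnect; its receiver "b" can, while "c" cannot.
tree = {"a": ["b", "c"], "b": ["d"]}
print(sorted(recover("a", tree, lambda p: p == "b")))   # ['a', 'c']
```

In this example, "d" is never asked to reconnect because its forwarder "b" recovered, which matches the rule that a successful recovery at the root of a sub-tree restores the whole sub-tree.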

3. Analytical analysis

3.1. Network capacity amplification

The network bandwidth in a peer-to-peer VoD system available to serve new peers includes both that of the server and that of other peers. We say that the network capacity of the traditional server–client system has been amplified in the peer-to-peer system. To facilitate the comparison in terms of capacity amplification between P2VoD and P2Cast [8], we assume that both systems have the same system threshold T. In other words, peers in P2VoD and P2Cast use the same buffer size T to cache the video stream.

Theorem 6. A new peer p in P2VoD can select its forwarding peer from a set that consists of the server and all peers whose arrival times are no more than T time units earlier than p’s.

Proof. The theorem can be proven by tracing the join algorithm in Section 2.4. According to the join algorithm, there are two cases when p joins P2VoD:

• Case 1: If p joins a new session, the server will be the forwarding peer for p.

• Case 2: If p joins an existing session, p will join either the currently youngest generation or a new generation. As already shown in Theorem 4, through the directory head of the currently youngest generation, all peers that pass the eligibility test of p are available for p to select its forwarding peer. In other words, all peers whose arrival times are no more than T time units earlier than p’s are participants in the forwarder selection for p. □

Fig. 2. A snapshot of P2VoD at time 9.

Fig. 3. A snapshot of P2Cast at time 9.

From Theorem 6, we conclude that, given a threshold T, P2VoD maximizes the capacity amplification of the system by fully utilizing all eligible peers.

In contrast, there exist instances in P2Cast in which a new peer cannot select an existing peer as its forwarder, even though the existing peer’s arrival time is no more than T time units earlier than the new peer’s. We demonstrate this statement about P2Cast with the example below.

Assume T = 5, and there are five peers in the two systems when we capture the systems’ snapshots at time 9. The arrival times of the five peers are as follows: p1.t = 0, p2.t = 2, p3.t = 4, p4.t = 6, p5.t = 8. In Fig. 2, the server only needs to forward the stream to one peer in P2VoD, while in Fig. 3 the server has to forward the base stream to two peers. When p4 arrives at P2VoD at time 6, peers p2 and p3 still have the first video segment in their buffers, and p2 is picked to forward the stream to p4. On the other hand, when p4 arrives at P2Cast at time 6, p4 cannot join the same video session as p1, p2, and p3, because p4.t − p1.t = 6 > T. As a result, the server has to stream the video to p4. This example shows that both P2VoD and P2Cast utilize peers’ network capacity to amplify the traditional server–client system, but P2VoD maximizes peers’ potential while P2Cast does not.
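The arithmetic of this example can be checked with a short script. The P2VoD test uses the eligibility condition p.t < pi.t + T from Algorithms 3 and 4. The P2Cast rule encoded here (a peer may join a session only if it arrives within T of the session’s first arrival) is an assumption inferred from the example, and both function names are hypothetical.

```python
T = 5
arrivals = {"p1": 0, "p2": 2, "p3": 4, "p4": 6, "p5": 8}


def p2vod_eligible(new_peer):
    """Earlier peers that still hold the first video segment when new_peer arrives."""
    t = arrivals[new_peer]
    return [q for q, tq in arrivals.items() if tq < t and t < tq + T]


def p2cast_same_session(new_peer, session_start):
    """Assumed P2Cast rule: join only within T of the session's first arrival."""
    return arrivals[new_peer] - session_start <= T


print(p2vod_eligible("p4"))            # ['p2', 'p3']
print(p2cast_same_session("p4", 0))    # False
```

Under these assumptions, p4 has two candidate forwarders in P2VoD but must be served by the server in P2Cast.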

3.2. Join overhead

The join overhead indicates the delay before a new peer p can join P2VoD. The main contributing factor to the join overhead is the time it takes for p to set up a communication session with an existing peer. To measure the join overhead, we count the number of peers p has to contact during the joining process.

We assume that if p just needs to estimate the network delay between itself and an existing peer, p can query a centralized service as shown in Section 2.3. Since p does not need to contact the existing peer directly, the network delay measurement does not constitute a peer contact.

From the join algorithm in Section 2.4, there are two cases:

• Case 1: If all sessions are closed, the server proceeds to create a new session and makes p the first member of the first generation of the new session. In this case, the number of peers p has to contact is 0, excluding the server.


• Case 2: If there are some open sessions, p contacts each open session sequentially until p is connected to P2VoD. For each open session, p only needs to communicate with the directory head of the youngest generation of that session; the number of peers p contacts in an open session is 1. The next question is: what is the maximum number of open sessions p has to contact? To answer this question, we assume that (1) every peer in P2VoD can forward the stream to at least one other peer, and (2) new peers join the system sequentially. The first assumption means that even though a new peer consumes some network bandwidth of the session, it also contributes enough network bandwidth to at least offset the consumption. In other words, the joining of a new peer to a video session does not decrease the overall network capacity of the session. The second assumption ensures that the session is never temporarily overwhelmed by concurrent join requests.

Lemma 7. The start times of two generations in one session differ by at least T time units.

Proof. The result of this lemma comes directly from the definition of a generation as shown in Definition 2. □

Using Lemma 7 and the above assumptions, we have:

Lemma 8. The start times of the youngest generations of any two sessions differ by more than T time units.

Proof. A new session is created when none of the existing sessions can accommodate a new peer p. This means either (1) none of the existing sessions has enough unused bandwidth to serve the new peer, or (2) there is no peer in any session that passes the eligibility test of the new peer, i.e., for every peer p′ in the system, p′.t + T < p.t. From the above assumptions, (1) cannot happen, so (2) is the case. For an existing session we have gStartTime_Current ≤ p′.t, where gStartTime_Current is the start time of the youngest generation and p′ is a peer in that generation. Similarly, for the new session we have gStartTime_New ≤ p.t, where gStartTime_New is the start time of the youngest generation of the new session and p is a peer in that generation. It follows that gStartTime_Current + T < gStartTime_New. We conclude that the start times of the youngest generations of any two sessions differ by more than T time units. □

Theorem 9. The maximum number of open sessions in P2VoD at any time is 2.

Proof. Consider an arbitrary time t. If there exist open sessions in P2VoD, the gStartTime of the youngest generation of each of these sessions must satisfy the inequality t − 2T ≤ gStartTime ≤ t. From Lemma 8, there can be no more than two sessions whose youngest-generation start times satisfy the above inequality. Therefore, the maximum number of open sessions in P2VoD at any time is 2. □

From Theorem 9, we conclude that the maximum number of sessions a new peer p has to contact during a join is 2, so the number of peers p has to contact during a join is at most 2.
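The counting step behind Theorem 9 can be written out explicitly. Suppose, for contradiction, that three sessions are open at time t, and let g_1 < g_2 < g_3 denote the start times of their youngest generations (these labels are ours, introduced only for this sketch; the `align*` environment assumes the amsmath package).

```latex
% Assumed labels: g_1 < g_2 < g_3 are the youngest-generation start times
% of three sessions that are all open at time t.
\begin{align*}
  g_2 - g_1 > T,\quad g_3 - g_2 > T
    &\;\Longrightarrow\; g_3 - g_1 > 2T   && \text{(Lemma 8)}\\
  t - 2T \le g_1,\quad g_3 \le t
    &\;\Longrightarrow\; g_3 - g_1 \le 2T && \text{(all three sessions open)}
\end{align*}
```

The two conclusions contradict each other, so at most two sessions can be open at the same time.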

3.3. Failure recovery overhead

Similar to the join overhead, the failure recovery overhead measures the delay before a disconnected peer reconnects to P2VoD after a failure. To measure the failure recovery overhead, we again use the number of peers the disconnected peer has to contact during a recovery.

From the failure recovery algorithm in Section 2.5, a disconnected peer only needs to contact the directory head of its generation to look for a new forwarding peer. If the attempt is unsuccessful, the server streams the video to the disconnected member using a special recovery channel. The number of peers the disconnected peer has to contact during a recovery is therefore 1.

3.4. Control overhead

The control overhead accounts for non-data messages exchanged between peers in P2VoD to maintain the control protocol described in Section 2.2. From Sections 2.2.4, 2.4, and 2.5, we distinguish two kinds of control messages: the first kind maintains the integrity of the control protocol, and the second kind improves the robustness of the control protocol.

The first kind of control message comes from the join and recovery procedures (Sections 2.4 and 2.5). For each join or failure recovery, the peers that have to update their control data structures are at most the server, the directory head, and the directory heads of the neighbor generations.

The second kind of control message comes from the communication between the directory head and the other directory peers in a generation (Section 2.2.4). In each event, the peers a directory head has to contact are at most the server, the directory peers in the same generation, and the directory heads of the neighbor generations.

We conclude that the control overhead in P2VoD is small, since for each event the number of control messages exchanged is small.

4. Simulation study

The analysis in Section 3 has shown that the design of P2VoD is sound and efficient. Specifically, P2VoD maximizes peers’ potential, while P2Cast does not. Moreover, under certain assumptions, we show that the join and failure recovery overheads are small. Additionally, P2VoD achieves these desirable characteristics with small control overhead.

Fig. 4. Rejection probability for different values of T (x-axis: normalized workload, 0 to 2500; y-axis: rejection probability, 0 to 0.6; curves for T = 0.1, 0.2, 0.3, and 0.4).

In this section, we use simulation to evaluate the performance of P2VoD under more realistic assumptions. In particular, we remove the assumptions in Section 3.2 to study the join and failure recovery overheads. We also do not use the iPlane service to look for forwarding peers as shown in Section 2.3. This means that we use the naive probing method to look for a suitable forwarder, and this probing is counted as a peer contact in the join and failure overheads. We do this to give a fair comparison with P2Cast, since P2Cast does not have a similar distributed directory service.

Following our analysis in Section 3, we use the following performance criteria in our simulation:

• Network Capacity Amplification: can be measured with two parameters, the Server Stress and the Rejection Probability. The Server Stress measures the number of direct streams currently at the server. The Rejection Probability is the probability that a peer tries to join the system but cannot get the service. These two parameters are useful in two different situations. When the server’s network capacity is not yet saturated, the lower the server stress value is, the better the system utilizes peers’ network capacity. When the server’s network capacity is saturated, the lower the rejection probability is, the better the system utilizes peers’ network capacity. In our simulation setting, the demand from peers far outweighs the server’s network capacity; hence, we only use Rejection Probability as the measurement of network capacity amplification.

• Join Overhead: is measured using the number of peer contacts during a join.

• Failure Overhead: is measured using the number of peer contacts during a failure recovery.
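As a small illustration of how these criteria could be computed from simulation output, the sketch below summarizes a log of join attempts. The log format, the function name `summarize_joins`, and the choice to average contacts over all attempts (accepted or not) are assumptions made for the example; the paper does not specify them.

```python
def summarize_joins(join_log):
    """Summarize a list of (accepted, contacts) tuples, one per join attempt.

    Returns (rejection probability, average number of peer contacts per attempt).
    Averaging over all attempts is an assumption of this sketch.
    """
    attempts = len(join_log)
    rejected = sum(1 for accepted, _ in join_log if not accepted)
    total_contacts = sum(contacts for _, contacts in join_log)
    return rejected / attempts, total_contacts / attempts


# Four hypothetical join attempts, one of which is rejected.
log = [(True, 2), (True, 1), (False, 4), (True, 1)]
print(summarize_joins(log))   # (0.25, 2.0)
```

The failure overhead can be summarized the same way from a log of recovery attempts.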

In our simulation, the underlying network topology is created using GT-ITM [10]. The whole network consists of one transit network (with 4 nodes) and 12 stub domains (with 96 nodes in total). We assume that each node in this network represents a local area network that can host an unlimited number of peers and has enough bandwidth to support media streaming. Routing between two nodes is determined by the shortest path algorithm. We assign bandwidth capacity to links in the network as follows: a backbone link (at least one end point of the link is a transit node) can support 25 concurrent media streams, and an edge link (any link in the network that is not a backbone link) can support seven concurrent media streams. The video server is placed at a transit node. Peer arrivals follow the Poisson distribution, and each arriving peer is randomly placed at one of the network nodes. There is only one video available at the server, with a length of 2 h. Each simulation run also lasts for 2 h. The total number of peers arriving at the system during a simulation run is determined as (Arrival Rate * Length of the video), called the normalized workload.
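The workload model can be sketched with the Python standard library. This is an illustration only: the arrival rate of 0.25 peers per second, the seeds, and the function name `poisson_arrivals` are hypothetical, and the node count of 100 simply adds the 4 transit nodes and 96 stub nodes described above.

```python
import random


def poisson_arrivals(rate_per_s, duration_s, seed=0):
    """Arrival times of a Poisson process, built from exponential inter-arrival gaps."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_per_s)
        if t >= duration_s:
            return times
        times.append(t)


VIDEO_LENGTH_S = 2 * 3600                      # 2 h video, as in the simulation setup
rate = 0.25                                    # hypothetical arrival rate (peers per second)
normalized_workload = rate * VIDEO_LENGTH_S    # expected number of arrivals: 1800.0

arrivals = poisson_arrivals(rate, VIDEO_LENGTH_S)
placement_rng = random.Random(1)
nodes = [placement_rng.randrange(100) for _ in arrivals]   # one of 100 network nodes per peer

print(normalized_workload, len(arrivals))
```

The realized number of arrivals in a run fluctuates around the normalized workload, which is an expected value.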

We first study the sensitivity of P2VoD to the threshold T with respect to network capacity amplification. Fig. 4 shows the rejection probability as a function of T and the normalized workload when there are no peer failures. The value of T is varied from 0.1 to 0.4 of the length of the video. With a fixed value of T, the rejection probability increases with the normalized workload. This is due to the limited available bandwidth of the underlying network. When there is already a large number of peers in the system, the probability that a new peer can find a path with sufficient bandwidth to get the stream from one of the eligible peers is smaller. A more interesting result here is that the rejection probability can be substantially reduced by increasing T. When T is doubled (from 0.1 to 0.2, and from 0.2 to 0.4), the rejection probability is reduced by half. The latter result can be explained as follows. When T increases, a peer stays eligible to serve newly arriving peers for longer. As a result, a new peer can find more eligible peers from which to get the video stream, which increases the chance that the new peer finds a forwarding peer with sufficient unused bandwidth. Since the rejection probability is a good indicator of how scalable a system is, to some extent T can be used to tune the system to a desired scalability.

We also compare P2VoD against two variants of P2Cast [8]: the BF-delay-appro and BF protocols. These two variants of P2Cast are chosen because, as reported in that paper, BF has the best performance in server stress, while BF-delay-appro achieves the lowest rejection probability. We report on the case when both P2VoD and P2Cast use T = 0.2.

In Fig. 5, P2VoD outperforms BF-delay-appro and BF in terms of rejection probability, especially when the workload increases. One possible explanation is that the P2Cast variants require each peer to obtain two streams, a patch and a base stream, when joining the system, while P2VoD only needs to allocate a single stream for a newly arriving peer. When the workload increases (i.e., the arrival rate increases), a significant amount of network bandwidth allocated for patch streams in the P2Cast system is not released when new peers join, resulting in a higher rejection probability. The second reason is that, as shown in Section 3.1, there are more eligible peers for a new peer to choose from in P2VoD than in P2Cast. As a result, a new peer in P2VoD has a better chance of finding a forwarding peer with sufficient unused bandwidth.

Fig. 5. P2VoD vs. P2Cast: peer rejection probability (x-axis: normalized workload, 0 to 2500; y-axis: rejection probability, 0 to 0.35; curves for P2VoD, BF, and BF-delay-appro).

Fig. 6 illustrates that, when joining the system, a new peer in P2VoD on average has to contact fewer existing peers than in P2Cast. Between the two variants of P2Cast, the reason that a new peer in BF-delay-appro contacts fewer existing peers than in BF is straightforward. In BF-delay-appro, a new peer only needs to find an existing client with enough bandwidth to support the stream, while in BF a new peer has to look for the “fattest pipe” [8] from which to get the stream. In spite of that, our P2VoD still performs better than BF-delay-appro for several reasons. Firstly, a new peer in P2VoD only needs to contact existing peers in the youngest generation and its parent generation, while a new peer in BF-delay-appro may have to contact every existing peer in the video session it wants to join. Secondly, a new peer in P2VoD only looks for one existing peer from which to get the stream, while a new peer in BF-delay-appro needs to look for two existing peers to get two separate streams, the patch and the base stream.

To simulate peer failures, we first allow 3516 peers to join each system, which is equivalent to having peers arrive at the systems at a rate of 0.4 per second during a period of 2 h. After the joining process is finished, we assign a failure probability to each peer, and then observe how affected peers recover from failures. Fig. 7 shows that the number of peers an affected peer has to contact during a recovery in P2VoD equals only one half of that in BF and two thirds of that in BF-delay-appro. The better performance of P2VoD can be explained similarly to the previous case of join overhead. When the failure probability increases moderately (from 0.05 to 0.2), the number of peers contacted during a recovery decreases. This is expected because when more peers leave the system, network bandwidth is also released; hence, it becomes easier for an affected peer to find a new forwarder from which to get the stream.

Fig. 6. P2VoD vs. P2Cast: join overhead (x-axis: normalized workload, 0 to 3000; y-axis: average number of nodes contacted for a join, 0 to 70; curves for P2VoD, BF, and BF-delay-appro).

5. Related works

Providing VoD on the Internet has been a research topic for some time, and there is a large body of research papers on this topic. In this section, we do not intend to give an exhaustive review of the literature; we only give readers a brief overview of the research in this field. Works considered closely related to our paper are discussed at greater length.

IP Multicast, as an extension of the Internet, is designed for one-to-many and many-to-many communications at the network layer, and as such can potentially solve the network bottleneck problem at the video server. Numerous VoD techniques assuming IP Multicast have been proposed. They can be categorized as periodic broadcast and batching. Periodic broadcast is proposed for delivering popular videos, and examples of this category include staggered broadcasting, Skyscraper, Client-Centric Approach, Stripping, Cautious Harmonic, and Pagoda [11]. A limitation of periodic broadcast is its non-zero startup delay. For less popular videos, it is more efficient to multicast the requested video on demand instead of broadcasting it repeatedly [11]. However, in IP Multicast clients are assumed to be synchronous, while in reality client requests arrive at different times, indicating their asynchronous nature. To solve this problem, batching groups together multiple requests that arrive within a short period of time and serves them using the same multicast session. One problem with batching is the long startup delay for early clients in the batch. Patching was then proposed to address the startup delay problem in order to provide videos truly on demand [11]. Patching allows a late client to join an ongoing multicast session and still receive the video in full. The client gets the initial missing part of the video (a patch) from the server through an IP unicast connection. A problem with patching is that the server can be overwhelmed with patching requests from clients. Besides the aforementioned limitations of IP Multicast-based techniques, the deployment of IP Multicast itself on the Internet has been very slow due to several fundamental issues [12,13].

Fig. 7. P2VoD vs. P2Cast: failure overhead (x-axis: failure probability, 0 to 0.25; y-axis: average number of clients contacted for a recovery, 0 to 70; curves for P2VoD, BF, and BF-delay-appro).

To cope with the infeasibility of IP Multicast, researchers have proposed multicast services at the application layer, assuming only IP unicast at the network layer. Proposals for application layer multicast can be classified into the infrastructure-based approach and the peer-to-peer approach.

In the infrastructure-based approach [14–17], a set of dedicated overlay routers act as software routers with multicast functionality. The content is then transmitted from the source to a set of receivers on a multicast tree consisting of overlay routers. A new receiver can join an existing media stream by connecting to an overlay router that belongs to a delivery path to an existing receiver. This approach can alleviate the bottleneck problem at the source because a client can get service not only from the source but also from overlay routers. A drawback of this approach is the deployment and maintenance cost of those overlay routers.

To avoid the cost of deploying and maintaining additional hardware, the peer-to-peer approach does not use overlay routers. The multicast tree is built containing only the source and the receivers. Existing techniques using the P2P approach for media streaming can be categorized as live streaming and VoD streaming. Live streaming using P2P has been studied quite intensively in many papers [1,3,2,12,4]. However, these proposals are not readily applicable to P2P VoD streaming for the reasons listed in the introduction of the paper. Since our P2VoD belongs to the latter category, VoD streaming, we discuss in more detail the existing papers in this category [7,8,4,18,19].

To the best of our knowledge, Chaining [7] is the first paper applying the P2P concept to VoD streaming. Each client in Chaining has a fixed-size buffer to cache the most recent content of the video stream it receives. A new client can receive the video stream from an earlier client as long as the first data block of the video is still in the buffer of the latter. Chaining does not provide a recovery protocol in case failures happen.

Using an interval caching scheme similar to that of Chaining, the data delivery path in DirectStream [19] forms a tree instead of a chain, so that the outbound-bandwidth constraints of clients can be considered. The failure recovery protocol in DirectStream is similar to that of P2Cast [8], and its limitations will be discussed shortly when we look at that scheme. Another drawback of DirectStream is its use of a directory server to look for potential candidates when a new client joins the system. This centralized approach presents a single point of failure.

CoopNet [4] employs a multi-description coding method for media content, and is intended to support both live and VoD streaming. In CoopNet, the media signal is encoded into several streams, or so-called descriptions, such that every subset of them is decodable. CoopNet then builds multiple distribution trees spanning the source and the receivers, and each tree is used to transmit a single description. Therefore, a peer failure only causes the affected peers to lose some descriptions. This scheme works fine for the live streaming case. For on-demand streaming, when the server receives a request and is already overloaded, the request is directed to some client that has previously downloaded the requested stream (in full or in part) and is willing to participate in CoopNet. There are several drawbacks with this method. It requires a huge buffer at the client to store the entire video. If only part of the video is found at the serving client, the requesting client has to look for the missing part from other clients, which increases the client’s startup delay. Another problem is that CoopNet puts heavy control overhead on the source because the source is required to maintain full knowledge of all distribution trees.
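To illustrate why a peer failure costs the affected peers only some descriptions in a multi-tree design, the following sketch counts how many descriptions a peer still receives after one peer fails. It is an illustration under stated assumptions, not CoopNet's implementation: each tree is modeled as a hypothetical child-to-parent dictionary that spans all peers, the source is labeled `"S"`, and the function name `descriptions_received` is invented here.

```python
def descriptions_received(trees, peer, failed, source="S"):
    """Count the descriptions `peer` still receives when `failed` is down.

    `trees` is a list of child -> parent dictionaries, one per description
    (an assumed representation). A description is lost if the failed peer
    lies on the path from `peer` up to the source in that tree.
    Assumes `peer` itself is not the failed peer.
    """
    count = 0
    for parent in trees:
        node = peer
        reachable = True
        while node != source:
            node = parent[node]
            if node == failed:
                reachable = False
                break
        if reachable:
            count += 1
    return count


# Two descriptions, two trees with different interior peers.
trees = [
    {"A": "S", "B": "A", "C": "A"},  # tree for description 0
    {"B": "S", "A": "B", "C": "B"},  # tree for description 1
]
```

In this example, if peer `A` fails, peers `B` and `C` each lose description 0 but continue to receive description 1 through the second tree, so playback degrades rather than stops.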

Xu et al. [18] consider media streaming while taking peers’ heterogeneity in bandwidth capacity into consideration. That is, because of limited outbound bandwidth, multiple supplying peers are needed to provide service for one receiving peer. They first look at the problem of media data assignment, determining which data segments of the video each supplying peer needs to send to the receiving peer. Their solution is optimal in terms of minimizing the buffer delay. The paper also proposes a technique to amplify the capacity of the streaming system. While [18] assumes a many-to-one relationship between supplying peers and a requesting peer, we assume that the relationship is one-to-one in our P2VoD system. In other words, a supplying peer in P2VoD can provide the entire video stream to a requesting peer without the help of additional supplying peers.

Perhaps the closest work to our paper is P2Cast [8]. P2Cast is an extension of patching; the difference is that a late client can get a patch not only from the server (as in the original patching) but also from other clients. Our P2VoD differs from P2Cast in a couple of ways. First, clients in P2VoD always cache the most recent content of the video stream, while clients in P2Cast only cache the initial part of the video. Due to the caching schemes used, at any time only one stream from an early client is needed to serve a late client in P2VoD, while two such streams, a patching stream and a base stream, are used at the beginning to deliver the video to a late client in P2Cast. Second, just like SpreadIt [3], P2Cast has to get the source involved whenever a failure occurs, and is thus vulnerable to disruption due to the server bottleneck at the source. In addition, orphaned peers reconnect by using the join algorithm, resulting in a long blocking time before their service can resume. In contrast, in P2VoD, failures are handled locally and without involvement of the source.
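The two-stream behavior of patching can be made concrete with a small sketch. It models generic patching only, not the P2Cast protocol itself: a client that arrives `offset` time units after the base stream started joins the base stream in progress and needs a patch for the part it missed. The function name `patching_streams` and the tuple layout are hypothetical.

```python
def patching_streams(offset, video_len):
    """Streams a late client needs under patching (illustrative model).

    Returns (kind, start, end) tuples in video time. The client joins the
    ongoing base stream at position `offset` and fetches a patch covering
    the missed prefix [0, offset).
    """
    if not 0 <= offset <= video_len:
        raise ValueError("offset must lie within the video")
    streams = [("base", offset, video_len)]
    if offset > 0:
        streams.append(("patch", 0, offset))
    return streams
```

For a 600-unit video and a client arriving 30 units late, the sketch yields a base stream covering positions 30 to 600 and a patch covering positions 0 to 30. Under the caching scheme of P2VoD described above, the same client would instead receive a single stream covering the whole video from one early client.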

The preliminary version of P2VoD was published in 2004 [6]. Since then, a number of new techniques have been proposed to support P2P streaming services, including both VoD and live streaming. In [20], the authors propose GloVE (Global Video Environment) to assess the efficiency of stream reuse techniques in practical situations. GloVE uses a centralized directory service, as opposed to the distributed directory service in P2VoD. Moreover, GloVE assumes the availability of IP Multicast, while P2VoD only assumes IP Unicast. In [21], the authors propose AnySee, a P2P live streaming system. While many studies on live-streaming overlay construction focus on intra-overlay optimization, AnySee adopts an inter-overlay optimization scheme, in which resources can join multiple overlays simultaneously. The authors have implemented and released the AnySee software on the Internet, and AnySee has been used in a number of actual events in China. Our P2VoD differs from AnySee in a couple of ways. First, AnySee is designed for live streaming, while P2VoD is designed for video-on-demand. Second, P2VoD focuses on intra-overlay optimization, while AnySee focuses on inter-overlay optimization. One interesting research problem is to adapt the inter-overlay optimization of AnySee to P2P video-on-demand systems. In [22], the authors propose an Overlay Subscription Network (OSN) for live Internet TV broadcast. The key idea of this paper is to provide an incentive mechanism to recruit non-active peers to participate in a streaming session. With the addition of the newly recruited peers, it is more likely that OSN can build a better multicast tree with respect to the resources used in the underlying network. In [23], the authors focus on the following peer-to-peer service scheduling problem: given a client requesting a video, which is typically available on a number of servers, which server should be used to serve the client? They propose a scheduling technique called Shaking. Shaking makes it possible for a client to be served by a server that is beyond the client’s own search scope. Furthermore, the match between the servers and clients can be dynamically adjusted to minimize client service latency.

6. Conclusions

In this paper, we have presented a new system for Video-on-Demand streaming in a peer-to-peer environment, called P2VoD. The novel ideas in this paper are the caching scheme, the distributed directory service, and the generation concept. The analysis in Section 3 shows that the design of P2VoD is sound and efficient. Specifically, P2VoD maximizes peers’ potential, while P2Cast does not. Moreover, under certain assumptions, we show that the join and failure recovery overheads are small, on the order of O(1). Additionally, P2VoD achieves these desirable characteristics with small control overhead. We also carried out a simulation-based study under more realistic assumptions. The simulation results confirm that P2VoD performs better than P2Cast in terms of important performance metrics.

Acknowledgment

The authors thank the US National Science Foundation for partially funding the work in this paper under Grant ANI-0088026.

References

[1] D.A. Tran, K.A. Hua, T.T. Do, A peer-to-peer architecture for media streaming, IEEE Journal on Selected Areas in Communications 22 (1) (2004) 121–133.

[2] S. Banerjee, B. Bhattacharjee, C. Kommareddy, Scalable application layer multicast, in: SIGCOMM, 2002, pp. 205–217.

[3] M.B.H. Deshpande, H. Garcia-Molina, Streaming live media over a peer-to-peer network, work at CS-Stanford (2002).

[4] V.N. Padmanabhan, H.J. Wang, P.A. Chou, K. Sripanidkulchai, Distributing streaming media content using cooperative networking, in: NOSSDAV, 2002, pp. 177–186.

[5] E. Veloso, V. Almeida, W. Meira, A. Bestavros, S. Jin, A hierarchical characterization of a live streaming media workload, in: Internet Measurement Workshop, 2002, pp. 117–130.

[6] T.T. Do, K.A. Hua, M.A. Tantaoui, P2vod: Providing fault tolerant video-on-demand streaming in peer-to-peer environment, in: IEEE International Conference on Communications, Paris, France, 2004, pp. 1467–1472.

[7] S. Sheu, K.A. Hua, W. Tavanapong, Chaining: A generalized batching technique for video-on-demand, in: IEEE Int’l Conf. On Multimedia Computing and System (ICMCS), 1997, pp. 110–117.

[8] Y. Guo, K. Suh, J.F. Kurose, D.F. Towsley, P2cast: Peer-to-peer patching scheme for vod service, in: WWW, 2003, pp. 301–309.

[9] H.V. Madhyastha, T. Isdal, M. Piatek, C. Dixon, T. Anderson, A. Krishnamurthy, A. Venkataramani, iplane: An information plane for distributed services, in: OSDI, 2006.

[10] E.W. Zegura, K.L. Calvert, S. Bhattacharjee, How to model an internetwork, in: INFOCOM, 1996, pp. 594–602.

[11] K.A. Hua, M.A. Tantaoui, Cost effective and scalable video streaming techniques, in: O.M. Borko Furht (Ed.), Handbook of Video Databases, CRC press, 2002.

[12] Y.-H. Chu, S.G. Rao, H. Zhang, A case for end system multicast, in: SIGMETRICS, 2000, pp. 1–12.

[13] M. Bawa, H. Deshpande, H. Garcia-Molina, Transience of peers & streaming media, Comput. Commun. Rev. 33 (1) (2003) 107–112.

[14] S. Zhuang, B.Y. Zhao, A.D. Joseph, R.H. Katz, J. Kubiatowicz, Bayeux: An architecture for scalable and fault-tolerant wide-area data dissemination, in: NOSSDAV, 2001, pp. 11–20.

[15] D.E. Pendarakis, S. Shi, D.C. Verma, M. Waldvogel, Almi: An application level multicast infrastructure, in: USITS, 2001, pp. 49–60.

[16] D.A. Tran, K.A. Hua, S. Sheu, A new caching architecture for efficient video-on-demand services on the internet, in: SAINT, 2003, pp. 172–181.

[17] J. Jannotti, D.K. Gifford, K.L. Johnson, M.F. Kaashoek, J.W. O’Toole Jr., Overcast: Reliable multicasting with an overlay network, in: OSDI, 2000, pp. 197–212.

[18] D. Xu, M. Hefeeda, S.E. Hambrusch, B.K. Bhargava, On peer-to-peer media streaming, in: ICDCS, 2002, pp. 363–371.

[19] Y. Guo, K. Suh, J. Kurose, D. Towsley, A peer-to-peer on-demand streaming service and its performance evaluation, in: IEEE Int. Conf. on Multimedia Expo (ICME), 2003.

[20] L.B. de Pinho, C.L. de Amorim, Assessing the efficiency of stream reuse techniques in p2p video-on-demand systems, J. Netw. Comput. Appl. 29 (1) (2006) 25–45.

[21] X. Liao, H. Jin, Y. Liu, L.M. Ni, D. Deng, Anysee: Peer-to-peer live streaming, in: INFOCOM, 2006.


[22] Y. Cai, J. Zhou, An overlay subscription network for live internet tv broadcast, IEEE Trans. Knowledge Data Eng. 18 (12) (2006) 1711–1720.

[23] Y. Cai, A. Natarajan, J. Wong, On scheduling of peer-to-peer video services, IEEE J. Sel. Areas Commun. 25 (1) (2007) 140–145.

Tai T. Do is currently a Ph.D. candidate in Computer Science at the University of Central Florida, working in the Data Systems Laboratory. He received a B.S. degree in Electrical Engineering from the University of Oklahoma in 2001, and an M.S. degree in Computer Science from the University of Central Florida in 2005. His current research interest is search and delivery techniques in peer-to-peer networks. He has published papers on query management techniques in moving databases, information sharing and routing in mobile ad hoc networks, video streaming on overlay networks, and video broadcasting over the Internet. At the University of Central Florida, he is a recipient of the Graduate Research Fellowship, Top 100 Who’s Who, and the Graduate Enhancement Award. He is a student member of IEEE, ACM, and UPE.

Kien A. Hua received the B.S. degree in Computer Science, and the M.S. and Ph.D. degrees in Electrical Engineering, all from the University of Illinois at Urbana-Champaign, in 1982, 1984, and 1987, respectively. From 1987 to 1990, he was with IBM Corporation. He joined the University of Central Florida in 1990, and is currently a professor in the School of Electrical Engineering and Computer Science. He served as the Interim Associate Dean for Research of the College of Engineering and Computer Science from 2003 to 2005. Dr. Hua has published widely, including nine papers recognized as among the best papers at various international conferences. As a member of the research community, Dr. Hua has served as General Co-Chair, Vice-Chair, Associate Chair, Demo Chair, and Program Committee Member for numerous conferences. He has also been on a number of editorial boards, and is currently an Associate Editor of the Journal of Multimedia Tools and Applications.

Mounir A. Tantaoui is an assistant professor in Computer Science at the School of Science and Engineering, Al Akhawayn University in Ifrane, Morocco. He obtained his Ph.D. and M.S. degrees, both in Computer Science, from the University of Central Florida in 1999 and 2004 respectively. His main research interests are multimedia networking, broadcasting techniques in mobile ad hoc networks, and distributed and interactive virtual environment systems. He is a student member of IEEE and ACM.
