
Reverse-Query Mechanism for Contents Delivery Management over Unstructured Overlay Network

YOSHIKATSU FUJITA†, ††, YASUFUMI SARUWATARI†, JUN YOSHIDA†† and KAZUHIKO TSUDA†

†Graduate School of Business Science, University of Tsukuba †† Corporate eNet Business Division, Matsushita Electric Industrial Co., Ltd.

†3-29-1 Otsuka, Bunkyo-ku, Tokyo, 112-0012 †† 2-13-10 Kyobashi, Chuo-ku, Tokyo, 104-0031

JAPAN

Abstract: - It is difficult to build a contents delivery platform over an unstructured P2P network because the traffic generated by P2P clients is unmanageable. By applying percolation theory to propagate newly defined reverse-query messages over the unstructured P2P network, we propose a novel contents delivery network architecture built over an existing unstructured P2P network. We analyzed our algorithm and examined the validity of our model, which covers more than 80% of all clients while relaying reverse-query messages with a probability as low as 10%, and is effective in reducing the total traffic generated by query propagation. This architecture is applicable to quasi-broadcast over the Internet.

Key-Words: - Overlay Network, Percolation Theory, CDN, Power law distribution, Broadcast

1 Introduction
In Japan, the domestic network infrastructure is expanding rapidly thanks to the governmental e-Japan strategy. Today, the number of broadband users has reached 20 million, and the overall household penetration rate is more than 40%. In addition, digital broadcasting, which began in December 2003, and the standardization activity for server type broadcasting are expected to bring a breakthrough in demand in the digital contents market. With the emergence of such broadband contents delivery infrastructure, people can enjoy any contents that suit their own lifestyle from among a huge amount of contents archives.

However, when it comes to how we should deliver broadband contents over the Internet, the total throughput is determined by any bottleneck somewhere between the contents provider and the consumers. For example, even if a user purchases a 100Mbps optical fiber service, one's favorite contents may arrive from a distant server at only 100Kbps because of a narrow path somewhere over the Internet. Most proxy servers distributed over the Internet are aimed at static contents like homepage objects, and are not tuned for high quality video stream contents. Once many users try to pull large video streams at the same time, it is clear that the backbone network easily falls into overflow. This requires a new contents delivery technology which supports a huge number of simultaneous access transactions for broadband objects.

For this purpose, to deliver broadband contents over the Internet, CDN (Contents Delivery Network) architecture [1], [2] and contents distribution algorithms for replication [3], [4] are actively studied. But such CDN solutions for large scale contents delivery face difficulty because the number of acceptable simultaneous accesses is almost determined by the hardware specification of the cache servers, and this falls into an optimal cache server distribution problem that must consider dynamic request load balance under an exact forecast of contents popularity and hardware availability.

This is also regarded as a big issue for realizing broadcast type traffic over the Internet. The existing technique to handle telephone calls over the telephone network is specific to point to point traffic and is not applicable for clearing simultaneous access to a contents server, which makes it difficult to deliver broadband contents to many clients.

Proceedings of the 10th WSEAS International Conference on COMMUNICATIONS, Vouliagmeni, Athens, Greece, July 10-12, 2006 (pp607-613)

Our study aims to overcome this problem by building our proposed overlay network over the Internet and managing the generated traffic under our control, so that it becomes a new quasi-broadcast platform. We develop a new contents delivery network for broadband contents based on the fact that many link statuses on the Internet follow a power law distribution [5]. For example, Gnutella, one of the most popular pure P2P networks, has been analyzed, and its nodes' outgoing degree can be expressed as $P(k) \sim k^{-\tau}\ (\tau \geq 0)$ [6]. However, such an unstructured network is not manageable in nature, and its "unstructured" architecture makes it difficult to apply to contents delivery.

In this paper, we employ percolation theory [7], which is mainly studied in physics, to model the "percolating information" for contents delivery over the pure P2P network. In other words, the query message released by a client seeking requested contents is not merely used for this purpose; we define a new "reverse-query" message to find any client who needs certain contents, and apply percolation theory to manage the generated traffic. This reduces the explosive P2P query traffic while maintaining a fairly high client cover rate over our proposed overlay network. We expect rapid information propagation among the clients who join this overlay network. An outlook of our model is shown in Fig.1.

Fig. 1 Message Propagation over the P2P Network

2 P2P Network
2.1 P2P Network Structure
Recently, file sharing applications over P2P networks have become pervasive, and they play an important role as contents distribution infrastructure. A P2P network has the following technical problems in nature:

- Interoperability between peers without a center server
- Scalability and reliability
- How to keep anonymity and privacy
- Adhoc join and leave behavior

When we apply a P2P network to contents delivery under the pure P2P architecture, which has no center server, a major concern is how to find available resources and required contents from all over the network. In this field of study, projects such as CAN [8], Chord [9], Pastry [10] and Tapestry [11] are trying to employ a Distributed Hash Table (DHT) [12]. However, in order to apply it to commercial services, there are further problems in this DHT solution:

- When peers join or leave the network, they have to hand over their management table to some other peers, which results in heavy overhead.
- The network structure becomes complicated.

In addition, the traffic generated by those P2P nodes has been increasing more and more, and its load has a serious impact on today's ISP backbone networks. If we can manage such huge traffic, it will be good news for Internet backbone operators. There are several ways to detect P2P traffic: "by signature", "by application layer" and "by flow pattern parameter" [13]. However, those solutions insert gateway devices into the network and forcibly control the bandwidth based on the detected information. This requires additional hardware costs.

2.2 Query Traffic
In this study, we propose a new network architecture for broadband contents delivery which can solve the problems shown above and is applicable as a quasi-broadcast platform over the Internet.

To estimate the amount of total traffic for retrieval queries sent by each client, there is a report from [14]. And as a modeling method for networks with a power law link distribution, there is an algorithm by [5] with a preferential attachment model. However, this method focuses mainly on the growth of the network itself, and analyzing dynamic traffic behavior is still left for further study. Also, the validation of those theoretical network structures against real P2P networks on the Internet is not fully analyzed yet.

Fig. 2 Do-you-have? and Do-you-need? Query (two panels: in the "Do-you-have" query, a peer asks other peers "Have?" and a holder answers "Yes, I have!"; in the "Do-you-need" query (reverse-query), peers are asked "Need?" and interested peers answer "Yes, I need!")

2.3 Propagating Messages
Percolation theory is introduced in [7]. Sarshar [15] applied percolation theory to propagating retrieval queries over the network ("Do-you-have" model), but that study concludes that the query transfer has finished when the query reaches any giant node on the P2P network, which does not meet our objective. In our study, we spread information on contents attributes by delivering a "reverse-query" ("Do-you-need" model), and those who have interests in the contents try to download the contents from their previous peers. This concept leads to a new contents delivery platform based on the reverse-query mechanism. Fig.2 shows the basic concepts of the "Do-you-have" query and the "Do-you-need" query.

3 Reverse-query Mechanism
3.1 Algorithm
In this study, we propose a new algorithm that propagates reverse-query messages and contents by applying percolation theory to the P2P network.

The message structure is shown in Fig.3. Message ID, relay probability, TTL, contents name, attributes and source node IP are kept in the header of the message, followed by the IPs of the client nodes that downloaded the contents. Those elements are the minimum set of information needed to implement our proposed mechanism.

Fig. 3 Reverse-query message structure (fields: Message ID; TTL (decrease on every hop); Relay Probability; Content Name; Content attribute; Source IP (Server); Client IP (who DL), repeated as each peer implants its own IP, possibly until the TTL is used up)

Here is an example of each node's procedure.

Step1: Message Distribution
The server which holds the contents to be delivered generates a "do-you-need" message and sends it to some randomly selected peers.

Step2: Bond Percolation
Following Step1, the first receiver, peer A, compares the attribute of the contents with its own preferences, and if they match, peer A tries to pull the contents file from the initial server. Following this operation, peer A revises the "do-you-need" message to implant its own IP, showing that it is retrieving the contents now, and forwards it to some randomly selected neighbors. In case peer A's preferences do not match the contents attribute, peer A is expected to just forward the message to some randomly selected neighbors. By repeating this operation, the "do-you-need" message covers all the peers within a certain period of time.

Step3: Contents Extraction
Suppose peer B receives the "do-you-need" message from peer A. Peer B compares the attribute of the contents with its own tastes, as peer A did in Step2, and if they match, peer B also tries to pull the contents from the previous peers listed in the relay chain. (In this case, it is probably from peer A.)

When we choose the relay probability to be just above the percolation threshold of the underlying power law network, the message can be propagated all over the peers [15].

3.2 Proposed Method
In this chapter, we evaluate the feasibility of our proposed mechanism.

3.2.1 Basic Model
First, let us suppose two typical cases based on each node's behavior.

(1) Transfer All Queries
Each node forwards the received query to all of its neighbor nodes. The necessary number of forwarding hops to cover the whole network depends on the shape of the network; however, it is clear that the minimum number is 1 (= star shape architecture in which all nodes are reachable in just 1 hop from the origin) and the maximum number is half the radius of network. The size of existing P2P networks is studied by the method proposed in [16].

(2) Transfer Queries According to Relay Probability
Consider the probability that a query is transferred between node a and b. Let p be the probability that a link exists between those two nodes. Then, the query is transferred from a to b as follows.

(i) a -> b
The probability is p because a and b are directly connected.

(ii) a -> c -> b
Let the probability that c transfers the query be α; then the probability that the query is transferred a -> b is

$$\alpha \cdot (1-p) \cdot p \tag{1}$$

and the probability that the query is not transferred a -> b is

$$(1-\alpha) \cdot (1-p) \cdot p \tag{2}$$

(iii) a -> c -> d -> b
In the same way, let the probability that c and d transfer the query be α; then the probability that the query is transferred a -> b is

$$\alpha^{2} \cdot (1-p)^{2} \cdot p \tag{3}$$

This can be extended to the case of n nodes between a and b:

$$\alpha^{n} \cdot (1-p)^{n} \cdot p \tag{4}$$
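The node procedure of Section 3.1 (Step1 to Step3) and the header fields of Fig.3 can be illustrated with a minimal Python sketch. The class name `ReverseQuery`, the function `handle` and all field names are hypothetical and are not taken from the paper. The sketch also makes three assumptions that the text does not state: "some randomly selected neighbors" is read as forwarding to each neighbor independently with the relay probability, a message whose TTL is used up is not forwarded, and an interested peer pulls from the most recent peer in the relay chain.

```python
import random
from dataclasses import dataclass, field, replace


@dataclass
class ReverseQuery:
    """Header fields of Fig. 3 (field names are illustrative assumptions)."""
    message_id: str
    relay_probability: float
    ttl: int
    content_name: str
    content_attribute: str
    source_ip: str
    client_ips: list = field(default_factory=list)  # relay chain of downloaders


def handle(msg, own_ip, interests, neighbors, rng=random):
    """Process one received "do-you-need" message at a peer.

    Returns (pull_from, forwards): the address to pull the contents from
    (None if the peer is not interested) and a list of
    (neighbor, forwarded_message) pairs.
    """
    pull_from = None
    chain = list(msg.client_ips)
    if msg.content_attribute in interests:
        # Step2/Step3: pull from the previous downloader, or from the server
        # when no peer has implanted its IP yet (assumption: latest entry wins).
        pull_from = chain[-1] if chain else msg.source_ip
        chain.append(own_ip)  # implant own IP into the message
    if msg.ttl <= 1:
        return pull_from, []  # assumption: TTL used up, stop relaying
    out = replace(msg, ttl=msg.ttl - 1, client_ips=chain)
    # Assumption: each link is used independently with the relay probability.
    forwards = [(n, out) for n in neighbors
                if rng.random() < msg.relay_probability]
    return pull_from, forwards
```

A peer that is not interested leaves the relay chain unchanged and only forwards the message, which matches the "just forward" branch of Step2.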

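3.2.2 Percolation on Generalized Random Graph
The percolation behavior on a generalized random graph can be derived as follows [17].

Suppose the degree distribution of each node is p(k); then the generating function of this distribution is defined as:

$$G_0(x) = \sum_{k=0}^{\infty} p(k)\,x^{k} \tag{5}$$

Suppose the size distribution of connected components on the generalized random graph has the generating function $H_0(x)$, and let the generating function for the size of the connected component reached from a certain branch be $H_1(x)$. Then the average size of a connected component $\langle |C_0| \rangle$ is

$$\langle |C_0| \rangle = H_0'(1) = 1 + G_0'(1)\,H_1'(1) = 1 + \frac{G_0'(1)}{1 - G_1'(1)} \tag{6}$$

The state transition takes place when the denominator of the right-hand side becomes 0:

$$G_1'(1) = 1 \iff \sum_{k} k(k-2)\,p(k) = 0 \iff \frac{\langle k^{2} \rangle}{\langle k \rangle} = 2 \tag{7}$$

Here, the percolation threshold $q_c$ can be written as

$$q_c = \frac{1}{\dfrac{\langle k^{2} \rangle}{\langle k \rangle} - 1} \tag{8}$$

If we assume our delivery platform to be a generalized random graph, we can say that almost all nodes can receive the transferred messages when each node relays received messages with a probability of more than $q_c$.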
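As a small numerical illustration of equation (8), the threshold can be estimated from an observed degree sequence by replacing the moments $\langle k \rangle$ and $\langle k^2 \rangle$ with sample averages. The function name `percolation_threshold` is hypothetical, and using sample averages in place of the true distribution p(k) is an assumption of this sketch.

```python
def percolation_threshold(degrees):
    """Estimate q_c = 1 / (<k^2>/<k> - 1) from a list of node degrees.

    The formula is algebraically the same as <k> / (<k^2> - <k>).
    """
    n = len(degrees)
    if n == 0:
        raise ValueError("degree sequence is empty")
    mean_k = sum(degrees) / n                   # <k>
    mean_k2 = sum(k * k for k in degrees) / n   # <k^2>
    if mean_k2 <= mean_k:
        # <k^2>/<k> <= 1: the threshold of equation (8) is not defined
        raise ValueError("no finite threshold for this degree sequence")
    return mean_k / (mean_k2 - mean_k)
```

For a sequence in which every node has degree 3, the moments are 3 and 9, so the sketch gives 3 / (9 - 3) = 0.5. A heavier tail raises $\langle k^2 \rangle$ and lowers the threshold, which is why a low relay probability can suffice on a power law network.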

3.2.3 Percolation on Power law Overlay Network
The percolation behavior on the power law overlay network can be derived as follows.

(1) From Origin to the First Neighbor
We consider the number of vertexes that are 1 hop away from the origin.

Fig. 4 First neighbors (the origin and its first neighbors $k_1^1, \dots, k_1^5$)

Let us employ the expression $k_m^n$, which shows the degree of the vertex that is m hops away from the origin (lower right) and is the nth vertex out of the set of vertexes at m hops (upper right). Fig.4 shows an example. Then, remember the assumption that every node in the network has a degree according to the power law. This assumption is observed from a real P2P network [6]. When we assume the origin holds connections with $k_0$ vertexes, the number of vertexes that can be reached within 1 hop from the origin is

$$k_1^1 - 1 \tag{9}$$

As there are $k_0$ vertexes as a whole, the total number of reachable vertexes is

$$\sum_{i=1}^{k_0} \left(k_1^i - 1\right) \tag{10}$$

(2) Duplicated First Neighbor

Fig. 5 Duplicated path (the origin and two first neighbors $k_1^i$ and $k_1^j$)

Then, we reduce the duplicated number in the case of the graph of Fig.5. In Fig.5, we need to subtract the paths $k_i$ to $k_j$ and $k_j$ to $k_i$, because both $k_i$ and $k_j$ are counted as just 1 hop away from the origin. The expected value for the number of vertexes that are 2 hops away from the origin is

… (11)

Let D be the total number of vertexes reachable within 1 hop from the origin; it can be expressed from (10) and (11) as

D = … (12)

As this number is overestimated, it is enough to employ this equation to calculate the reachability from the origin.

(3) Distant Neighbor
The number of vertexes $V_m$ that are m hops away from the origin can be expressed as

$V_m$ = … (13)

By using this recursive equation, the number of vertexes at each hop distance from the origin can be obtained in the same way. When we want to deliver a certain message to more than 80% of the nodes on the overlay network, by applying (4) it is enough to deliver the message as:

$$\alpha^{m} \cdot (1-p)^{m} \cdot p \geq 0.8 \tag{14}$$

In the next chapter, we investigate the dynamics of α by mathematical simulation.

4 Evaluation
4.1 Simulation
In order to evaluate our proposed model, we prepared a random network with a power law link distribution generated by Pajek [18], and implemented our algorithm in the R environment. This overlay network is generated based on the generalized BA model, which presumes that every vertex has at least some baseline probability of gaining an edge, and generates edges by a mixture of preferential attachment and uniform attachment [19]. As generating conditions, we set the total number of nodes N=1000, $M_0$=3, TTL=5 to 25 and average degree = 2.0 to 3.5. In order to evaluate the results, we counted the number of generated messages (= reverse-query) and the cover rate (= how many of the nodes receive the reverse-query) against the relay probability implanted in the reverse-query message.

4.2 Results
Fig.6 through Fig.10 show typical examples of message propagation over this generated overlay network. In the first three examples (Fig.6 through Fig.8), we fixed the average degree of nodes at 2.7 and compared the effects of TTL.

The outlook of those graphs is quite similar, i.e., the propagating message covers almost 80% of the nodes at a certain relay probability, and the total traffic increases linearly. The only difference among those results is the scale of the relay probability, which shows that our proposed model works even with a lower TTL. In Fig.6, with TTL=5, the cover rate reaches more than 80% at a relay probability of 0.3, while Fig.8, with TTL=25, shows the same at a relay probability of 0.07. Looking at equation (14), it is enough to set the relay probability to only 0.07 to cover 80% of the nodes when delivering reverse-query messages over the nodes joining this overlay network.
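The kind of experiment described in Section 4.1 can be sketched in plain Python. This is not the authors' Pajek/R setup and is not expected to reproduce the figures. The function names `build_graph` and `propagate` and the parameters `m` and `mix` are hypothetical. The sketch assumes that each new node adds `m` edges, that each target is picked preferentially with probability `mix` and uniformly otherwise, that the origin always sends to all of its neighbors, and that a peer processes only the first copy of a message it receives.

```python
import random
from collections import deque


def build_graph(n, m, mix, rng):
    """Grow a connected graph by mixed preferential/uniform attachment."""
    adj = {i: set() for i in range(n)}
    for i in range(m + 1):            # seed: complete graph on m + 1 nodes
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
    # each node appears once per incident edge, so choice() is degree-weighted
    endpoints = [v for v in range(m + 1) for _ in adj[v]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            if rng.random() < mix:
                targets.add(rng.choice(endpoints))   # preferential attachment
            else:
                targets.add(rng.randrange(new))      # uniform attachment
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def propagate(adj, origin, relay_prob, ttl, rng):
    """Spread one reverse-query; return (cover_rate, messages_sent)."""
    reached = {origin}
    sent = 0
    queue = deque([(origin, ttl)])
    while queue:
        node, hops_left = queue.popleft()
        if hops_left == 0:
            continue                   # TTL used up at this peer
        for nb in adj[node]:
            # assumption: the origin always sends; others relay per link
            if node != origin and rng.random() >= relay_prob:
                continue
            sent += 1
            if nb not in reached:      # duplicates are counted but dropped
                reached.add(nb)
                queue.append((nb, hops_left - 1))
    return len(reached) / len(adj), sent
```

Sweeping `relay_prob` for a fixed TTL and averaging `propagate` over several origins gives the two quantities counted in Section 4.1, the cover rate and the number of generated messages, as functions of the relay probability.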

Fig. 6 Results (1) (TTL=5, Ave.Deg. = 2.7)




Next, we looked into the effect of the average degree of nodes in Fig.9 and Fig.10. Though the link degree distribution in the real world is reported to be mostly 2 to 3 [20], we examined an average degree of 3.5 as the extreme case. In both cases, TTL is kept at 25. In Fig.9, with average degree 2.0, the leading edge on the left side is slightly lower than in Fig.8 with average degree 2.7, but this does not make a difference in terms of the relay probability that covers 80% of all nodes. In Fig.10, with average degree 3.5, the result looks quite similar to Fig.8 with average degree 2.7. This is because the message propagation is already saturated in the case of Fig.8.

Fig. 9 Results (4) (TTL=25, Ave.Deg. = 2.0)

Fig. 10 Results (5) (TTL=25, Ave.Deg. = 3.5)

4.3 Validity
As we analyzed in the previous chapter, the nodes reachable from the origin within n hops are expressed by equation (14). And if we set the relay probability optimally, the message propagates to the whole network, as shown by the simulation results. If we deliberately choose the relay probability, we can obtain the optimized condition that reduces the total number of generated messages while covering almost all the nodes over the network.

If we attach a small video clip to the reverse-query message, it is possible to use this contents delivery mechanism to propagate an electronic flyer all over the clients, making it a quasi-broadcast platform. The nodes do not necessarily need large buffering memories; they just hold the data for a few minutes and relay it to the next peers that have interests in the contents' attributes.

As we observed in the simulation results, after several steps of relaying messages, the copy of the message body just goes out of the overlay network, and this reduces the explosive P2P query traffic while maintaining a fairly high client cover rate over our proposed overlay network.

5 Conclusion
In this paper, we employ percolation theory to model the "message propagation" for contents delivery over the unstructured P2P network. We defined a new "reverse-query" message to find any client who needs certain contents, and applied percolation theory to manage the generated traffic. We analyzed the validity of our model through mathematical consideration and examined its dynamics by simulations. From this we conclude that our proposal is effective in reducing the total amount of generated query traffic while maintaining a high cover rate, which means we can reduce the hardware cost for contents delivery over the Internet. We will continue our study to apply this model to distributed data management architecture, for instance, toward disaster management, because our model is sustainable against local breakdown.

References:
[1] D.Karger, E.Lehman, T.Leighton, R.Panigrahy, M.Levine and D.Lewin: "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web", Proceedings of the 29th annual ACM Symposium on Theory of Computing, ACM Press, pp. 654-663 (1997)
[2] M.Abrams, C.R.Standridge, G.Abdulla, S.Williams and E.A.Fox: "Caching Proxies: Limitations and Potentials", Proceedings of 4th International World Wide Web Conference, pp. 119-133 (1995)
[3] Y.Chen, R.H.Katz and J.D.Kubiatowicz: "Dynamic Replica Placement for Scalable Content Delivery", IPTPS 2002, pp. 306-318 (2002)
[4] Y.Li and M.T.Liu: "Optimization of Performance Gain in Content Distribution Networks with Server Replicas", SAINT2003 Proceedings, pp. 182-189 (2003)
[5] R.Albert and A.L.Barabasi: "Statistical Mechanics of Complex Networks", Reviews of Modern Physics, Vol. 74, pp. 47-97 (2002)
[6] M.Faloutsos, P.Faloutsos and C.Faloutsos: "On Power-law Relationships of the Internet Topology", ACM SIGCOMM, pp. 251-262 (1999)
[7] D.Stauffer and A.Aharony: "Introduction to Percolation Theory", Taylor and Francis, London (1994)
[8] S.Ratnasamy, P.Francis, M.Handley, R.Karp and S.Shenker: "A Scalable Content-addressable Network", Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, ACM Press, pp. 161-172 (2001)
[9] I.Stoica, R.Morris, D.Karger, M.F.Kaashoek and H.Balakrishnan: "Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications", Proceedings of ACM SIGCOMM, ACM Press, pp. 149-160 (2001)
[10] A.Rowstron and P.Druschel: "Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems", Lecture Notes in Computer Science, Vol. 2218, pp. 329-350 (2001)
[11] K.Hildrum, J.D.Kubiatowicz, S.Rao and B.Y.Zhao: "Distributed Object Location in a Dynamic Network", Proceedings of the 14th ACM Symposium on Parallel Algorithms and Architectures, pp. 41-52 (2002)
[12] H.Sunaga, T.Hoshiai, S.Kamei and S.Kimura: "Technical Trends in P2P-Based Communications", IEICE TRANS. COMMUN, Vol. E87-B, No.10, pp. 2831-2846 (2004)
[13] S.Inoue, A.Suzaki, S.Kamei and T.Ohtani: "Fundamental Knowledge for P2P Technology[2]", UNIX Magazine, Vol.20, No.10, ASCII, pp. 91-117 (2005) (In Japanese)
[14] L.A.Adamic, R.M.Lukose, A.R.Puniyani and B.A.Huberman: "Search in power-law networks", Physical Review E, Vol. 64, No. 046135 (2001)
[15] N.Sarshar, P.O.Boykin and V.P.Roychowdhury: "Percolation Search in Power Law Networks: Making Unstructured Peer-to-Peer Networks Scalable", IEEE P2P2004 (2004)
[16] S.Kamei, M.Uchida, T.Mori and Y.Takahashi: "Estimating Scale of Peer-to-Peer File Sharing Applications Using Multi-Layer Partial Measurement", IEICE Transactions on Communications, Vol. J88-B, No. 11, pp. 2171-2180 (2005) (In Japanese)
[17] N.Masuda and N.Konno: "Science of Complex Network", Sangyo Tosho (2005) (In Japanese)
[18] Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
[19] D.M.Pennock, G.W.Flake, S.Lawrence, E.J.Glover and C.L.Giles: "Winners Don't Take All: Characterizing the Competition for Links on the Web", Proceedings of the National Academy of Sciences, Vol. 99, No.8, pp. 5207-5211 (2002)
[20] M.E.J.Newman: "The Structure and Function of Complex Networks", SIAM Review, Vol.45, No.2, pp. 167-256 (2003)

