Reverse-Query Mechanism for Contents Delivery Management over Unstructured Overlay Network
YOSHIKATSU FUJITA†, ††, YASUFUMI SARUWATARI†, JUN YOSHIDA†† and KAZUHIKO TSUDA†
†Graduate School of Business Science, University of Tsukuba †† Corporate eNet Business Division, Matsushita Electric Industrial Co., Ltd.
†3-29-1 Otsuka, Bunkyo-ku, Tokyo, 112-0012 †† 2-13-10 Kyobashi, Chuo-ku, Tokyo, 104-0031
JAPAN
Abstract: - It is difficult to build a contents delivery platform over an unstructured P2P network because the traffic generated by P2P clients is unmanageable. By applying percolation theory to propagate newly defined reverse-query messages over the unstructured P2P network, we propose a novel contents delivery network architecture built over an existing unstructured P2P network. We analyzed our algorithm and examined the validity of our model, which covers more than 80% of all clients while relaying reverse-query messages with a probability as low as 10%, and is therefore effective in reducing the total traffic generated by query propagation. This architecture is applicable to quasi-broadcast over the Internet.

Key-Words: - Overlay Network, Percolation Theory, CDN, Power law distribution, Broadcast

1 Introduction
In Japan, the domestic network infrastructure is expanding rapidly thanks to the governmental e-Japan strategy. Today, the number of broadband users amounts to 20 million and the overall household penetration rate is more than 40%. In addition, digital broadcasting, which began in December 2003, and the standardization activity for server-type broadcasting are expected to bring a breakthrough in demand in the digital contents market. With the emergence of such a broadband contents delivery infrastructure, people can enjoy contents that suit their own lifestyle from a huge amount of contents archives.
However, when it comes to how we should deliver broadband contents over the Internet, the total throughput is determined by any bottleneck somewhere between the contents provider and the consumers. For example, even if a user purchases a 100Mbps optical fiber service, one's favorite contents may arrive from a distant server at only 100Kbps because of a narrow path somewhere over the Internet. Most of the proxy servers distributed over the Internet are aimed at static contents like homepage objects, and are not designed for high quality video stream contents. Once many users try to pull large video streams at the same time, it is clear that the backbone network easily overflows. This requires a new contents delivery technology which supports a huge number of simultaneous access transactions for broadband objects.
For this purpose, to deliver broadband contents over the Internet, CDN (Contents Delivery Network) architecture [1], [2] and contents distribution algorithms for replication [3], [4] are actively studied. But such CDN solutions for large scale contents delivery face difficulty because the number of acceptable simultaneous accesses is almost entirely determined by the hardware specification of the cache servers. This leads to an optimal cache server distribution problem that must consider dynamic request load balancing under an exact forecast of contents popularity and hardware availability.
This is also regarded as a big issue for realizing broadcast type traffic over the Internet. The existing technique for handling telephone calls over the telephone network is specific to point-to-point traffic and is not applicable to clearing simultaneous accesses to a contents server, which makes it difficult to deliver broadband contents to many clients.
Proceedings of the 10th WSEAS International Conference on COMMUNICATIONS, Vouliagmeni, Athens, Greece, July 10-12, 2006 (pp607-613)
Our study aims to overcome this problem by building our proposed overlay network over the Internet and managing the generated traffic under our control, so that it becomes a new quasi-broadcast platform. We develop a new contents delivery network for broadband contents based on the fact that many link statuses on the Internet follow a power law distribution [5]. For example, Gnutella, one of the most popular pure P2P networks, has been analyzed to have a node outgoing degree distribution that can be expressed as $P(k) \sim k^{-\tau}$ ($\tau \ge 0$) [6]. However, such an unstructured network is unmanageable by nature, and its "unstructured" architecture makes it difficult to apply to contents delivery.
In this paper, we employ percolation theory [7], which is mainly studied in physics, to model the "percolating information" for contents delivery over the pure P2P network. In other words, the query message is not merely released by a client seeking requested contents; we define a new "reverse-query" message to find any client who needs certain contents, and we apply percolation theory to manage the generated traffic. This reduces the explosive P2P query traffic while maintaining a fairly high client cover rate over our proposed overlay network. We expect rapid information propagation among the clients who join this overlay network. An outlook of our model is shown in Fig.1.

Fig. 1 Message Propagation over the P2P Network

2 P2P Network
2.1 P2P Network Structure
Recently, file sharing applications over P2P networks have become pervasive and play an important role as a contents distribution infrastructure. By nature, a P2P network has technical problems such as:
- Interoperability between peers without a center server
- Scalability and reliability
- How to keep anonymity and privacy
- Adhoc join and leave behavior
When we apply a P2P network to contents delivery under the pure P2P architecture, which has no center server, a major concern is how to find available resources and required contents from all over the network. In this field of study, projects such as CAN [8], Chord [9], Pastry [10] and Tapestry [11] employ a Distributed Hash Table (DHT) [12]. However, in order to apply it to commercial services, there are further problems in this DHT solution:
- When peers join or leave the network, they have to hand over their management table to some other peers, which results in heavy overhead.
- The network structure becomes complicated.
In addition, the traffic generated by those P2P nodes has been increasing more and more, and its load has a serious impact on today's ISP backbone networks. If we can manage such huge traffic, it will be good news for Internet backbone operators. There are several ways to detect P2P traffic: "by signature", "by application layer" and "by flow pattern parameter" [13]. However, those solutions insert gateway devices into the network and forcibly control the bandwidth based on the detected information. This requires additional hardware costs.

2.2 Query Traffic
In this study, we propose a new network architecture for broadband contents delivery which can solve the problems shown above and is applicable as a quasi-broadcast platform over the Internet.
For estimating the total amount of traffic for retrieval queries sent by each client, there is a report in [14]. For modeling a network with a power law link distribution, there is an algorithm in [5] based on the preferential attachment model. However, this method focuses mainly on the growth of the network itself, and the analysis of dynamic traffic behavior has been left for further study. Also, the correspondence between those theoretical network structures and real P2P networks on the Internet has not been fully analyzed yet.
Fig. 2 Do-you-have? and Do-you-need? Query
[Figure: two panels of four peers each. In the "Do-you-have" query, a peer asks its neighbors "Have?" and a holder answers "Yes, I have!". In the "Do-you-need" query (reverse-query), the message "Need?" propagates among the peers and interested peers answer "Yes, I need!".]
Percolation theory is introduced in [7]. Sarshar et al. [15] applied percolation theory to propagating retrieval queries over the network (the "Do-you-have" model), but that study regards the query transfer as finished when the query reaches any giant node on the P2P network, which does not meet our objective. In our study, we spread information on contents attributes by delivering a "reverse-query" (the "Do-you-need" model), and those who have interests in the contents try to download the contents from their previous peers. This concept leads to a new contents delivery platform based on the reverse-query mechanism. Fig.2 shows the basic concepts of the "Do-you-have" query and the "Do-you-need" query.

3 Reverse-query Mechanism
3.1 Algorithm
In this study, we propose a new algorithm that propagates reverse-query messages and contents by applying percolation theory to the P2P network.
The message structure is shown in Fig.3: Message ID, relay probability, TTL, contents name, attributes and source node IP are kept in the header of the message, followed by the IPs of the client nodes that downloaded the contents. These elements are the minimum set of information needed to implement our proposed mechanism. Here is an example of each node's procedure.

Step1: Message Distribution
The server which holds the contents to be delivered generates a "do-you-need" message and sends it to some randomly selected peers.

Step2: Bond Percolation
Following Step1, the first receiver, peer A, compares the attributes of the contents with its own preferences, and if they match, peer A tries to pull the contents file from the initial server. Following this operation, peer A revises the "do-you-need" message to implant its own IP, showing that it is retrieving the contents now, and forwards it to some randomly selected neighbors. In case peer A's preferences do not match the contents attributes, peer A is expected to just forward the message to some randomly selected neighbors. By repeating this operation, the "do-you-need" message covers all the peers within a certain period of time.

Step3: Contents Extraction
Suppose peer B receives the "do-you-need" message from peer A. Peer B compares the attributes of the contents with its own tastes, as peer A did in Step2, and if they match, peer B also tries to pull the contents from the previous peers listed in the relay chain. (In this case, it is probably from peer A.)

When we choose the relay probability to be just above the percolation threshold of the underlying power law network, the message can be propagated all over the peers [15].

Fig. 3 Reverse-query message structure
[Figure: message fields in order: Message ID; TTL (decrease on every hop); Relay Probability; Content Name; Content attribute; Source IP (Server); Client IP (who DL), repeated as each downloading peer implants its own IP (possibly until the TTL is used up).]

3.2 Proposed Method
In this chapter, we evaluate the feasibility of our proposed mechanism.

3.2.1 Basic Model
First, let us suppose two typical cases based on each node's behavior.
(1) Transfer All Queries
Each node forwards the received query to all of its neighbor nodes. The number of forwarding hops necessary to cover the whole network depends on the shape of the network; however, it is clear that the minimum number is 1 (a star-shaped architecture in which all nodes are reachable just 1 hop from the origin) and the maximum number is half the radius of the network. The size of existing P2P networks is studied by the method proposed in [16].
(2) Transfer Queries According to Relay Probability
Consider the probability that a query is transferred between nodes a and b. Let p be the probability that a link exists between those two nodes. Then, the query is transferred from a to b as follows.
(i) a -> b
The probability is p because a and b are directly connected.
(ii) a -> c -> b
Let the probability that c transfers the query be α; then the probability that the query is transferred from a to b is
    $\alpha \cdot (1 - p) \cdot p$    (1)
and the probability that the query is not transferred from a to b is
    $(1 - \alpha) \cdot (1 - p) \cdot p$    (2)
(iii) a -> c -> d -> b
In the same way, let the probability that c and d each transfer the query be α; then the probability that the query is transferred from a to b is
    $\alpha^2 \cdot (1 - p)^2 \cdot p$    (3)
This can be extended to the case of n nodes between a and b:
    $\alpha^n \cdot (1 - p)^n \cdot p$    (4)
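The relay cases (i) to (iii) and their generalization in equation (4) can be checked numerically. The following is a minimal sketch, not the authors' implementation; the function name `relay_path_probability` and the sample values of α and p are assumptions chosen only for illustration.

```python
def relay_path_probability(alpha: float, p: float, n: int) -> float:
    """Equation (4): probability that a query goes from a to b via n relay nodes.

    alpha: probability that an intermediate node relays the query
    p:     probability that a link exists between two nodes
    n:     number of intermediate nodes (n = 0 is the direct case (i))
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= p <= 1.0):
        raise ValueError("alpha and p must be probabilities in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return (alpha ** n) * ((1.0 - p) ** n) * p


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    alpha, p = 0.5, 0.2
    for n in range(4):
        print(n, relay_path_probability(alpha, p, n))
```

With n = 0 the function reduces to case (i), with n = 1 to equation (1), and with n = 2 to equation (3).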
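3.2.2 Percolation on Generalized Random Graph
The percolation behavior on a generalized random graph can be derived as follows [17].
Suppose the degree distribution of each node is p(k); then the generating function of this distribution can be defined as:
    $G_0(x) = \sum_{k=0}^{\infty} p(k)\, x^k$    (5)
Suppose the size distribution of connected components on the generalized random graph has the generating function $H_0(x)$, and let the generating function for the size of the connected component reached from a certain branch be $H_1(x)$. Then the average size of a connected component $\langle |C_0| \rangle$ is
    $\langle |C_0| \rangle = H_0'(1) = 1 + G_0'(1)\, H_1'(1) = 1 + \frac{G_0'(1)}{1 - G_1'(1)}$    (6)
with $G_1(x) = G_0'(x) / G_0'(1)$. The state transition takes place when the denominator on the right-hand side becomes 0:
    $G_1'(1) = 1 \Leftrightarrow \sum_k k(k - 2)\, p(k) = 0 \Leftrightarrow \frac{\langle k^2 \rangle}{\langle k \rangle} = 2$    (7)
Here, the percolation threshold can be expressed as
    $q_c = \frac{1}{\langle k^2 \rangle / \langle k \rangle - 1}$    (8)
If we assume our delivery platform to be a generalized random graph, we can say that almost all nodes can receive the transferred messages when each node relays more than $q_c$ of its received messages.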
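Equation (8) can be evaluated directly from an observed degree sequence. The sketch below assumes the degrees are given as a plain list of integers; the function name `percolation_threshold` and the sample degree lists are illustrative and do not come from the paper.

```python
def percolation_threshold(degrees):
    """Equation (8): q_c = 1 / (<k^2>/<k> - 1) = <k> / (<k^2> - <k>)."""
    n = len(degrees)
    if n == 0:
        raise ValueError("degree sequence is empty")
    mean_k = sum(degrees) / n                   # <k>
    mean_k2 = sum(k * k for k in degrees) / n   # <k^2>
    denom = mean_k2 - mean_k
    if denom <= 0:
        raise ValueError("threshold undefined: <k^2> must exceed <k>")
    return mean_k / denom


if __name__ == "__main__":
    # Hypothetical degree sequences, for illustration only.
    star = [4, 1, 1, 1, 1]      # one hub with four leaves
    regular = [3] * 10          # every node has degree 3
    print(percolation_threshold(star))
    print(percolation_threshold(regular))
```

A heavier-tailed degree sequence raises $\langle k^2 \rangle$ relative to $\langle k \rangle$ and therefore lowers $q_c$, which is the property the proposed mechanism relies on when it sets a small relay probability.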
3.2.3 Percolation on Power law Overlay Network
The percolation behavior on the power law overlay network can be derived as follows.

(1) From Origin to the First Neighbor
We consider the number of vertexes that are 1 hop away from the origin.

Fig. 4 First neighbors
[Figure: the origin linked to first neighbors with degrees $k_1^1$, $k_1^2$, $k_1^3$, $k_1^4$ and $k_1^5$.]

Let us employ the expression $k_m^n$, which shows the degree of a vertex that is m hops away from the origin (lower right index) and is the nth vertex out of the set of vertexes at m hops (upper right index). Fig.4 shows an example. Then, remember the assumption that every node in the network has a degree following the power law. This assumption is observed in real P2P networks [6]. When we assume the origin holds connections with $k_0$ vertexes, the number of vertexes that can be reached within 1 hop from the origin's first neighbor is
    $k_1^1 - 1$    (9)
As there are $k_0$ such vertexes as a whole, the total number of reachable vertexes is
    $\sum_{i=1}^{k_0} (k_1^i - 1)$    (10)

(2) Duplicated First Neighbor

Fig. 5 Duplicated path
[Figure: the origin linked to two first neighbors $k_1^i$ and $k_1^j$.]
Then, we reduce the duplicated number in the case of the graph of Fig.5.
In Fig.5, we need to subtract the paths from $k_1^i$ to $k_1^j$ and from $k_1^j$ to $k_1^i$, because both $k_1^i$ and $k_1^j$ are counted as just 1 hop away from the origin. The expected value for the number of vertexes that are 2 hops away from the origin is
    [sum over i of an expression in $k_1^i$, $k_0$ and n; not legible in the source]    (11)
Let D be the total number of vertexes reachable within 1 hop from the origin; it can be expressed from (10) and (11) as
    D = [difference of two sums over i in $k_1^i$, $k_0$ and n; not legible in the source]    (12)
As this number is overestimated, it is enough to employ this equation to calculate the reachability from the origin.

(3) Distant Neighbor
The number of vertexes $V_m$ that are m hops away from the origin can be expressed as
    $V_m$ = [sum over i of an expression in k, D, m and n; not legible in the source]    (13)
By using the recursive equation, the number of vertexes n hops away from the origin can be obtained in the same way. When we want to deliver a certain message to more than 80% of the nodes on the overlay network, applying (4), it is enough to deliver the message so that:
    $\alpha^m \cdot (1 - p)^m \cdot p \ge 0.8$    (14)
In the next chapter, we investigate the dynamics of α by mathematical simulation.

4 Evaluation
4.1 Simulation
In order to evaluate our proposed model, we prepared a random network with a power law link distribution generated by Pajek [18], and implemented our algorithm in the R environment. This overlay network is generated based on the generalized BA model, which presumes that every vertex has at least some baseline probability of gaining an edge, so that edges are generated by a mixture of preferential attachment and uniform attachment [19]. As generating conditions, we set the total number of nodes N=1000, $M_0$=3, TTL=5 to 25 and average degree = 2.0 to 3.5.
In order to evaluate the results, we counted the number of generated messages (= reverse-queries) and the cover rate (= the fraction of nodes that receive the reverse-query) as a function of the relay probability implanted in the reverse-query message.

4.2 Results
Fig.6 through Fig.10 show typical examples of message propagation over this generated overlay network. In the first three examples (Fig.6 to Fig.8), we fixed the average degree of nodes at 2.7 and compared the effects of TTL.
The outlook of those graphs is quite similar, i.e., the propagating message covers almost 80% of the nodes at a certain relay probability, and the total traffic increases linearly. The only difference among those results is the scale of the relay probability, which indicates that our proposed model works even with a lower TTL. In Fig.6 (TTL=5), the cover rate reaches more than 80% at a relay probability of 0.3, while in Fig.8 (TTL=25) a relay probability of 0.07 suffices. Looking at equation (14), it is enough to set the relay probability to only 0.07 to deliver the reverse-query message to 80% of the nodes joining this overlay network.
Fig. 6 Results (1) (TTL=5, Ave.Deg. = 2.7)
Fig. 7 Results (2) (TTL=10, Ave.Deg. = 2.7)
Fig. 8 Results (3) (TTL=25, Ave.Deg. = 2.7)
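The experiment described in 4.1 can be approximated with a short script. The sketch below is not the authors' Pajek and R setup: it grows a graph in which every new node adds a single link chosen by a mixture of preferential and uniform attachment (so the average degree is close to 2.0), and it assumes that each peer handles a given Message ID only once. The names `build_overlay`, `propagate` and `uniform_mix` are hypothetical.

```python
import random
from collections import deque


def build_overlay(n_nodes, m0=3, uniform_mix=0.5, rng=None):
    """Grow an undirected graph by mixing preferential and uniform attachment."""
    if m0 < 2 or n_nodes < m0:
        raise ValueError("need n_nodes >= m0 >= 2")
    rng = rng or random.Random()
    adj = {i: set() for i in range(n_nodes)}
    for i in range(m0):                     # seed: ring over the first m0 nodes
        j = (i + 1) % m0
        adj[i].add(j)
        adj[j].add(i)
    # Each node appears once per incident link, so choice() is degree-biased.
    endpoints = [v for v in range(m0) for _ in adj[v]]
    for new in range(m0, n_nodes):
        if rng.random() < uniform_mix:
            target = rng.randrange(new)     # uniform attachment
        else:
            target = rng.choice(endpoints)  # preferential attachment
        adj[new].add(target)
        adj[target].add(new)
        endpoints.extend((new, target))
    return adj


def propagate(adj, origin, relay_prob, ttl, rng=None):
    """Spread one reverse-query; return (cover rate, number of messages sent)."""
    rng = rng or random.Random()
    reached = {origin}
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if remaining <= 0:
            continue                        # TTL used up: do not forward
        for neighbor in adj[node]:
            # The origin always sends (Step1); peers relay with relay_prob.
            if node != origin and rng.random() >= relay_prob:
                continue
            messages += 1
            if neighbor not in reached:     # first receipt only
                reached.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return len(reached) / len(adj), messages


if __name__ == "__main__":
    rng = random.Random(1)
    overlay = build_overlay(1000, m0=3, uniform_mix=0.5, rng=rng)
    for prob in (0.05, 0.1, 0.3, 0.5, 1.0):
        cover, sent = propagate(overlay, origin=0, relay_prob=prob, ttl=25, rng=rng)
        print(f"relay_prob={prob:.2f} cover_rate={cover:.2f} messages={sent}")
```

Because the graph model and the forwarding rule are simplified, the cover rates printed by this sketch should not be expected to reproduce Fig.6 to Fig.10; it only shows how cover rate and message count can be measured against relay probability and TTL.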
Next, we looked into the effect of the average degree of nodes in Fig.9 and Fig.10. Though the link degree distribution in the real world is reported to be mostly 2 to 3 [20], we also examined an average degree of 3.5 as an extreme case. In both cases, TTL is kept at 25. In Fig.9 (average degree 2.0), the leading edge on the left side is slightly lower than in Fig.8 (average degree 2.7), but this makes no difference in terms of the relay probability that covers 80% of all nodes. In Fig.10 (average degree 3.5), the result looks quite similar to Fig.8 (average degree 2.7). This is because the message propagation is already saturated in the case of Fig.8.

Fig. 9 Results (4) (TTL=25, Ave.Deg. = 2.0)
Fig. 10 Results (5) (TTL=25, Ave.Deg. = 3.5)

4.3 Validity
As we analyzed in the previous chapter, the nodes reachable from the origin within n hops are expressed by equation (14). If we set the relay probability optimally, the message propagates to the whole network, as shown by the simulation results. If we deliberately choose the relay probability, we can obtain the optimized condition that reduces the total number of generated messages while covering almost all the nodes over the network.
If we attach a small video clip to the reverse-query message, it is possible to use this contents delivery mechanism to propagate an electronic flyer to all the clients, making it a quasi-broadcast platform. The nodes do not necessarily need large memories; they just buffer the data for a few minutes and relay it to the next peers who have interests in those contents' attributes.
As we observed in the simulation results, after several steps of relaying messages, the copy of the message body simply goes out from the overlay network, and this reduces the explosive P2P query traffic while maintaining a fairly high client cover rate over our proposed overlay network.
5 Conclusion
In this paper, we employ percolation theory to model the "message propagation" for contents delivery over the unstructured P2P network. We defined a new "reverse-query" message to find any client who needs certain contents, and applied percolation theory to manage the generated traffic. We analyzed the validity of our model through mathematical consideration and examined its dynamics by simulations. We conclude that our proposal is effective in reducing the total amount of generated query traffic while maintaining a high cover rate, which means we can reduce the hardware cost for contents delivery over the Internet.
We will continue our study to apply this model to distributed data management architecture, for instance, toward disaster management, because our model is sustainable against local breakdown.

References:
[1] D.Karger, E.Lehman, T.Leighton, R.Panigrahy, M.Levine and D.Lewin: “Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web”, Proceedings of the 29th annual ACM Symposium on Theory of Computing, ACM Press, pp. 654-663 (1997)
[2] M.Abrams, C.R.Standridge, G.Abdulla, S.Williams and E.A.Fox: “Caching Proxies: Limitations and Potentials”, Proceedings of 4th International World Wide Web Conference, pp. 119-133 (1995)
[3] Y.Chen, R.H.Katz and J.D.Kubiatowicz: “Dynamic Replica Placement for Scalable Content Delivery”, IPTPS 2002, pp. 306-318 (2002)
[4] Y.Li and M.T.Liu: “Optimization of Performance Gain in Content Distribution Networks with Server Replicas”, SAINT2003 Proceedings, pp. 182-189 (2003)
[5] R.Albert and A.L.Barabasi: “Statistical Mechanics of Complex Networks”, Reviews of Modern Physics, Vol. 74, pp. 47-97 (2002)
[6] M.Faloutsos, P.Faloutsos and C.Faloutsos: “On Power-law Relationships of the Internet Topology”, ACM SIGCOMM, pp. 251-262 (1999)
[7] D.Stauffer and A.Aharony: “Introduction to Percolation Theory”, Taylor and Francis, London (1994)
[8] S.Ratnasamy, P.Francis, M.Handley, R.Karp and S.Shenker: “A Scalable Content-addressable Network”, Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, ACM Press, pp. 161-172 (2001)
[9] I.Stoica, R.Morris, D.Karger, M.F.Kaashoek and H.Balakrishnan: “Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications”, Proceedings of ACM SIGCOMM, ACM Press, pp. 149-160 (2001)
[10] A.Rowstron and P.Druschel: “Pastry: Scalable, Decentralized Object Location, and Routing for Large-Scale Peer-to-Peer Systems”, Lecture Notes in Computer Science, Vol. 2218, pp. 329-350 (2001)
[11] K.Hildrum, J.D.Kubiatowicz, S.Rao and B.Y.Zhao: “Distributed Object Location in a Dynamic Network”, Proceedings of the 14th ACM Symposium on Parallel Algorithms and Architectures, pp. 41-52 (2002)
[12] H.Sunaga, T.Hoshiai, S.Kamei and S.Kimura: “Technical Trends in P2P-Based Communications”, IEICE TRANS. COMMUN, Vol. E87-B, No.10, pp. 2831-2846 (2004)
[13] S.Inoue, A.Suzaki, S.Kamei and T.Ohtani: “Fundamental Knowledge for P2P Technology[2]”, UNIX Magazine, Vol.20, No.10, ASCII, pp. 91-117 (2005) (In Japanese)
[14] L.A.Adamic, R.M.Lukose, A.R.Puniyani and B.A.Huberman: “Search in power-law networks”, Physical Review E, Vol. 64, No. 046135 (2001)
[15] N.Sarshar, P.O.Boykin and V.P.Roychowdhury: “Percolation Search in Power Law Networks: Making Unstructured Peer-to-Peer Networks Scalable”, IEEE P2P2004 (2004)
[16] S.Kamei, M.Uchida, T.Mori and Y.Takahashi: “Estimating Scale of Peer-to-Peer File Sharing Applications Using Multi-Layer Partial Measurement”, IEICE Transactions on Communications, Vol. J88-B, No. 11, pp. 2171-2180 (2005) (In Japanese)
[17] N.Masuda and N.Konno: “Science of Complex Network”, Sangyo Tosho (2005) (In Japanese)
[18] Pajek: http://vlado.fmf.uni-lj.si/pub/networks/pajek/
[19] D.M.Pennock, G.W.Flake, S.Lawrence, E.J.Glover and C.L.Giles: “Winners Don't Take All: Characterizing the Competition for Links on the Web”, Proceedings of the National Academy of Sciences, Vol. 99, No.8, pp. 5207-5211 (2002)
[20] M.E.J.Newman: “The Structure and Function of Complex Networks”, SIAM Review, Vol.45, No.2, pp. 167-256 (2003)