count / top-k continuous queries on p2p networks 01/11/2006


Count / Top-k Continuous Queries on P2P Networks

01/11/2006

Outline

Problem Definition
P2P Architecture
Count
Top-K
Experiment Setup
Future Work

Streaming Data in P2P

P2P: dynamically changing topology, large scale, …

Streaming data: continuous, unbounded, rapid, time-varying, noisy

P2P + streaming data: dynamic in both data and topology

Objective and Goal

Objective: issue a continuous query to estimate count and top-K

Goal:
  Lower the communication cost
  Lightweight maintenance
  Approximate answers
  An adaptive and progressive approach

Naïve approach

Flood the overlay continuously

Pros: closer to the exact answer

Cons: network congestion; still not real time
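To make the congestion concern concrete, the following is a minimal sketch that counts the messages produced by one flooding round. The overlay graph, the peer names, and the forwarding rule (forward to every neighbour except the one the query first arrived from, ignore duplicates) are assumptions for illustration; the slides do not specify them.

```python
from collections import deque

def flood_message_count(overlay, source):
    """Count messages sent when `source` floods one query over `overlay`.

    `overlay` maps each peer to the set of its neighbours (undirected).
    Assumed rule: a peer forwards the query to every neighbour except the
    one it first received it from, and ignores duplicate deliveries.
    """
    seen = {source}
    queue = deque([(source, None)])
    messages = 0
    while queue:
        peer, parent = queue.popleft()
        for neighbour in overlay[peer]:
            if neighbour == parent:
                continue
            messages += 1  # one message per forwarded copy
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, peer))
    return messages

# Hypothetical four-peer overlay with five links.
overlay = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"B", "C"},
}
per_round = flood_message_count(overlay, "A")
print(per_round)  # 7 messages for a single round
```

A continuous query repeats this cost every round, which is why the cost grows with the number of overlay links rather than with the amount of data that actually changed.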

The State-of-the-Art

Count:
  Focus on one-time answers in P2P
  Deal with streaming data only

Top-K:
  P2P environment without streaming data
  Distributed environment, not P2P

P2P architecture

Assumption:
  Hierarchical P2P (the focus of this work)
    Super-peer hierarchical structure
    The query issuer is a super-peer
    Super-peers connect with other super-peers
    Each peer belongs to only one super-peer
  Pure unstructured P2P

Big picture

Group

Accumulate information within a group based on the constraint and statistics

Set Constraint

Report changes

Approximated answer

Group in hierarchical P2P

Issuer

Coordinator

Peer

[Figure, three further build steps of the same diagram; surviving node labels: 1, 2, 3, 4]

After partition

Group1, Group2, Group3

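Assume we have N objects and K groups after partition

$$C_{i,0} \ge 0, \quad i = 1, \dots, N$$

(the relation sign is lost in the transcript; non-negativity is assumed)

where
  $i = 1, \dots, N$ (objects)
  $j = 1, \dots, K$ (groups)
  $C_{i,j}$: count at each peer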

User-specified Epsilon

Group1, Group2, Group3

User-specified ε (precision)

Consider a group

P4

P1

P3

P2

Coordinator node

Objects

O1

O2

O3

Each node maintains distribution information for the objects it owns

[Figure: one chart per peer, P1 to P4; axes: object, #; rates R1 to R4]

Initial step: polling

[Figure, two build steps: the coordinator node and peers P1, P2, P3, P4]

Information at coordinator after polling

[Figure: chart held at the coordinator for peers P1, P2, P3, P4; axes: object, #; values shown: 22, 26, 33]

Statistics information

| Object | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/1 | 6/6 | 10/10 | 5/5 | 22 |
| O2 | 11/11 | 13/13 | 5/5 | 9/7 | 36 |
| O3 | 15/15 | 6/6 | 3/3 | 9/9 | 33 |
| R | 0.3 | 0.2 | -0.05 | 0.6 | |
| T | 15 | 15 | 17 | 13 | |

Cell format: latest real value / estimated value

R: maximum changing rate (+/-) of objects in each peer

T: updated time stamp

Δ: change value for each object
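The statistics kept at the coordinator can be held in a small nested mapping. The sketch below is one possible layout, not the authors' data structure; it assumes each "x/y" cell means (latest real value, estimated value), and the variable and function names are hypothetical. Summing the estimated values per object reproduces the last column of the slide's table.

```python
# Each cell is (latest real value, estimated value); this reading of the
# "x/y" cells is an assumption based on the slide's legend.
stats = {
    "O1": {"P1": (1, 1), "P2": (6, 6), "P3": (10, 10), "P4": (5, 5)},
    "O2": {"P1": (11, 11), "P2": (13, 13), "P3": (5, 5), "P4": (9, 7)},
    "O3": {"P1": (15, 15), "P2": (6, 6), "P3": (3, 3), "P4": (9, 9)},
}
# Per-peer rows R (maximum changing rate) and T (updated time stamp).
rate = {"P1": 0.3, "P2": 0.2, "P3": -0.05, "P4": 0.6}
timestamp = {"P1": 15, "P2": 15, "P3": 17, "P4": 13}

def estimated_totals(stats):
    """Sum the estimated value of every object across all peers."""
    return {
        obj: sum(est for _real, est in row.values())
        for obj, row in stats.items()
    }

totals = estimated_totals(stats)
print(totals)  # {'O1': 22, 'O2': 36, 'O3': 33}
```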

Update to Coordinator

(Δ11, Δ21, Δ31)

T2

(Δ12, Δ22, Δ32)

(Δ13, Δ23, Δ33)

Calculate Count

$$C_{i,0}^{(l+1)} = C_{i,0}^{(l)} + \sum_{j=1}^{K} \Delta_{i,j}$$
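Read with the notation defined earlier (N objects, K groups), the count at round l+1 is taken here to be the count at round l plus the changes Δ reported by the K groups. The following is a minimal sketch under that additive reading; the function name and the sample deltas are hypothetical.

```python
def update_counts(counts, group_deltas):
    """Apply one round of updates: new count = old count + sum of group deltas.

    `counts` maps object -> current global count.
    `group_deltas` is a list with one mapping per group (object -> change);
    objects a group does not mention are treated as unchanged.
    """
    new_counts = dict(counts)
    for deltas in group_deltas:
        for obj, delta in deltas.items():
            new_counts[obj] = new_counts.get(obj, 0) + delta
    return new_counts

# Hypothetical round with K = 3 groups reporting changes.
counts = {"O1": 22, "O2": 36, "O3": 33}
group_deltas = [{"O1": 3}, {"O2": -2, "O3": 1}, {"O3": 2}]
updated = update_counts(counts, group_deltas)
print(updated)  # {'O1': 25, 'O2': 34, 'O3': 36}
```

Returning a new mapping instead of mutating `counts` keeps the previous round available for comparison against the precision bound.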

Redistribute Epsilon

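$$\delta_i = f(\varepsilon, \Delta, C_{i,0})$$

$$w_i = \frac{\max(\Delta_i)}{C_{x,0}}, \quad \text{where } x \text{ is the } i\text{-index of } \max(\Delta_i)$$

$$\delta_i = \frac{w_i \, \varepsilon \, C_{x,0}}{\sum w_i}$$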
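The redistribution rule can be sketched as follows. This is one reading of the slide, not a confirmed implementation: it assumes the subscript on w and δ identifies the sender of an update vector, that x is the object with the largest change in that vector, and that "largest" is measured by magnitude because changes can be negative. All names are hypothetical.

```python
def redistribute_epsilon(epsilon, updates, counts):
    """Split the precision budget across senders in proportion to their weights.

    `updates` maps sender -> {object: change}; `counts` maps object -> C_{x,0}.
    Assumes every count used is non-zero.
    """
    weights = {}
    base = {}
    for sender, deltas in updates.items():
        # x: object with the largest change (by magnitude, an assumption).
        x = max(deltas, key=lambda obj: abs(deltas[obj]))
        weights[sender] = abs(deltas[x]) / counts[x]
        base[sender] = counts[x]
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no changes reported; nothing to redistribute")
    return {s: weights[s] * epsilon * base[s] / total for s in weights}

# Hypothetical counts and update vectors from two senders.
counts = {"O1": 20, "O2": 40}
updates = {"G1": {"O1": 2, "O2": 1}, "G2": {"O1": 1, "O2": -8}}
deltas = redistribute_epsilon(0.1, updates, counts)
```

With these numbers the weights are 0.1 and 0.2, so the sender with the more volatile object receives the larger share of the budget.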

Visiting sequence

[Figure: peers P1, P2, P3, P4]

Pick the peers that would violate δ
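The slides do not spell out the test for "would violate δ". As a sketch, assume the coordinator keeps, for every peer and object, the latest real value and its own estimate, and flags a peer when the total gap between the two exceeds δ. The function name, the gap measure, and the ordering (largest gap first) are all assumptions.

```python
def peers_to_visit(peer_stats, delta):
    """Return peers whose estimated drift exceeds `delta`, largest drift first.

    `peer_stats` maps peer -> {object: (latest real value, estimated value)}.
    Drift is the summed absolute gap between estimate and last real value.
    """
    drift = {
        peer: sum(abs(est - real) for real, est in objects.values())
        for peer, objects in peer_stats.items()
    }
    violators = [peer for peer, gap in drift.items() if gap > delta]
    return sorted(violators, key=lambda peer: drift[peer], reverse=True)

# Illustrative values only.
peer_stats = {
    "P1": {"O1": (1, 2), "O2": (11, 13), "O3": (15, 18)},
    "P2": {"O1": (6, 6), "O2": (11, 11), "O3": (5, 5)},
    "P3": {"O1": (10, 9), "O2": (5, 4), "O3": (3, 2)},
    "P4": {"O1": (8, 8), "O2": (6, 6), "O3": (11, 11)},
}
sequence = peers_to_visit(peer_stats, delta=2)
print(sequence)  # ['P1', 'P3']
```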

Update information

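Group

| Object | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/1 | 6/6 | 10/10 | 8/8 | - |
| O2 | 11/11 | 11/11 | 5/5 | 6/6 | - |
| O3 | 15/15 | 5/5 | 3/3 | 11/11 | - |
| R | 0.3 | 0.4 | -0.05 | 0.2 | |
| T | 15 | 30 | 17 | 33 | |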

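For nodes that were not visited

Group

| Object | P1 | P2 | P3 | P4 | Δ |
|---|---|---|---|---|---|
| O1 | 1/2 | 6/6 | 10/9 | 8/8 | 25 |
| O2 | 11/13 | 11/11 | 5/4 | 6/6 | 34 |
| O3 | 15/18 | 5/5 | 3/2 | 11/11 | 36 |
| R | 0.3 | 0.4 | -0.05 | 0.2 | |
| T | 15 | 30 | 17 | 33 | |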

Unannounced leave

Ping P1

P1 is dead

Remove P1's information

[Figure: remaining peers P2, P3, P4]
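The leave-handling steps (ping, detect a dead peer, remove its information) can be sketched as below. The liveness check is passed in as a callable because the slides do not say how the ping is carried out; `is_alive` and the other names are hypothetical.

```python
def drop_dead_peers(peer_stats, is_alive):
    """Ping every known peer and forget those that do not answer.

    `peer_stats` maps peer -> its stored statistics and is modified in place.
    `is_alive` is a caller-supplied ping function returning True or False.
    Returns the list of peers that were removed.
    """
    dead = [peer for peer in peer_stats if not is_alive(peer)]
    for peer in dead:
        del peer_stats[peer]
    return dead

# Illustrative run in which P1 has left without notice.
peer_stats = {"P1": {"O1": 1}, "P2": {"O1": 6}, "P3": {"O1": 10}, "P4": {"O1": 8}}
removed = drop_dead_peers(peer_stats, is_alive=lambda peer: peer != "P1")
print(removed)             # ['P1']
print(sorted(peer_stats))  # ['P2', 'P3', 'P4']
```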

Experiment Setup

Generate synthetic data sets from statistical distributions for:
  Streaming data
  Lifetime of peers

Metrics:
  Message size
  Communication cost
  Response latency
  Result accuracy
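As an illustration of what such a generator might look like, the sketch below draws a peer lifetime and a stream of object arrivals. The slides do not name the distributions, so the choices here (exponential lifetime, exponential inter-arrival gaps, uniformly chosen object) and every parameter are assumptions.

```python
import random

def synthetic_peer(rng, n_objects, mean_lifetime, arrival_rate):
    """Generate one peer: its lifetime and the stream it observes.

    Assumed model: exponential lifetime with mean `mean_lifetime`, arrivals
    separated by exponential gaps with rate `arrival_rate`, and each arrival
    refers to an object index chosen uniformly from range(n_objects).
    """
    lifetime = rng.expovariate(1.0 / mean_lifetime)
    stream = []
    t = rng.expovariate(arrival_rate)
    while t < lifetime:
        stream.append((t, rng.randrange(n_objects)))
        t += rng.expovariate(arrival_rate)
    return lifetime, stream

rng = random.Random(0)  # fixed seed so experiments are repeatable
lifetime, stream = synthetic_peer(rng, n_objects=3, mean_lifetime=100.0, arrival_rate=2.0)
```

Passing the random generator in explicitly lets each experiment run be reproduced from its seed.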

Top-K

Use regression to predict the likely trend of changes

Once an updated result is required, the super-peer only needs to ask the doubtful peers about the doubtful objects

It then updates its counting list and returns the top-k objects
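A minimal sketch of the two building blocks named above: a trend prediction and a top-k selection over the counting list. The slides do not say which regression is used, so an ordinary least-squares line over (time, count) samples is assumed; all function names are hypothetical.

```python
import heapq

def linear_trend(samples):
    """Least-squares (slope, intercept) for a list of (time, count) samples."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_c = sum(c for _, c in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return 0.0, mean_c  # a single time point carries no trend
    cov = sum((t - mean_t) * (c - mean_c) for t, c in samples)
    slope = cov / var_t
    return slope, mean_c - slope * mean_t

def predict(samples, now):
    """Extrapolate the fitted line to time `now`."""
    slope, intercept = linear_trend(samples)
    return slope * now + intercept

def top_k(counting_list, k):
    """Return the k (object, count) pairs with the largest counts."""
    return heapq.nlargest(k, counting_list.items(), key=lambda item: item[1])

predicted = predict([(0, 10), (1, 12), (2, 14)], now=5)
best = top_k({"O1": 25, "O2": 34, "O3": 36}, k=2)
print(predicted)  # 20.0
print(best)       # [('O3', 36), ('O2', 34)]
```

A super-peer could compare such predictions against the last reported values to decide which peers and objects are doubtful; that decision rule is not given in the slides and is left out here.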

Future Work

Connect and recommend latent good friends for each user
  Good friends: those with the same interests (behaviors)

Exploit the currently connected peers to discover good friends bit by bit

Design a system that forms clusters reflecting the current interests of individual peers and connects the peers based on their similarity, using each user's social network

Advantages

Reduce search time and query traffic by using the friends list

By exploiting the differing strengths of arcs/edges/ties (friendship strength), social networks outperform random-walk networks at quickly finding target objects

Example

Level 1

Level 2

[Figure, second example slide: one item is marked as having larger weight than another; labels: Score(Ni)]

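$$\mathrm{score}(N_{i+1}) = \mathrm{sim}(N_i, N_{i+1}) \cdot \mathrm{score}(N_i)$$

(operator signs are lost in the transcript; the product and the i+1 indexing are assumed)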

Similarity
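The score recursion can be illustrated with a short sketch. It assumes the score of the next node is the previous node's score multiplied by the similarity between the two, and it uses Jaccard similarity over interest sets as a stand-in, since the slides do not define the similarity measure. All names and data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (stand-in measure, an assumption)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def propagate_scores(path, interests, start_score=1.0):
    """Scores along `path`: score(next) = sim(prev, next) * score(prev)."""
    scores = [start_score]
    for prev, nxt in zip(path, path[1:]):
        scores.append(jaccard(interests[prev], interests[nxt]) * scores[-1])
    return scores

# Hypothetical interest sets for three peers on one path.
interests = {
    "A": {"jazz", "rock"},
    "B": {"jazz", "rock", "pop"},
    "C": {"pop"},
}
scores = propagate_scores(["A", "B", "C"], interests)
```

Because each similarity is at most 1, scores can only shrink with every hop, so peers reached through weakly similar intermediaries rank lower as friend candidates.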
