Post on 19-Dec-2015

FeedTree: Sharing Web Micronews with Peer-to-Peer Event Notification

D. Sandler, A. Mislove, A. Post, P. Druschel

Presented by: Andrew Sutton

Contributions

• Propose alternative to RSS distribution architecture

• Use peer-to-peer technology to reduce network load

RSS Distribution

• RSS (Really Simple Syndication) - XML format for publishing micronews

• Feed - a source of RSS items

• Content Provider - responsible for publishing RSS feeds

• Reader/Aggregator - user agent responsible for RSS acquisition and display

RSS Distribution Network

• Readers poll content providers

• Request RSS files every ~30 minutes

• Readers can be online, requesting 24/7

Problems with Distribution

• Polling - Requests occur on schedule

• Superfluity - Full response per request

• Stickiness - RSS traffic persists even if web traffic subsides

• 24 Hour Traffic - requests occur all day long

Network Load Example

• Updates occur every 30 minutes

• Slashdot
  – Subscribers: > 17,000
  – RSS file size: ~15KB
  – ~11.6GB/day of RSS data

• Difficult to measure accurately

• No reliable statistics
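
The Slashdot estimate above can be reproduced with a short calculation. This is a rough sketch that assumes every subscriber polls once per 30 minutes around the clock and that KB and GB are 1,024-based units; the function name is illustrative and not from the paper.

```python
def daily_rss_traffic_gib(subscribers: int, file_size_kb: float,
                          poll_interval_min: float) -> float:
    """Estimate daily RSS traffic (GiB) when every subscriber polls on a fixed schedule."""
    polls_per_day = 24 * 60 / poll_interval_min        # 48 polls/day at 30 minutes
    total_kb = subscribers * file_size_kb * polls_per_day
    return total_kb / (1024 * 1024)                    # KiB -> GiB


slashdot = daily_rss_traffic_gib(subscribers=17_000, file_size_kb=15,
                                 poll_interval_min=30)
print(f"{slashdot:.2f} GiB/day")
```

The result lands close to the ~11.6GB/day figure on the slide; the estimate grows linearly with the subscriber count and inversely with the polling interval.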

Related Work

• Improved Polling

• Outsourced Aggregation

Improved Polling

• Improved Polling
  – Restrict reader polling via RSS (e.g., the ttl, skipHours, and skipDays elements)
  – Use HTTP caching to reduce superfluous responses
  – Use compression to reduce response size

• Delta Encoding
  – Only transmit what’s changed [RFC 3229]
  – Seemingly ideal for RSS
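
To make the HTTP caching point concrete, here is a minimal in-memory simulation of a conditional request: the reader sends back the ETag of its cached copy (as an If-None-Match value), and the provider answers 304 Not Modified with an empty body when nothing has changed. No real network is involved, and `etag_for` and `serve_feed` are hypothetical helper names used only for this sketch.

```python
import hashlib
from typing import Optional, Tuple


def etag_for(body: bytes) -> str:
    """Derive a validator from the feed content (a shortened SHA-256 digest here)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def serve_feed(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, bytes]:
    """Return (status, payload); 304 with no payload if the client's copy is current."""
    if if_none_match == etag_for(body):
        return 304, b""
    return 200, body


feed = b"<rss><channel><item>first</item></channel></rss>"

status1, payload1 = serve_feed(feed)                    # first poll: full response
cached_tag = etag_for(payload1)

status2, payload2 = serve_feed(feed, cached_tag)        # unchanged feed: no body sent

updated = feed.replace(b"first", b"second")
status3, payload3 = serve_feed(updated, cached_tag)     # changed feed: full response again
```

Note that the reader still has to poll on schedule; caching only removes the superfluous payload, not the request itself.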

Outsourced Aggregation

• Content Providers supply RPC interface to aggregator

• Readers query a central server instead of the providers

Outsourcing Problems

• Central aggregator allows
  – Single point of failure for readers
  – Censorship of original content
  – Modification of original content (e.g., ads)

• May not be reliable or trustworthy

FeedTree

• Reduce network/provider load

• Use peer-to-peer subscription

• Use hybrid push/pull mechanism for timely distribution/update of micronews

• Signed documents to enable trust

FeedTree Architecture

Pastry

• Enables peer-to-peer networking applications
  – Self-organizing - nodes added, removed dynamically
  – Network overlay - efficiently routes messages among participating nodes

• Applications: Scribe, SplitStream

Overlay Network

• Logical network built on top of actual network

• Can define virtual routes between nodes

• Common approach for P2P networks

Pastry Network

• Based on a circular namespace of node IDs (not tree-oriented)

• Routing
  – Each message is routed toward the node whose ID is closest to the message key
  – Nodes that are not the destination forward the message to the next-closest (proximity) node
  – Routes messages in O(log n) hops
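
The O(log n) bound can be made tangible with a small estimate. The sketch below assumes Pastry's usual analysis, in which each hop resolves one base-2^b digit of the node ID, and assumes the commonly cited b = 4 (base 16); the function name is illustrative.

```python
def expected_hops(num_nodes: int, bits_per_digit: int = 4) -> int:
    """Smallest h such that (2**bits_per_digit)**h >= num_nodes, using integer math only."""
    base = 2 ** bits_per_digit
    hops, reachable = 0, 1
    while reachable < num_nodes:
        reachable *= base      # each extra hop resolves one more ID digit
        hops += 1
    return hops


for n in (16, 50_000, 1_000_000):
    print(n, expected_hops(n))
```

Under these assumptions a 50,000-node overlay needs about 4 hops and a million-node overlay about 5, which is why routing cost grows so slowly with network size.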

Scribe

• Group Communication and Event Notification
  – Highly dynamic groups (based on topics)
  – Uses publish/subscribe model
  – Allows application-level multicast and anycast

• Applications: FeedTree, ???

Scribe Multicast

• Subscribing to a topic
  – Subscriber knows publisher’s node ID
  – Sends “subscribe” message
  – Forwarding nodes become parents in the multicast tree (each keeps track of its children)

• Notification of event
  – Events are multicast to all children of the publisher and of each forwarder

• One multicast tree per topic
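
The subscribe and notify steps above can be sketched as a toy, single-process model. It assumes the overlay route from each subscriber to the tree root is already known and passed in as a list of node names; a real deployment would obtain that path from Pastry routing. `TopicTree` and its methods are hypothetical names, not the Scribe API.

```python
from collections import defaultdict
from typing import Dict, List


class TopicTree:
    """One multicast tree for a single topic; nodes are identified by plain strings."""

    def __init__(self, root: str):
        self.root = root
        self.children = defaultdict(set)   # parent -> set of child nodes
        self.members = {root}              # nodes already attached to the tree
        self.subscribers = set()

    def subscribe(self, route: List[str]) -> None:
        """route lists the overlay hops from the subscriber up to the root, inclusive."""
        self.subscribers.add(route[0])
        for child, parent in zip(route, route[1:]):
            parent_in_tree = parent in self.members
            self.children[parent].add(child)
            self.members.update((child, parent))
            if parent_in_tree:
                break                      # the rest of the path is already in the tree

    def publish(self, event: str) -> Dict[str, str]:
        """Push an event down the tree and return what each subscriber received."""
        delivered, stack = {}, [self.root]
        while stack:
            node = stack.pop()
            if node in self.subscribers:
                delivered[node] = event
            stack.extend(self.children[node])
        return delivered


tree = TopicTree("R")
tree.subscribe(["A", "F1", "F2", "R"])
tree.subscribe(["B", "F1", "F2", "R"])    # stops at F1, which is already a forwarder
tree.subscribe(["C", "F2", "R"])
received = tree.publish("item-1")
```

In this model the root sends a single copy to F2, and the forwarders F1 and F2 fan the event out to A, B, and C, which is how the publisher's load stays constant as subscribers are added.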

FeedTree Distribution

• Subscription
  – Readers subscribe to a feed (i.e., a Scribe topic)

• Publication
  – Each item is given a timestamp and a sequence ID
  – Document is signed with the publisher’s private key

FeedTree Delivery

• Bootstrap Delivery
  – Signed RSS document is multicast to overlay network
  – Essentially, a combined subscribe/request operation

• Incremental Delivery
  – Only new items are multicast
  – If no changes, multicast a “heartbeat”

Missed Deliveries

• If reader is missing sequence numbers
  – Query parent for missing items
  – Nodes must buffer last n items to make redelivery more efficient
  – If items still missing, query publisher
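
The recovery steps above can be sketched as follows. This is an illustrative model, not the FeedTree protocol itself: it assumes sequence numbers start at 1 and are contiguous, that the reader learns the latest sequence number from the most recent item or heartbeat, and that the publisher can be modeled as a node with a large buffer. `Node` and `recover` are hypothetical names.

```python
from collections import deque
from typing import Dict, Set


class Node:
    """Hypothetical overlay node that buffers the last `buffer_size` items."""

    def __init__(self, buffer_size: int):
        self.buffer = deque(maxlen=buffer_size)   # (seq, item); oldest entries drop off

    def store(self, seq: int, item: str) -> None:
        self.buffer.append((seq, item))

    def lookup(self, wanted: Set[int]) -> Dict[int, str]:
        return {seq: item for seq, item in self.buffer if seq in wanted}


def recover(received: Dict[int, str], latest_seq: int,
            parent: Node, publisher: Node) -> Dict[int, str]:
    """Fill gaps in `received`: ask the parent first, then fall back to the publisher."""
    missing = set(range(1, latest_seq + 1)) - set(received)
    recovered = parent.lookup(missing)
    still_missing = missing - set(recovered)
    if still_missing:
        recovered.update(publisher.lookup(still_missing))
    return {**received, **recovered}


publisher, parent = Node(buffer_size=100), Node(buffer_size=2)
for seq in range(1, 6):
    publisher.store(seq, f"item-{seq}")
    parent.store(seq, f"item-{seq}")              # parent only retains items 4 and 5

reader_items = {1: "item-1", 5: "item-5"}         # reader was offline for items 2-4
complete = recover(reader_items, latest_seq=5, parent=parent, publisher=publisher)
```

Here item 4 comes from the parent's buffer, while items 2 and 3 have already been evicted and must be fetched from the publisher, so a larger n shifts more of the repair traffic away from the publisher.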

Publisher Delivery Tree

Network Overhead

• Assume an RSS feed generating 4KB/hour

• Interior node in tree with 16 children forwards < 20B/sec

• However…
  – Unknown how this scales for large providers or large numbers of readers
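
The forwarding figure above follows from simple arithmetic, sketched below. The calculation counts payload bytes only and ignores per-message overhead such as headers, signatures, and heartbeats; it also takes 4KB as 4,096 bytes. The function name and the `feeds` parameter are illustrative.

```python
def forwarding_rate(feed_bytes_per_hour: float, children: int, feeds: int = 1) -> float:
    """Bytes per second an interior node sends when relaying `feeds` feeds to `children` each."""
    per_child = feed_bytes_per_hour / 3600        # bytes/sec for one copy of one feed
    return per_child * children * feeds


one_feed = forwarding_rate(4 * 1024, children=16)
print(f"{one_feed:.1f} B/sec")
```

For a single feed this stays under the 20B/sec bound quoted on the slide. Because the rate is linear in the number of feeds a node forwards for, the open scaling question is mostly about how many trees a given node ends up serving as an interior node.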

Implementation

• Implemented both publisher/reader software (proxies)

• Created testbed website for real distribution of RSS feeds

• No substantial experimentation

http://www.feedtree.net

Advantages/Disadvantages

• Benefits - lower cost of delivering micronews
  – (Significantly) reduced provider load
  – No fear of RSS feeds being “slashdotted”

• Differentiated services - different feeds for headlines/full news

Disadvantages

• Requires specialized software for publishers/subscribers

• P2P denial-of-service attacks
  – Malicious nodes may not forward events

Conclusions

• End users receive better service than currently possible

• Foresee new services based on RSS
  – Storing every single RSS item published on the internet
  – Anonymous feeds using anonymizing P2P routing algorithms
  – Cooperative multicast to distribute realtime media

Evaluation

• Good
  – Appears to be a well-reasoned idea
  – Developed software to test hypothesis
  – Good workshop paper

• What’s needed for research
  – More detailed description of protocol
  – Substantiate claims about performance (i.e., through experiments)

Questions

1. List four problems with the current RSS feed distribution model.

2. Which two of these four problems have the largest impact on network load?

3. How long does it take Pastry to route a message if there are n nodes in the network?

4. Suppose Slashdot has 50,000 RSS subscribers through FeedTree. What is the approximate depth of the multicast tree for the Slashdot topic?

5. Assume that there are 100,000 FeedTree topics on a Pastry network that all update at 4KB/hour. An interior node with 16 children sends about 20B/sec per topic. Suppose an interior node participates in all feeds. What is the expected output (in B/sec) of this node?
