

Collaborative Content Delivery

Werner Vogels, Robbert van Renesse, Ken Birman
Dept. of Computer Science, Cornell University

A peer-to-peer solution for web-based publish/subscribe

.: DRAFT :.


© Copyright 2002 Werner Vogels

Presentation duality …

The case for Collaborative Content Delivery

vs

The innovative technology used to build the system
    Spectacularly scalable technology
    Secure, reliable, robust & fast
    A solution to many distributed management problems


Late night reading

Epidemic Theory of Infectious Diseases and its Applications
N.T.J. Bailey
Hafner Press
Second Edition, 1975


The Problem

Access to real-time information at syndicated news sites is highly inefficient
An estimated 70%-80% of the bandwidth is wasted on redundant transport, both at the consumer and at the publisher
Consumers frequently return to the website to receive timely updates


Isn’t this solved already?

RSS – channels provide summaries for processing by bots
    But the mechanism remains “pull”
    HTTP – Delta should reduce bandwidth cost
News feeds from major vendors
    “push” is the right model for frequently changing data with timely delivery
    Proprietary formats and high fees
    Email summary as cheap alternative
    Still high bandwidth cost at the publisher
Hybrid “push/pull” by organizations exploiting distributed content delivery


Scale is a major obstacle

No coordinated action by syndication sites to provide a shared information push infrastructure
The one-to-many technologies currently in use are inherently not scalable
No technology is available that can deliver data from thousands of publishers to millions of subscribers in real time


We can do better

Current push solutions fail to exploit the collaborative power of the Internet
Ideally the publisher injects one update into the world and all interested subscribers receive it
In this model all consumers collaborate to route the information to the right subscribers
The information arrives at all desktops within tens of seconds after publishing


Peer-to-Peer Solution

P2P is the only approach to a cost-effective, scalable solution
Subscribers weave an ad-hoc infrastructure for subscription-based routing
Scalable, autonomous & decentralized management
High level of robustness and reliability in message delivery
Authentication of publishers


Emerging technologies

Astrolabe, CAN, Chord and Pastry are emerging research technologies
Astrolabe is the furthest along in
    Scalability
    Security integration
    Manageability
    Firewall, proxy and NAT support
Complete technology that we are now using to develop applications


Astrolabe/Mariner

A system for ultra-scalable, distributed state management
    Robust, through the use of epidemic techniques
    Scalable, through the use of information aggregation and fusion
    Secure, through certificates
    Flexible, through secure mobile code
Simulated, Emulated, Tested and Deployed.

Astrolabe

Robust and Scalable Technology for Distributed System Monitoring, Management and Data Mining


Distributed Systems Management

Is extremely important in the deployment of large systems
Scalable management of applications and systems is still a major quest
Management technology needs to be integrated into applications
The management subsystem is often more complex than the application itself


Astrolabe

Information/state management system
Monitors the dynamically changing state of sets of distributed resources
Reports summaries to its consumers
Uses information hierarchies to organize the data
Uses aggregation techniques to continuously compute the summary nodes in the system


Current use of Mariner

Monitor and control applications, systems and infrastructure
Resource discovery
Collaboration management
Coordination of distributed tasks
Edge-caching control
CDN dynamic management


Intuitively

You can see Mariner as a large database with information about the global system
None of this information resides on a single server
Each principal has a row in the virtual database, which it is allowed to update with <attribute, value> pairs
A principal can only directly access the rows of the other nodes in its zone and of the intermediate nodes on its path through the hierarchy to the root


Mariner in a single zone

Name      Load  Weblogic?  SMTP?  Word Version  …
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0

Lowest level in the hierarchies can be nodes, or finer grained if the application requires it
Security key for zone needed to add a new column; user key needed to update row
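
The virtual-database picture can be made concrete with a small Python sketch of a single zone table. Everything named here (`ZoneTable`, `add_column`, `update_row`, and the plain string keys) is an illustrative assumption rather than Mariner's API; a real deployment would check certificates, not compare strings.

```python
class ZoneTable:
    """Toy model of one zone: one row per principal, one column per attribute."""

    def __init__(self, zone_key):
        self._zone_key = zone_key   # hypothetical stand-in for the zone's security key
        self.columns = {"Name"}
        self._user_keys = {}        # row name -> key of the principal that owns the row
        self.rows = {}              # row name -> {attribute: value}

    def add_row(self, name, user_key):
        self._user_keys[name] = user_key
        self.rows[name] = {"Name": name}

    def add_column(self, attribute, zone_key):
        # Adding a column requires the zone key.
        if zone_key != self._zone_key:
            raise PermissionError("zone key required to add a column")
        self.columns.add(attribute)

    def update_row(self, name, user_key, **pairs):
        # A principal may update only its own row, and only existing columns.
        if self._user_keys.get(name) != user_key:
            raise PermissionError("user key does not match the row owner")
        unknown = set(pairs) - self.columns
        if unknown:
            raise KeyError(f"unknown attributes: {sorted(unknown)}")
        self.rows[name].update(pairs)


zone = ZoneTable(zone_key="zk")
zone.add_column("Load", zone_key="zk")
zone.add_row("swift", user_key="k-swift")
zone.update_row("swift", "k-swift", Load=2.0)
```

The two separate checks mirror the two key types mentioned on the slide: the zone key guards the schema, while the user key guards a single row.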


Scalability through Hierarchy

Leaves are organized into zones
Each leaf has a self-managed attribute list
The base zone is the collection of the individual attribute lists of its leaves
Each intermediate zone is the collection of attribute lists constructed by aggregating the information in its child zones
Each list has some basic attributes that Mariner uses to manage itself, such as contact lists, timestamps, etc.


Simple Hierarchy

San Francisco

Name      Load  Weblogic?  SMTP?  Word Version  …
swift     2.0   0          1      6.2
falcon    1.5   1          0      4.1
cardinal  4.5   1          0      6.0

New Jersey

Name      Load  Weblogic?  SMTP?  Word Version  …
gazelle   1.7   0          0      4.5
zebra     3.2   0          1      6.2
gnu       .5    1          0      6.2

Name   Avg Load  WL contact    SMTP contact
SF     2.6       123.45.61.3   123.45.61.17
NJ     1.8       127.16.77.6   127.16.77.11
Paris  3.1       14.66.71.8    14.66.71.12


Information Aggregation

Aggregation functions are programmable
Subset of SQL
Code is embedded in aggregation function certificates (AFC)
Signed certificate is installed into an attribute list
Used to construct (new) attributes in zones of the hierarchy
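
As an illustration of what an SQL-style aggregation over a zone table could look like, the sketch below uses Python's standard `sqlite3` module as a stand-in SQL engine. The table name `zone`, the column names and both queries are assumptions made for this example; the slides do not show Astrolabe's actual SQL subset. The rows are the swift, falcon and cardinal entries from the example zone table in this deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zone (name TEXT, load REAL, weblogic INTEGER, smtp INTEGER)"
)
conn.executemany(
    "INSERT INTO zone VALUES (?, ?, ?, ?)",
    [("swift", 2.0, 0, 1), ("falcon", 1.5, 1, 0), ("cardinal", 4.5, 1, 0)],
)

# Hypothetical aggregation: summarize all rows of the zone into one parent row.
avg_load, smtp_hosts = conn.execute(
    "SELECT AVG(load), SUM(smtp) FROM zone"
).fetchone()

# Hypothetical contact selection: the least-loaded node that runs Weblogic.
wl_contact = conn.execute(
    "SELECT name FROM zone WHERE weblogic = 1 ORDER BY load LIMIT 1"
).fetchone()[0]

conn.close()
```

Each query reduces a whole child table to a handful of values, which is what keeps the parent zone's table small no matter how many leaves sit below it.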


Epidemic Dissemination

Each Astrolabe instance maintains all the zones on its path to the root
No centralized servers for intermediate zones
Consequently each instance has a copy of the root zone
Replication is achieved through gossip techniques
    Guarantees eventual consistency


AFC propagation

1. Output of the AFC includes a copy of itself – results in a copy of the AFC in the parent zone
    Reaches the root and other zone leaves
2. Adoption – check the ancestors' lists to find new AFCs

Spreads through the system in the order of tens of seconds.
Certificates have an expiration date; unless refreshed, aggregation eventually halts.


I’ll skip

Aggregation function details
Mobile code details
Eventual consistency
Certificates
Authentication
Firewalls & NATs


Robustness through Gossip

Use of epidemic techniques to disseminate data and AFCs
    Pure peer-to-peer communication
    Fully autonomous progress
    Actions based on probability theory
    Robustness improves with scale
    Fixed low overhead, independent of scale
    Control as well as data transport


Gossip

Conceptually: each zone periodically picks another zone at random and exchanges the state of those zones
Slightly more complex because there are virtual zones …
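
The conceptual form of gossip can be simulated in a few lines of Python. This sketch deliberately ignores the zone hierarchy and virtual zones: it has 16 flat participants, and the merge rule (keep the entry with the newest timestamp) is an assumption chosen for the example, not something the slides specify.

```python
import random


def merge(a, b):
    """Merge two states; per key, keep the (timestamp, value) entry with the newest timestamp."""
    merged = dict(a)
    for key, entry in b.items():
        if key not in merged or entry[0] > merged[key][0]:
            merged[key] = entry
    return merged


def gossip_round(states, rng):
    """Every participant picks one random peer; both end up holding the merged state."""
    names = list(states)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        combined = merge(states[name], states[peer])
        states[name] = dict(combined)
        states[peer] = dict(combined)


def rounds_until_consistent(states, rng, max_rounds=1000):
    """Run gossip rounds until all participants hold identical state."""
    for round_no in range(1, max_rounds + 1):
        gossip_round(states, rng)
        first = next(iter(states.values()))
        if all(state == first for state in states.values()):
            return round_no
    return None


rng = random.Random(42)
# Each participant starts out knowing only its own row.
states = {f"node{i}": {f"node{i}": (1, f"load-{i}")} for i in range(16)}
rounds = rounds_until_consistent(states, rng)
```

No participant ever contacts more than one peer per round, yet every row reaches every participant after a small number of rounds, which is the eventual-consistency behaviour that gossip replication relies on.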


Gossip target selection

Asia    Cornell  Node1  System
Europe  MIT      Node2  Inventory
USA     USCD     Node3  Monitor
        U-Wash   Node4

1. Each instance updates the issued attribute and evaluates the dependent AFCs

2. An agent (instance) will gossip on behalf of those zones for which it is a contact, at a rate that depends on configuration

3. At each level, pick a child at random from the contact list and exchange state
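
Step 3 can be sketched as follows, using the zone names from the slide. The data layout (one table per level mapping each child zone to its contact addresses), the function name `pick_gossip_targets` and the contact addresses are hypothetical. The sketch also assumes the agent is a contact at every level of its path, which step 2 says need not hold in general.

```python
import random


def pick_gossip_targets(path_tables, own_children, rng):
    """For each level on the agent's path, pick a random contact of a random sibling zone.

    path_tables:  list (root first) of {child_zone: [contact addresses]}
    own_children: the child zone the agent itself belongs to at each level
    """
    targets = []
    for table, own in zip(path_tables, own_children):
        siblings = [zone for zone in table if zone != own and table[zone]]
        if not siblings:
            continue  # nobody to gossip with at this level
        zone = rng.choice(siblings)
        targets.append((zone, rng.choice(table[zone])))
    return targets


path_tables = [
    {"Asia": ["a1"], "Europe": ["e1", "e2"], "USA": ["u1"]},
    {"Cornell": ["c1"], "MIT": ["m1"], "USCD": ["s1"], "U-Wash": ["w1"]},
    {"Node1": ["n1"], "Node2": ["n2"], "Node3": ["n3"], "Node4": ["n4"]},
]
targets = pick_gossip_targets(
    path_tables, ["USA", "Cornell", "Node1"], random.Random(7)
)
```

Each entry in `targets` names one sibling zone and one of its contacts, so a single agent exchanges state with at most one peer per level in each gossip period.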


Membership

Failure detection
    If no update is seen for an agent within time T_fail, remove it from the system
Integration
    After partitions, crashes, etc., renegade trees can be formed
    Use of broadcast, multicast and hints to discover other agents
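
The failure-detection rule amounts to a timeout on the last update seen from each agent. A minimal Python sketch, in which the function name `expire_agents`, the use of seconds as the time unit and the sample timestamps are all assumptions:

```python
def expire_agents(last_seen, now, t_fail):
    """Split agents into those still alive and those with no update within t_fail."""
    alive = {agent: t for agent, t in last_seen.items() if now - t <= t_fail}
    removed = sorted(set(last_seen) - set(alive))
    return alive, removed


# Time of the most recent update seen from each agent (hypothetical values).
last_seen = {"swift": 100.0, "falcon": 40.0, "cardinal": 95.0}
alive, removed = expire_agents(last_seen, now=110.0, t_fail=30.0)
```

Here falcon has been silent for 70 time units, longer than the assumed `t_fail` of 30, so it is the only agent removed.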


Subscription routing

At the leaves, the subscribers store subscription information
Aggregation functions combine the subscriptions of participants into subscriptions for the zone
Publishers use zone.send(subscription, data), which is forwarded if the zone has children that match the subscription
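
The routing rule can be sketched in Python with a small zone tree. The `Zone` class is hypothetical, and aggregating subscriptions as a plain set union is an assumption made for clarity; the slides do not say how subscriptions are encoded. The sketch also recomputes the aggregate on every call, whereas the system described here computes zone summaries continuously.

```python
class Zone:
    def __init__(self, name, children=(), subscriptions=()):
        self.name = name
        self.children = list(children)
        self.local = set(subscriptions)  # subscriptions stored at a leaf
        self.delivered = []              # data that reached this leaf

    def subscriptions(self):
        """Aggregate: a zone's subscriptions are the union of its children's."""
        if not self.children:
            return self.local
        result = set()
        for child in self.children:
            result |= child.subscriptions()
        return result

    def send(self, subscription, data):
        """Forward only into children whose aggregated subscriptions match."""
        if not self.children:
            if subscription in self.local:
                self.delivered.append(data)
            return
        for child in self.children:
            if subscription in child.subscriptions():
                child.send(subscription, data)


swift = Zone("swift", subscriptions={"sports"})
falcon = Zone("falcon", subscriptions={"finance"})
gnu = Zone("gnu", subscriptions={"finance"})
zebra = Zone("zebra")
root = Zone("root", [Zone("SF", [swift, falcon]), Zone("NJ", [gnu, zebra])])

root.send("finance", "rates up")
```

Only falcon and gnu receive the item. A zone whose aggregate does not contain the subscription is never entered, which is how the hierarchy prunes traffic.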


Routing infrastructure

Each zone dynamically selects 2-3 routing nodes, using AFCs that take various load factors into account
These nodes receive news items for the children in their zone
Forwarding is based on the individual subscription information
Redundancy is used to achieve robustness and reliability
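
One way to picture the selection of routing nodes is to rank a zone's members by load and keep the first few. In this Python sketch the function name `select_routing_nodes`, the single `load` value per node and the lowest-load-first rule are all assumptions; the slide says only that AFCs select the nodes using various load factors.

```python
def select_routing_nodes(loads, k=2):
    """Pick the k least-loaded nodes of a zone; ties are broken by name."""
    return sorted(loads, key=lambda name: (loads[name], name))[:k]


zone_loads = {"swift": 2.0, "falcon": 1.5, "cardinal": 4.5}
routers = select_routing_nodes(zone_loads, k=2)
```

Choosing more than one node per zone gives the redundancy mentioned on the slide: if one routing node fails, another can still receive items for the zone.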


Summary