![Page 1: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/1.jpg)
Squirrel: A peer-to-peer web cache
Sitaram Iyer (Rice University)
Joint work with Ant Rowstron (MSR Cambridge) and Peter Druschel (Rice University)
PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA
![Page 2: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/2.jpg)
Web Caching
Reduces: 1. Latency, 2. External traffic, 3. Load on web servers and routers.
Deployed at: Corporate network boundaries, ISPs, Web Servers, etc.
![Page 3: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/3.jpg)
Centralized Web Cache
[Diagram: clients on the corporate LAN, each with a browser and browser cache, share one web cache that sits between the LAN and the web server on the Internet.]
![Page 4: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/4.jpg)
Cooperative Web Cache
[Diagram: clients on the corporate LAN, each with a browser and browser cache, are served by several cooperating web caches between the LAN and the web server on the Internet.]
![Page 5: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/5.jpg)
Decentralized Web Cache
[Diagram: clients on the corporate LAN, each with a browser and browser cache, are linked by Squirrel; there is no dedicated web cache between the LAN and the web server on the Internet.]
![Page 6: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/6.jpg)
Distributed Hash Table
Peer-to-peer location service: Pastry
• Completely decentralized and self-organizing
• Fault-tolerant, scalable, efficient
Operations:
Insert(k,v)
Lookup(k)
[Diagram: <key,value> pairs (k1,v1) through (k6,v6) spread across nodes by the peer-to-peer routing and location substrate.]
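The Insert(k,v) and Lookup(k) operations can be illustrated with a toy, single-process table. This is only a sketch under stated assumptions: the class `ToyDHT`, the helper names, and the use of SHA-1 truncated to 128 bits are illustrative choices, and the home node is found by a global scan rather than by Pastry's hop-by-hop routing.

```python
import hashlib

ID_BITS = 128            # assumed id width for this toy ring
RING = 1 << ID_BITS


def key_id(key: str) -> int:
    """Hash a string onto the ring (SHA-1 reduced to ID_BITS bits)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING


def ring_distance(a: int, b: int) -> int:
    """Distance between two ids on a circular id space."""
    d = abs(a - b)
    return min(d, RING - d)


class ToyDHT:
    """Hypothetical stand-in for a DHT: each key lives on the closest node."""

    def __init__(self, node_names):
        # Node ids are hashes of the node names; each node holds a dict.
        self.nodes = {key_id(name): {} for name in node_names}

    def home_of(self, key: str) -> int:
        # A real substrate routes toward this node in several hops;
        # here we simply scan all node ids.
        kid = key_id(key)
        return min(self.nodes, key=lambda nid: ring_distance(nid, kid))

    def insert(self, key: str, value) -> None:
        self.nodes[self.home_of(key)][key] = value

    def lookup(self, key: str):
        return self.nodes[self.home_of(key)].get(key)
```

Because every participant computes the same hash, any node can find the node responsible for a key without a central index.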
![Page 7: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/7.jpg)
Why peer-to-peer?
1. Cost of dedicated web cache → No additional hardware
2. Administrative effort → Self-organizing network
3. Scaling implies upgrading → Resources grow with clients
![Page 8: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/8.jpg)
Setting
• Corporate LAN
• 100 - 100,000 desktop machines
• Located in a single building or campus
• Each node runs an instance of Squirrel
• Sets it as the browser’s proxy
![Page 9: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/9.jpg)
Mapping Squirrel onto Pastry
Two approaches:
• Home-store
• Directory
![Page 10: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/10.jpg)
Home-store model
[Diagram: a client's request is routed within the LAN, by URL hash, to the home node; the Internet lies beyond the LAN.]
![Page 11: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/11.jpg)
Home-store model
[Diagram: client and home node.]
…that’s how it works!
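The home-store flow can be sketched as follows. This is a simplified illustration, not the authors' implementation: `HomeStoreNode`, `home_node`, `fetch`, and the `origin` callback are hypothetical names, the home is picked by a global scan instead of Pastry routing, and freshness checks (not-modified responses) are left out.

```python
import hashlib


class HomeStoreNode:
    """Hypothetical Squirrel node holding two caches."""

    def __init__(self, name):
        self.name = name
        self.local_cache = {}  # objects this node fetched as a client
        self.home_cache = {}   # objects stored because this node is their home


def url_key(text):
    """Hash a URL (or node name) to an integer key; SHA-1 is an assumption."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def home_node(nodes, url):
    """Pick the node whose id is numerically closest to the URL hash."""
    key = url_key(url)
    return min(nodes, key=lambda n: abs(url_key(n.name) - key))


def fetch(client, nodes, url, origin):
    """Return (object, source), source being 'local', 'home', or 'origin'."""
    if url in client.local_cache:
        return client.local_cache[url], "local"
    home = home_node(nodes, url)
    if url in home.home_cache:
        obj, source = home.home_cache[url], "home"
    else:
        obj, source = origin(url), "origin"  # home fetches over the Internet
        home.home_cache[url] = obj           # home node also stores the object
    client.local_cache[url] = obj
    return obj, source
```

A second client asking for the same URL is served from the home node over the LAN, so only the first request crosses to the Internet.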
![Page 12: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/12.jpg)
Directory model
Client nodes always cache objects locally.
Home-store: home node also stores objects.
Directory: the home node only stores pointers to recent clients, and forwards requests.
![Page 13: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/13.jpg)
Directory model
[Diagram: client and home node on the LAN, with the Internet beyond.]
![Page 14: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/14.jpg)
Directory model
[Diagram: client and home node; the home node randomly chooses an entry from its table.]
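The home node's role in the directory model can be sketched like this. It is a minimal illustration with assumed details: the class name `DirectoryHome`, the `max_entries` limit of 4, and the rule of never delegating a request back to the requester are choices made for the example, not facts taken from the slides.

```python
import random
from collections import deque


class DirectoryHome:
    """Hypothetical home node that keeps pointers to recent clients only."""

    def __init__(self, max_entries=4, rng=None):
        self.max_entries = max_entries   # assumed directory size per URL
        self.rng = rng or random.Random()
        self.directory = {}              # url -> deque of recent client names

    def handle(self, url, requester):
        """Return a delegate client to fetch from, or None to go to origin."""
        entries = self.directory.setdefault(
            url, deque(maxlen=self.max_entries))
        candidates = [c for c in entries if c != requester]
        # Randomly choose an entry from the table, if there is one.
        delegate = self.rng.choice(candidates) if candidates else None
        # The requester will hold a copy afterwards, so remember it.
        if requester not in entries:
            entries.append(requester)
        return delegate
```

The home node never holds the object itself; it only forwards the request to a client that recently fetched it.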
![Page 15: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/15.jpg)
Directory: Advantages
Avoids storing unnecessary copies of objects.
Rapidly changing directory for popular objects seems to improve load balancing.
Home-store scheme can incur hotspots.
![Page 16: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/16.jpg)
Directory: Disadvantages
Cache insertion only happens at clients, so:
• active clients store all the popular objects,
• inactive clients waste most of their storage.
Implications:
1. Reduced cache size.
2. Load imbalance.
![Page 17: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/17.jpg)
Directory: Load spike example
• Web page with many embedded images, or
• Periods of heavy browsing.
Many home nodes point to such clients!
Evaluate …
![Page 18: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/18.jpg)
Trace characteristics
| Microsoft in: | Redmond | Cambridge |
| --- | --- | --- |
| Total duration | 1 day | 31 days |
| Number of clients | 36,782 | 105 |
| Number of HTTP requests | 16.41 million | 0.971 million |
| Peak request rate | 606 req/sec | 186 req/sec |
| Number of objects | 5.13 million | 0.469 million |
| Number of cacheable objects | 2.56 million | 0.226 million |
| Mean cacheable object reuse | 5.4 times | 3.22 times |
![Page 19: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/19.jpg)
Total external traffic
[Plot, Redmond: total external traffic in GB (85 to 105, lower is better) versus per-node cache size in MB (0.001 to 100, log scale). Curves: Directory, Home-store, No web cache, Centralized cache.]
![Page 20: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/20.jpg)
Total external traffic
[Plot, Cambridge: total external traffic in GB (5.5 to 6.1, lower is better) versus per-node cache size in MB (0.001 to 100, log scale). Curves: Directory, Home-store, No web cache, Centralized cache.]
![Page 21: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/21.jpg)
LAN Hops
[Plot, Redmond: % of cacheable requests (0% to 100%) versus total hops within the LAN (0 to 6). Series: Centralized, Home-store, Directory.]
![Page 22: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/22.jpg)
LAN Hops
[Plot, Cambridge: % of cacheable requests (0% to 100%) versus total hops within the LAN (0 to 5). Series: Centralized, Home-store, Directory.]
![Page 23: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/23.jpg)
Load in requests per sec
[Plot, Redmond: number of times observed (1 to 100000, log scale) versus max objects served per node per second (0 to 50). Series: Home-store, Directory.]
![Page 24: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/24.jpg)
Load in requests per sec
[Plot, Cambridge: number of times observed (1 to 1e+07, log scale) versus max objects served per node per second (0 to 50). Series: Home-store, Directory.]
![Page 25: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/25.jpg)
Load in requests per min
[Plot, Redmond: number of times observed (1 to 100, log scale) versus max objects served per node per minute (0 to 350). Series: Home-store, Directory.]
![Page 26: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/26.jpg)
Load in requests per min
[Plot, Cambridge: number of times observed (1 to 10000, log scale) versus max objects served per node per minute (0 to 120). Series: Home-store, Directory.]
![Page 27: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/27.jpg)
Fault tolerance
Sudden node failures result in partial loss of cached content.
Home-store: Proportional to failed nodes.
Directory: More vulnerable.
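The "proportional to failed nodes" claim for home-store can be checked with a small idealized simulation. The function name and parameters are hypothetical, and the model is deliberately crude: uniform random placement stands in for URL hashing, every object counts equally, and copies in client caches are ignored.

```python
import random


def lost_fraction(num_nodes, num_objects, crash_fraction, seed=0):
    """Fraction of cached objects whose home node crashed (idealized)."""
    rng = random.Random(seed)
    # Uniform random placement approximates hashing URLs onto nodes.
    homes = [rng.randrange(num_nodes) for _ in range(num_objects)]
    num_crashed = max(1, round(num_nodes * crash_fraction))
    crashed = set(rng.sample(range(num_nodes), num_crashed))
    lost = sum(1 for h in homes if h in crashed)
    return lost / num_objects


if __name__ == "__main__":
    # With 1% of nodes crashing, the lost fraction should be close to 1%.
    print(lost_fraction(num_nodes=1000, num_objects=100_000,
                        crash_fraction=0.01))
```

In the directory scheme, a crashed node also takes its client-side copies with it while other homes may still point to them, which this toy model does not capture.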
![Page 28: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/28.jpg)
Fault tolerance
If 1% of Squirrel nodes abruptly crash, the fraction of lost cached content is:

| | Home-store | Directory |
| --- | --- | --- |
| Redmond | Mean 1%, Max 1.77% | Mean 1.71%, Max 19.3% |
| Cambridge | Mean 1%, Max 3.52% | Mean 1.65%, Max 9.8% |
![Page 29: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/29.jpg)
Conclusions
• It is possible to decentralize web caching.
• Performance is comparable to a centralized web cache.
• It is better in terms of cost, scalability, and administration effort.
• Under our assumptions, the home-store scheme is superior to the directory scheme.
![Page 30: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/30.jpg)
Other aspects of Squirrel
• Adaptive replication
  – Hotspot avoidance
  – Improved robustness
• Route caching
  – Fewer LAN hops
![Page 31: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/31.jpg)
Thanks.
![Page 32: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/32.jpg)
(backup) Storage utilization
| Redmond | Home-store | Directory |
| --- | --- | --- |
| Total | 97641 MB | 61652 MB |
| Mean per-node | 2.6 MB | 1.6 MB |
| Max per-node | 1664 MB | 1664 MB |
![Page 33: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/33.jpg)
(backup) Fault tolerance
| | Home-store | Directory |
| --- | --- | --- |
| Equations | Mean H/O, Max Hmax/O | Mean (H+S)/O, Max max(Hmax, Smax)/O |
| Redmond | Mean 0.0027%, Max 0.0048% | Mean 0.198%, Max 1.5% |
| Cambridge | Mean 0.95%, Max 3.34% | Mean 1.68%, Max 12.4% |
![Page 34: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/34.jpg)
(backup) Full home-store protocol
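[Diagram: the client's request (req) is routed over the LAN through other nodes to the home node. Case a: the home node returns the object or not-modified. Case b: the home node sends a request over the WAN to the origin server, and the object or not-modified comes back from the origin.]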
![Page 35: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/35.jpg)
(backup) Full directory protocol
[Diagram: request flow among the client, other nodes, the home node (dir), a delegate, and the origin server. Recoverable labels: "a: no dir, go to origin", "a, d: req", "b: not-modified", "c, e: req", "c, e: object", "e: cGET req", "object or not-modified".]
![Page 36: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/36.jpg)
(backup) Peer-to-peer Computing
Decentralize a distributed protocol:
– Scalable
– Self-organizing
– Fault tolerant
– Load balanced
Not automatic!!
![Page 37: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/37.jpg)
Decentralized Web Cache
[Diagram: clients on the LAN, each with a browser and browser cache; the web server is on the Internet.]
![Page 38: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/38.jpg)
Challenge
Decentralized web caching algorithm:
Need to achieve those benefits in practice!
Need to keep overhead unnoticeably low.
Node failures should not become significant.
![Page 39: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/39.jpg)
Peer-to-peer routing, e.g., Pastry
Peer-to-peer object location and routing substrate = Distributed Hash Table.
Reliably maps an object key to a live node.
Routes in log16(N) steps
(e.g. 3-4 steps for 100,000 nodes)
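The step count quoted above can be checked numerically. The helper name `pastry_route_bound` is hypothetical; it only evaluates log base 16 of N and says nothing about how routing is actually carried out.

```python
import math


def pastry_route_bound(num_nodes, base=16):
    """Evaluate log_base(N), the routing step count quoted on the slide."""
    return math.log(num_nodes, base)


if __name__ == "__main__":
    # Network sizes taken from the 100 - 100,000 machine setting.
    for n in (100, 10_000, 100_000):
        print(n, round(pastry_route_bound(n), 2))
```

For 100,000 nodes the expression evaluates to roughly 4.15, in line with the slide's rough figure of 3-4 steps.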
![Page 40: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/40.jpg)
Home-store is better!
Simpler home-store scheme achieves load balancing by hash function randomization.
Directory scheme implicitly relies on access patterns for load distribution.
![Page 41: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/41.jpg)
Directory scheme seems better…
Avoids storing unnecessary copies of objects.
Rapidly changing directory for popular objects results in load balancing.
![Page 42: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/42.jpg)
Interesting difference
Consider:
– Web page with many images, or
– Heavily browsing node
Directory: many pointers to some node.
Home-store: natural load balancing.
Evaluate …
![Page 43: Squirrel: A peer-to- peer web cache Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University) PODC 2002](https://reader035.vdocument.in/reader035/viewer/2022062618/55147546550346494e8b6284/html5/thumbnails/43.jpg)
Fault tolerance
When a single Squirrel node crashes, the fraction of lost cached content is:

| | Home-store | Directory |
| --- | --- | --- |
| Redmond | Mean 0.0027%, Max 0.0048% | Mean 0.2%, Max 1.5% |
| Cambridge | Mean 0.95%, Max 3.34% | Mean 1.7%, Max 12.4% |