CoBlitz: A Scalable Large-file Transfer Service
(COS 461)
KyoungSoo Park, Princeton University
Large-file Distribution
• Increasing demand for large files
- Movies or software releases
- On-line movie downloads
- Linux distributions
• Files are 100MB ~ tens of GB
• One-to-many downloads
How to serve large files to many clients?
• Content Distribution Network (CDN)?
• Peer-to-peer system?
What CDNs Are Optimized For
Most Web files are small (1KB ~ 100KB)
Why Not Web CDNs?
• Whole-file caching in participating proxies
- Optimized for 10KB objects
- 2GB = 200,000 x 10KB
• Memory pressure
- Working sets do not fit in memory
- Disk access is 1000 times slower
• Waste of resources
- More servers needed
- Provisioning is a must
Peer-to-Peer?
• BitTorrent takes up ~30% of Internet bandwidth
1. Download a “torrent” file
2. Contact the tracker
3. Enter the “swarm” network
4. Chunk exchange policy
- Rarest chunk first or random
- Tit-for-tat: incentive to upload
- Optimistic unchoking
5. Validate the checksums
[Diagram labels: torrent, tracker, peers, up, down]
Benefit: extremely good use of resources!
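The "rarest chunk first" policy named in step 4 can be sketched as follows. This is a minimal illustration, not BitTorrent's actual implementation: the function name `rarest_first`, the dictionary layout, and the lowest-index tie-break are assumptions made for the example.

```python
from collections import Counter

def rarest_first(needed, peer_chunks):
    """Pick the needed chunk that the fewest peers hold.

    needed      -- set of chunk indices this client still lacks
    peer_chunks -- dict mapping peer name -> set of chunk indices it holds
    Ties go to the lowest chunk index (an assumption for determinism).
    """
    counts = Counter()
    for chunks in peer_chunks.values():
        # Count only chunks we still need; each peer counts once per chunk.
        counts.update(chunks & needed)
    if not counts:
        return None  # no peer currently offers a chunk we need
    return min(counts, key=lambda c: (counts[c], c))
```

Requesting the rarest chunk first tends to spread scarce chunks through the swarm before the peers holding them leave.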
Peer-to-Peer?
• Custom software
- Deployment is a must
- Configurations needed
• Companies may want managed service
- Handles flash crowds
- Handles long-lived objects
• Performance problem
- Hard to guarantee the service quality
- Others are discussed later
What We’d Like Is
Large-file service with
• No custom client
• No custom server
• No prepositioning
• No rehosting
• No manual provisioning
CoBlitz: Scalable Large-file CDN
• Reducing the problem to small-file CDN
- Split large files into chunks
- Distribute chunks at proxies
- Aggregate memory/cache
- HTTP needs no deployment
• Benefits
- Faster than BitTorrent by 55-86% (~500%)
- One copy from origin serves 43-55 nodes
- Incremental build on existing CDNs
How It Works
Only the reverse proxy (CDN) caches the chunks!
CDN = Redirector + Reverse Proxy
[Diagram: clients, each with an agent, and several CDN nodes holding chunks 1-5. Labels: DNS, coblitz.codeen.org, Origin Server, HTTP RANGE QUERY]
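Splitting a large file into chunks that can be fetched with HTTP range queries can be sketched as below. The slides do not state a chunk size, so `chunk_size` and the helper name `chunk_ranges` are assumptions; only the `bytes=start-end` form of the `Range` header, with inclusive offsets, comes from HTTP itself.

```python
def chunk_ranges(file_size, chunk_size):
    """Yield (chunk index, Range header value) pairs covering the whole file."""
    for index, start in enumerate(range(0, file_size, chunk_size)):
        # HTTP byte ranges are inclusive, so the last byte is end = start + len - 1.
        end = min(start + chunk_size, file_size) - 1
        yield index, f"bytes={start}-{end}"
```

Each pair describes one small request, so a cache that handles small objects well can serve a multi-gigabyte file piece by piece.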
Smart Agent
• Preserves HTTP semantics
• Parallel chunk requests
[Diagram: the agent sits between the client (HTTP) and several CDN nodes and keeps a sliding window of "chunks"; each chunk is marked done, waiting, or no action]
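One way to picture the agent's sliding window is the sketch below: it keeps a bounded number of chunk requests in flight and hands bytes to the client strictly in order. It is an illustration under assumptions, not the agent's real code: `fetch_chunk` is a hypothetical callable that returns the payload of chunk `i`, the window is fixed at `window` chunks, and retries are omitted.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

def fetch_in_order(fetch_chunk, num_chunks, window=4):
    """Yield chunk payloads in order with up to `window` requests in flight."""
    with ThreadPoolExecutor(max_workers=window) as pool:
        in_flight = deque()   # futures for the chunks inside the window, oldest first
        next_chunk = 0
        while next_chunk < num_chunks or in_flight:
            # Fill the window with new parallel requests.
            while next_chunk < num_chunks and len(in_flight) < window:
                in_flight.append(pool.submit(fetch_chunk, next_chunk))
                next_chunk += 1
            # Wait for the oldest chunk so the client sees an ordinary byte stream.
            yield in_flight.popleft().result()
```

Later chunks may finish first ("done") while the head of the window is still "waiting"; the window only slides once the head chunk has been delivered.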
Chunk Indexing: Consistent Hashing
Problem: How to find the node responsible for a specific chunk?
Static hashing: f(x) = some_f(x) % n
But n is dynamic for servers
- a node can go down
- a new node can join
Consistent hashing: F(x) = some_F(x) % N (N is a large but fixed number)
Find a live node k where |F(k) – F(URL)| is minimum
[Diagram: hash ring numbered 0 … N-1 with CDN nodes (proxies) and chunk requests X1, X2, X3, where Xk is a chunk request]
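The lookup rule on this slide can be sketched in a few lines. The choice of SHA-1 for some_F, the value N = 2^32, and the tie-break on node name are assumptions for illustration; the slide only requires that N be large and fixed and that the live node minimizing |F(k) – F(URL)| be chosen.

```python
import hashlib

N = 2 ** 32  # large but fixed hash space (the exact value is an assumption)

def F(key):
    """Map a node name or chunk URL onto [0, N)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % N

def responsible_node(chunk_url, live_nodes):
    """Return the live node k whose F(k) is closest to F(chunk_url)."""
    target = F(chunk_url)
    # The node name breaks exact ties so every caller picks the same node.
    return min(live_nodes, key=lambda node: (abs(F(node) - target), node))
```

Because N does not depend on the number of servers, a node going down only remaps the chunks that node was responsible for; every other chunk keeps its owner, unlike with `% n`.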
Operation & Challenges
• Public service running for over 2.5 years
- http://coblitz.codeen.org:3125/URL
• Challenges
- Scalability & robustness
- Peering set difference
- Load to the origin server
Unilateral Peering
• Independent proximity-aware peering
- Pick “n” close nodes around me
- Cf. BitTorrent picks “n” nodes randomly
• Motivation
- Partial network connectivity
  - Internet2, CANARIE nodes
- Routing disruption
  - Isolated nodes
• Benefits
- No synchronized maintenance problem
- Improve both scalability & robustness
Peering Set Difference
• No perfect clustering by design
• Assumption
- Close nodes share common peers
[Diagram: two nearby nodes with overlapping peer sets. Region labels: "Both can reach", "Only can reach" (one per node)]
Peering Set Difference
• Highly variable app-level RTTs
- 10x more variance than ICMP
• High rate of change in peer set
• Close nodes share less than 50%
- Low cache hit
- Low memory utility
- Excessive load to the origin
Peering Set Difference
• How to fix?
- Avg RTT → min RTT
- Increase # of samples
- Increase # of peers
- Hysteresis
• Close nodes share more than 90%
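The fixes listed here (ranking by minimum RTT and adding hysteresis) can be sketched as a peer-set update rule. This is an illustrative reading, not CoBlitz's code: the function `update_peers`, the `margin` of 0.8, and the data layout are assumptions, and the sketch expects `current` to hold at most `n` peers.

```python
def update_peers(current, rtt_samples, n, margin=0.8):
    """Return the new peer list, closest first.

    current     -- peers chosen in the previous round
    rtt_samples -- dict mapping node -> list of app-level RTT samples (ms)
    n           -- number of peers to keep
    margin      -- hysteresis factor: a newcomer must beat the worst current
                   peer's RTT by this factor to replace it (assumed value)
    """
    # Rank by minimum RTT, which is less noisy than the average.
    score = {node: min(s) for node, s in rtt_samples.items() if s}
    peers = [p for p in current if p in score]
    candidates = sorted((c for c in score if c not in peers),
                        key=lambda c: (score[c], c))
    for cand in candidates:
        if len(peers) < n:
            peers.append(cand)
            continue
        worst = max(peers, key=lambda p: (score[p], p))
        # Hysteresis: swap only when the newcomer is clearly closer.
        if score[cand] < margin * score[worst]:
            peers.remove(worst)
            peers.append(cand)
    return sorted(peers, key=lambda p: (score[p], p))
```

With hysteresis, a node whose RTT is only marginally better does not displace an existing peer, which keeps nearby nodes' peer sets stable and overlapping.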
Reducing Origin Load
• Still have peering set difference
- Critical in traffic to origin
• Proximity-based routing
- Converge exponentially fast
- 3-15% do one more hop
- Implicit overlay tree
• Result
- Origin load reduction by 5x
[Diagram labels: Origin server, Rerun hashing]
Scale Experiments
• Use all live PlanetLab nodes as clients
- 380~400 live nodes at any time
- Simultaneous fetch of 50MB file
• Test scenarios
- Direct
- BitTorrent Total/Core
- CoBlitz uncached/cached/staggered
- Out-of-order numbers in paper
Throughput Distribution
[Figure: CDF of per-node throughput. X axis: Throughput (Kbps), 0 to 10000. Y axis: Fraction of Nodes <= X (CDF), 0 to 1. Curves: Direct, BT-total, BT-core, In-order uncached, In-order staggered, In-order cached, Out-of-order staggered. Annotations: 55-86%; Out-of-order staggered; BT-Core]
Downloading Times
[Figure: CDF of download time. X axis: Download Time (sec), 0 to 2000. Y axis: Fraction of Nodes <= X, 0 to 1. Curves: In-order cached, In-order staggered, In-order uncached, BT-core, BT-total, Direct]
95th percentile: 1000+ secs faster
Why Is BitTorrent Slow?
• In the experiments
- No locality: randomly choose peers
- Chunk indexing: extra communication
  - Trackerless BitTorrent: Kademlia DHT
• In practice
- Upload capacity of typical peers is low
  - 10 to a few 100 Kbps for cable/DSL users
- Tit for tat may not be fair
  - A few high-capacity uploaders help the most
  - BitTyrant [NSDI’07]
Synchronized Workload Congestion
[Diagram label: Origin Server]
Addressing Congestion
• Proximity-based multi-hop routing
- Overlay tree for each chunk
• Dynamic chunk-window resizing
- Increase by 1/log(x), where x is the window size, if a chunk finishes in less than the average time
- Decrease by 1 if a retry kills the first chunk
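The window-resizing rule on this slide can be written out as a small update function. The slide does not give the log base, a lower bound, or what happens at a window of exactly 1 (where 1/log(x) is undefined), so the natural log, `min_window`, and the growth-by-one guard below are assumptions, as is the function name.

```python
import math

def resize_window(window, chunk_time, avg_time, retry_killed_first, min_window=1.0):
    """Return the new chunk-window size after one chunk event."""
    if retry_killed_first:
        # A retry killed the first chunk in the window: back off by one chunk.
        window -= 1.0
    elif chunk_time < avg_time:
        # Faster than average: grow by 1/log(x), so large windows grow slowly.
        # At x = 1 the formula divides by zero; growing by 1 there is an assumption.
        window += 1.0 / math.log(window) if window > 1.0 else 1.0
    return max(min_window, window)
```

The asymmetry is deliberate in spirit: growth slows as the window gets larger, while a retry removes a whole chunk at once, so many agents fetching at the same moment back off quickly.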
Number of Failures
[Bar chart: Failure Percentage (%), axis 0 to 6, for Direct, BitTorrent, CoBlitz. Bar values as listed: 4.3, 5.7, 2.1]
Performance After Flash Crowds
[Figure: X axis: Throughput (Kbps), 0 to 35000. Y axis: Fraction of Nodes > X, 0 to 1. Curves: BitTorrent, In-order CoBlitz]
BitTorrent: 20% > 5Mbps
CoBlitz: 70+% > 5Mbps
Data Reuse
7 fetches for 400 nodes, 98% cache hit
[Bar chart: Utility (# of nodes served / copy), axis 0 to 60, for Shark, BitTorrent, CoBlitz. Bar values as listed: 7.7, 35, 55]
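The "7 fetches for 400 nodes, 98% cache hit" figure can be checked with a line of arithmetic each. The definitions below (utility as nodes per origin fetch, hit rate as the share of nodes not served by an origin fetch) are one plausible reading of the slide rather than the authors' stated formulas, and `reuse_stats` is a name made up for this sketch.

```python
def reuse_stats(nodes_served, origin_fetches):
    """Return (nodes served per origin copy, cache hit rate)."""
    utility = nodes_served / origin_fetches
    hit_rate = 1 - origin_fetches / nodes_served
    return utility, hit_rate

utility, hit_rate = reuse_stats(400, 7)
print(f"{utility:.1f} nodes per copy, {hit_rate:.2%} cache hit")
```

Under this reading, 400 nodes and 7 fetches give about 57 nodes per copy and a hit rate just above 98%, in the same range as the roughly 55 plotted for CoBlitz; the gap may come from the live node count being below 400.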
Real-world Usage
• 1-2 Terabytes/day
• Fedora Core official mirror
- US-East/West, England, Germany, Korea, Japan
• CiteSeer repository (50,000+ links)
• University Channel (podcast/video)
• Public lecture distribution by PU OIT
• Popular game patch distribution
• PlanetLab researchers
- Stork (U of Arizona) + ~10 others
Fedora Core 6 Release
• October 24th, 2006
• Peak throughput: 1.44Gbps
[Graph annotations: Release point 10am; 1 G; Origin Server 30-40Mbps]
On Fedora Core Mirror List
• Many people complained about I/O
- Peaking at 500Mbps out of 2Gbps
- 2 Sun x4200 w/ dual Opterons, 2G mem
- 2.5 TB SATA-based SAN
- All ISOs in disk cache or in-memory FS
• CoBlitz uses 100MB mem per node
- Many PL node disks are IDE
- Most nodes are BW capped at 10Mbps
Conclusion
• Scalable large-file transfer service
• Evolution under real traffic
- Up and running 24/7 for over 2.5 years
- Unilateral peering, multi-hop routing, window size adjustment
• Better performance than P2P
- Better throughput, download time
- Far less origin traffic
Thank you!
More information: http://codeen.cs.princeton.edu/coblitz/
How to use: http://coblitz.codeen.org:3125/URL*
*Some content restrictions apply. See Web site for details. Contact me if you want full access!