Replica Placement Strategy for Wide-Area Storage Systems
Byung-Gon Chun and Hakim Weatherspoon
RADS Final Presentation, December 9, 2004
Environment
• Store large quantities of data persistently and availably
• Storage strategy
  – Redundancy: duplicate data to protect against data loss
  – Place data throughout the wide area for availability and durability
    • Avoid correlated failures
  – Continuously repair lost redundancy as needed
    • Detect permanent node failures and trigger data recovery
Assumptions
• Data is maintained on nodes, in the wide area, at well-maintained sites
• Sites contribute resources
  – Nodes (storage, CPU)
  – Network bandwidth
• Nodes collectively maintain data
  – Adaptive: constant change; self-organizing, self-maintaining
• Costs
  – Data recovery: the process of maintaining data availability
  – Limit the wide-area bandwidth used to maintain data
Challenge
• Avoid correlated failures/downtime with careful data placement
  – Minimize the cost of resources used to maintain data
    • Storage
    • Bandwidth
  – Maximize
    • Data availability
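To see why extra redundancy helps availability at all, consider the baseline in which replicas fail independently; availability then improves geometrically with the replica count. Correlated failures are precisely what breaks this independence assumption, which is the motivation for careful placement. A minimal sketch (the function name and parameters are illustrative, not from the talk):

```python
def availability(n, p):
    """P(data is available) = P(at least one of n replicas is up),
    assuming each replica is down independently with probability p.
    Correlated failures violate this independence assumption,
    which is why replica placement matters."""
    return 1 - p ** n

# With p = 0.1: one replica gives 0.9 availability,
# three replicas give 0.999.
```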
Outline
• Analysis of correlated failures
  – Show that correlated failures exist and are significant
    • Effects of a common subnet (administrative area, geographic location, etc.)
  – Pick a threshold and extra redundancy
• Effects of extra redundancy
  – Vary extra redundancy
  – Compare random, random with constraints, and oracle placement
  – Show that the margin between oracle and random is small
Analysis of PlanetLab Trace Characteristics
• Trace-driven simulation
• Model maintaining data on PlanetLab
• Create a trace using all-pairs pings*
  – Collected from February 16, 2003 to October 6, 2004
• Measure
  – Correlated failures vs. time
  – Probability of k nodes down simultaneously
  – {5th percentile, median} number of available replicas vs. time
  – Cumulative number of triggered data recoveries vs. time

* Jeremy Stribling, http://infospect.planet-lab.org/pings
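One of the measured quantities, the probability of k nodes being down simultaneously, can be computed directly from the ping trace. A minimal sketch, assuming the all-pairs-ping data has already been reduced to a per-timeslot set of down nodes (this preprocessing and the names are assumptions, not from the talk):

```python
from collections import Counter

def simultaneous_down_distribution(trace):
    """trace: a list where each entry is the set of nodes observed
    down during one timeslot (derived from all-pairs pings).
    Returns P(exactly k nodes down simultaneously) over the trace."""
    counts = Counter(len(down) for down in trace)
    total = len(trace)
    return {k: c / total for k, c in sorted(counts.items())}
```

Under independent failures this distribution would be close to binomial; a heavy tail (many nodes down at once) is the signature of correlated failures.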
Analysis of PlanetLab II: Correlated Failures [figure]
Analysis I: Node Characteristics [figure]
Analysis II: Correlated Failures [figure]
Correlated Failures [figure]
Correlated Failures (machines with downtime ≤ 1000 slots) [figure]
Availability Trace [figure]
Replica Placement Strategies
• Random
• RandomSite
  – Avoid placing multiple replicas in the same site
  – A site in PlanetLab is identified by a two-byte (/16) IP address prefix
• RandomBlacklist
  – Avoid machines on a blacklist: the top k machines with the longest downtime
• RandomSiteBlacklist
  – Combine RandomSite and RandomBlacklist
Comparison of simple strategies (m=1, th=9, n=14, |blacklist|=35)

Strategy              # of repairs   Improvement (%)
Random                9075           -
RandomSite            8581           5.44
RandomBlacklist       8691           4.23
RandomSiteBlacklist   8160           10.08
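The improvement column is relative to the Random baseline, i.e. (baseline − repairs) / baseline × 100. Reproducing the table's numbers:

```python
repairs = {"Random": 9075, "RandomSite": 8581,
           "RandomBlacklist": 8691, "RandomSiteBlacklist": 8160}
baseline = repairs["Random"]

# Percent reduction in repairs relative to the Random baseline.
improvement = {s: round(100 * (baseline - r) / baseline, 2)
               for s, r in repairs.items() if s != "Random"}
# improvement == {'RandomSite': 5.44, 'RandomBlacklist': 4.23,
#                 'RandomSiteBlacklist': 10.08}
```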
Simulation Setup
• Placement algorithm
  – Random vs. Oracle
  – Oracle strategies
    • Max-Lifetime-Availability
    • Min-Max-TTR, Min-Sum-TTR, Min-Mean-TTR
• Simulation parameters
  – Replication m = 1, threshold th = 9, total replicas n = 15
  – Initial repository size: 2 TB
  – Write rate: 1 Kbps and 10 Kbps per node
    • 300 storage nodes
    • The system grows at 3 TB and 30 TB per year, respectively
• Metrics
  – Number of available nodes
  – Number of data repairs
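One plausible reading of the m/th/n parameters is a low-watermark repair rule: when the number of reachable replicas falls to the threshold th, recreate replicas until the count is back at n. The slides do not state this rule explicitly, so the sketch below is an interpretation:

```python
def maintain(available, n=15, th=9):
    """Low-watermark repair, one interpretation of (th, n):
    if reachable replicas have dropped to th or below, repair
    back up to n. Returns (new_count, replicas_recreated)."""
    if available <= th:
        return n, n - available  # recreate the missing replicas
    return available, 0          # above the threshold: do nothing
```

The gap between n and th is the "extra redundancy" that lets the system ride out short, correlated downtime bursts without triggering wide-area repair traffic.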
Comparison of simple strategies (m=1, th=9) [figure]
Results: Random Placement (1 Kbps) [figure]
Results: Oracle Max-Lifetime-Availability (1 Kbps) [figure]
Results: Breakdown of Random (1 Kbps) [figure]
Results: Random (10 Kbps) [figure]
Results: Breakdown of Random (10 Kbps) [figure]
Conclusion
• Correlated downtimes do exist.
• Random placement is sufficient
  – A minimum data-availability threshold plus extra redundancy absorbs most of the correlation.