Replica Placement Strategy for Wide-Area Storage Systems
Byung-Gon Chun and Hakim Weatherspoon
RADS Final Presentation, December 9, 2004
Environment
• Store large quantities of data persistently and availably
• Storage strategy
  – Redundancy: duplicate data to protect against data loss
  – Place data throughout the wide area for availability and durability
    • Avoid correlated failures
  – Continuously repair lost redundancy as needed
    • Detect permanent node failures and trigger data recovery
Assumptions
• Data is maintained on nodes, in the wide area, at well-maintained sites
• Sites contribute resources
  – Nodes (storage, CPU)
  – Network bandwidth
• Nodes collectively maintain data
  – Adaptive: constant change; self-organizing, self-maintaining
• Costs
  – Data recovery: the process of maintaining data availability
  – Limit the wide-area bandwidth used to maintain data
Challenge
• Avoid correlated failures/downtime with careful data placement
  – Minimize the cost of resources used to maintain data
    • Storage
    • Bandwidth
  – Maximize
    • Data availability
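To see why extra redundancy helps availability at all, consider the baseline in which replicas fail independently; availability then improves geometrically with the replica count. Correlated failures are precisely what breaks this independence assumption, which is the motivation for careful placement. A minimal sketch (the function name and parameters are illustrative, not from the talk):

```python
def availability(n, p):
    """P(data is available) = P(at least one of n replicas is up),
    assuming each replica is down independently with probability p.
    Correlated failures violate this independence assumption,
    which is why replica placement matters."""
    return 1 - p ** n

# With p = 0.1: one replica gives 0.9 availability,
# three replicas give 0.999.
```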
Outline
• Analysis of correlated failures
  – Show that correlated failures exist and are significant
    • Effects of a common subnet (administrative area, geographic location, etc.)
  – Pick a threshold and extra redundancy
• Effects of extra redundancy
  – Vary extra redundancy
  – Compare random, random with constraints, and oracle placement
  – Show that the margin between oracle and random is small
Analysis of PlanetLab Trace Characteristics
• Trace-driven simulation
• Model maintaining data on PlanetLab
• Create a trace using all-pairs pings*
  – Collected from February 16, 2003 to October 6, 2004
• Measure
  – Correlated failures vs. time
  – Probability of k nodes down simultaneously
  – {5th percentile, median} number of available replicas vs. time
  – Cumulative number of triggered data recoveries vs. time

* Jeremy Stribling, http://infospect.planet-lab.org/pings
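One of the measured quantities, the probability of k nodes being down simultaneously, can be computed directly from the ping trace. A minimal sketch, assuming the all-pairs-ping data has already been reduced to a per-timeslot set of down nodes (this preprocessing and the names are assumptions, not from the talk):

```python
from collections import Counter

def simultaneous_down_distribution(trace):
    """trace: a list where each entry is the set of nodes observed
    down during one timeslot (derived from all-pairs pings).
    Returns P(exactly k nodes down simultaneously) over the trace."""
    counts = Counter(len(down) for down in trace)
    total = len(trace)
    return {k: c / total for k, c in sorted(counts.items())}
```

Under independent failures this distribution would be close to binomial; a heavy tail (many nodes down at once) is the signature of correlated failures.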
Analysis of PlanetLab II: Correlated Failures [figure]
Analysis I: Node Characteristics [figure]
Analysis II: Correlated Failures [figure]
Correlated Failures [figure]
Correlated Failures (machines with downtime ≤ 1000 slots) [figure]
Availability Trace [figure]
Replica Placement Strategies
• Random
• RandomSite
  – Avoid placing multiple replicas in the same site
  – A site in PlanetLab is identified by a two-byte (/16) IP address prefix
• RandomBlacklist
  – Avoid machines on a blacklist: the top k machines with the longest downtime
• RandomSiteBlacklist
  – Combine RandomSite and RandomBlacklist
Comparison of simple strategies (m=1, th=9, n=14, |blacklist|=35)

Strategy              # of repairs   Improvement (%)
Random                9075           -
RandomSite            8581           5.44
RandomBlacklist       8691           4.23
RandomSiteBlacklist   8160           10.08
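The improvement column is relative to the Random baseline, i.e. (baseline − repairs) / baseline × 100. Reproducing the table's numbers:

```python
repairs = {"Random": 9075, "RandomSite": 8581,
           "RandomBlacklist": 8691, "RandomSiteBlacklist": 8160}
baseline = repairs["Random"]

# Percent reduction in repairs relative to the Random baseline.
improvement = {s: round(100 * (baseline - r) / baseline, 2)
               for s, r in repairs.items() if s != "Random"}
# improvement == {'RandomSite': 5.44, 'RandomBlacklist': 4.23,
#                 'RandomSiteBlacklist': 10.08}
```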
Simulation Setup
• Placement algorithm
  – Random vs. Oracle
  – Oracle strategies
    • Max-Lifetime-Availability
    • Min-Max-TTR, Min-Sum-TTR, Min-Mean-TTR
• Simulation parameters
  – Replication m = 1, threshold th = 9, total replicas n = 15
  – Initial repository size: 2 TB
  – Write rate: 1 Kbps and 10 Kbps per node
    • 300 storage nodes
    • The system grows at 3 TB and 30 TB per year, respectively
• Metrics
  – Number of available nodes
  – Number of data repairs
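One plausible reading of the m/th/n parameters is a low-watermark repair rule: when the number of reachable replicas falls to the threshold th, recreate replicas until the count is back at n. The slides do not state this rule explicitly, so the sketch below is an interpretation:

```python
def maintain(available, n=15, th=9):
    """Low-watermark repair, one interpretation of (th, n):
    if reachable replicas have dropped to th or below, repair
    back up to n. Returns (new_count, replicas_recreated)."""
    if available <= th:
        return n, n - available  # recreate the missing replicas
    return available, 0          # above the threshold: do nothing
```

The gap between n and th is the "extra redundancy" that lets the system ride out short, correlated downtime bursts without triggering wide-area repair traffic.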
Comparison of simple strategies (m=1, th=9) [figure]
Results: Random Placement (1 Kbps) [figure]
Results: Oracle Max-Lifetime-Availability (1 Kbps) [figure]
Results: Breakdown of Random (1 Kbps) [figure]
Results: Random (10 Kbps) [figure]
Results: Breakdown of Random (10 Kbps) [figure]
Conclusion
• Correlated downtimes do exist.
• Random placement is sufficient
  – A minimum data-availability threshold plus extra redundancy absorbs most of the correlation.