![Page 1: Volley: Automated Data Placement for Geo-Distributed Cloud ...resources.mpi-inf.mpg.de/d5/teaching/ws10_11/cc/presentations/20… · Volley: Automated Data Placement for Geo-Distributed](https://reader030.vdocument.in/reader030/viewer/2022041106/5f08ae6c7e708231d42334d4/html5/thumbnails/1.jpg)
Volley: Automated Data Placement for Geo-Distributed Cloud Services
Authors: Sharad Agarwal, John Dunagan, Navendu Jain, Stefan Saroiu, Alec Wolman, Harbinder Bhogan
7th USENIX Symposium on Networked Systems Design and Implementation (NSDI 2010)
Martin Beyer
December 14, 2010
1 / 45Volley: Automated Data Placement for Geo-Distributed Cloud Services
Introduction
Cloud Services
Users from all continents want to collaborate through cloud services
Users do not accept high latencies
Cloud services deal with highly dynamic data (e.g. Facebook wall)
Placement of user & application data
Introduction
Worldwide Distribution
Serve all users from the best datacenter (DC) with respect to user-perceived latency
Cloud service providers use many geographically dispersed DCs
What data to store at which datacenter?
Interdependencies between data items
Minimize operational cost of the datacenters
Inter-DC traffic due to data sharing or interdependencies
Provisioned capacity at each DC
Introduction
Replication
Data replication for fault tolerance: hardware failures, natural disasters
Replication for availability: large scale outages, no single point of failure
Replicas need to communicate frequently: synchronization, ensure consistency
Introduction
Impacts of Data Placement
Latency increases between distant locations
→ Move data near the users that most frequently access it
Amount of inter-DC traffic influences bandwidth costs
→ Colocate data items
Capacity skew among DCs increases hardware costs
→ Distribute data uniformly among DCs
Introduction
Approaches to Data Placement
How to find a good data placement that reduces latency and operational cost?
Full replication at each datacenter: lowest latency for the users, excessive costs for DC operators
Single DC holds all data: no inter-DC traffic, many unhappy users due to high latency
Partition data across multiple DCs
Challenging problem to find good placement
Need to analyze patterns of data access
Process on the order of 10^8 objects
1 Introduction
2 Analysis of Cloud Services
3 Data Placement
4 Evaluation
5 Conclusion
Analysis of Cloud Services
Challenges of Data Placement
Cloud services deal with highly dynamic data
High update rates lead to stale replicas
Updates need to be visible worldwide
Collaboration around the world
Users work together on a shared data item
Data interdependencies
Publish-Subscribe mechanisms; “Friend of a friend”
Can be modeled as dependency graph
Generate huge data sets
Need solutions for efficient analysis of the dependency graph
Analysis of Cloud Services
Challenges of Data Placement (cont’d)
Applications change frequently
Need to continuously adapt to changing usage patterns
Increasing user mobility
When should data be migrated to a new location?
Infrastructure can change
Capacity limits or latencies between DCs may change
Analysis of Cloud Services
Network Traces
Datacenter applications collect workload traces
Month-long log from Live Mesh and Live Messenger
Analysis focuses on shared data, data interdependencies, and user mobility
Analysis of Cloud Services
Live Mesh
File & Application synchronization
Cloud storage
Data feeds
Analysis of Cloud Services
Live Messenger
Instant Messaging
Video conferencing
Continuous group conversation
Contact status updates
Analysis of Cloud Services
Facebook
Facebook wall
Connects users to all of their friends
Users can receive updates via RSS feeds
Interdependencies between walls and RSS feeds
[Figure: Client 1 and Client 2 with Facebook Wall 1, Facebook Wall 2, Facebook RSS Feed 1, and Facebook RSS Feed 2]
Analysis of Cloud Services
Data Sharing in Live Mesh
Clients access Live Mesh through Web frontend
Update to device connectivity status
Multiple queue items can subscribe to a publish-subscribe object
[Figure: Client 1 and Client 2, each connected to a Frontend, with Queue, Pub-Sub, and Device Connectivity objects]
Analysis of Cloud Services
Data Interdependencies in Live Mesh
Change to a document creates an update message at Publish-Subscribe objects
Queue objects receive a copy of that message
Long tail of very popular data items
Analysis of Cloud Services
Geographically Distant Data Sharing
Compute sharing centroid for each data item
Weighted mean of the locations of the users that access it
Large amount of sharing occurs between distant clients
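To make "sharing between distant clients" measurable, one option is to compute the weighted mean distance between a data item's sharing centroid and the clients that access it. The following is a minimal sketch under stated assumptions: the haversine formula, the Earth radius constant, and the names `great_circle_km` and `mean_distance_to_centroid` are illustrative choices, not taken from the slides, and the centroid is assumed to be already computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation chosen for this sketch


def great_circle_km(p, q):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def mean_distance_to_centroid(centroid, accesses):
    """Weighted mean distance (km) from the centroid to the accessing clients.

    accesses: list of ((lat, lon), weight) pairs; the weight is assumed to be
    the amount of communication between that client and the data item.
    """
    total = sum(weight for _, weight in accesses)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight * great_circle_km(centroid, location)
               for location, weight in accesses) / total
```

A large value for an item indicates that the clients sharing it are far apart, so no single location is close to all of them.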
Analysis of Cloud Services
Client Mobility
Geo-location database quova.com
Maps IP address to geographic region
Centroid computed from all locations where the client contacted the service
Large movements in the Live Messenger trace
Data Placement
Known Heuristics
Determine user location
Move data to closest datacenter for that user with the goal of reducing user latency
Ignores major sources of operational costs: WAN bandwidth between DCs, overprovisioned datacenter capacity due to skewed load
Data Placement
Volley’s Approach
Volley optimizes data placement for latency while allowing operators to limit operational costs
Correlates application logs into a graph that captures a global view of data accesses
Analyzes data interdependencies and user behavior within cloud services
Computes a data placement and outputs recommendations for when data should be migrated
Data Placement
Volley in a Nutshell
Input: logs & models
Datacenter logs in distributed storage system
Models for cost, capacity and latency
Constraints on placement
Iterative optimization algorithm
Distributed computing framework
Output: migration recommendations
[Figure: logs from DC 1, DC 2, and DC 3 in distributed storage, together with the capacity, cost, and latency models and the constraints on placement, feed into Volley, which outputs a proposal]
Data Placement
Requirements for Logs
Capture logical flow of control across components
→ Construct dependency graph
Provide unique identifiers for data items (GUID) and users (IP)
Request log record:
Timestamp
Source entity: IP or GUID
Request size
Destination entity: GUID
Transaction ID: trace request in logs
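The request log record and the dependency graph built from it could be sketched as follows. This is a hypothetical illustration, not Volley's actual log schema: the field types, the byte unit for the request size, and the choice to weight each edge by the number of requests are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestRecord:
    """One request log record with the fields listed on the slide."""
    timestamp: float       # seconds since epoch (assumed representation)
    source: str            # IP address of a client or GUID of a data item
    request_size: int      # assumed to be in bytes
    destination: str       # GUID of the data item being accessed
    transaction_id: str    # ties together the log entries of one request


def build_dependency_graph(records):
    """Return edge weights keyed by (source, destination).

    The weight here is simply the number of requests on that edge, which is
    an assumption made for this sketch.
    """
    edges = Counter()
    for record in records:
        edges[(record.source, record.destination)] += 1
    return edges
```

Edges whose source is an IP address connect clients to data items; edges whose source is a GUID capture interdependencies between data items.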
Data Placement
Logged Events
Live Mesh trace: changes to files, device connectivity
Live Messenger trace: login/logoff events, participants in each conversation, number of messages between users
Data Placement
Datacenter Cost Model
Cost per transaction, such as RAM, disk and CPU
Capacity model for all DCs, e.g. amount of data stored at each DC
Cost model for all DCs
Models change on slower time scales
→ Specify the hardware provisioning in DCs to run the service
→ Required network bandwidth
→ Charging model for service use of network bandwidth
Data Placement
Additional Inputs
Location of each data item
Model of latency
Network coordinate system: n-dimensional space specified by the model
Locations of nodes → predicted latency
Constraints on placement
Replication at distant datacenters
Legal constraints
→ Together, these inputs allow Volley to make placement decisions
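As a rough illustration of how a network coordinate system turns node locations into a predicted latency, the sketch below uses the Euclidean distance between two n-dimensional coordinates. Both the distance function and the millisecond unit are assumptions for this example; the slides only state that the space is specified by the latency model.

```python
import math


def predicted_latency_ms(coord_a, coord_b):
    """Predict latency as the Euclidean distance between two coordinates.

    Both coordinates must have the same number of dimensions. Treating the
    distance as milliseconds is an assumption of this sketch.
    """
    if len(coord_a) != len(coord_b):
        raise ValueError("coordinates must have the same dimension")
    return math.dist(coord_a, coord_b)
```

With such a model, a placement can be evaluated analytically without sending any real traffic between clients and datacenters.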
Data Placement
Algorithm
Phase 1: Compute initial placement
Phase 2: Iteratively move data to reduce latency
Phase 3: Collapse data to datacenters
Data Placement
Phase 1: Initial Placement
Map data items to the weighted average of the geographic coordinates of the clients that access them
Weight = amount of communication client↔data item
∀ data items: compute weighted spherical mean
Interpolate between 2 initial points (clients)
Average in additional points
Some data items may never be accessed directly by a client
Move them near the already fixed data items
Ignores data interdependencies!
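The incremental scheme described above (interpolate between the first two points, then average in each additional point) can be sketched with spherical linear interpolation on unit vectors. This is an illustrative approximation under stated assumptions: the slerp formula and the helper names `_to_vec`, `_to_latlon`, `_slerp`, and `weighted_spherical_mean` are choices made for this example, not the exact procedure from the paper.

```python
import math


def _to_vec(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a 3D unit vector."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))


def _to_latlon(v):
    """Convert a 3D vector back to latitude/longitude in degrees."""
    x, y, z = v
    norm = math.sqrt(x * x + y * y + z * z)
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
    return lat, math.degrees(math.atan2(y, x))


def _slerp(a, b, t):
    """Point at fraction t along the great circle from unit vector a to b."""
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)
    if omega < 1e-12:
        return a  # the two points coincide
    s = math.sin(omega)
    if s < 1e-12:
        raise ValueError("antipodal points have no unique interpolation")
    ka, kb = math.sin((1 - t) * omega) / s, math.sin(t * omega) / s
    return tuple(ka * x + kb * y for x, y in zip(a, b))


def weighted_spherical_mean(points):
    """points: list of (lat, lon, weight) with positive weights."""
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    lat, lon, total = points[0]
    mean = _to_vec(lat, lon)
    for lat, lon, weight in points[1:]:
        total += weight
        # Move the running mean toward the new point in proportion to its weight.
        mean = _slerp(mean, _to_vec(lat, lon), weight / total)
    return _to_latlon(mean)
```

For example, a client at longitude 0 with weight 1 and a client at longitude 90 with weight 3, both on the equator, yield a placement three quarters of the way toward the heavier client.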
Data Placement
Phase 2: Iteratively Improve Placement
Move data items closer to users and other data items that frequently interact
∀ data items: determine movement to another node
Current latency and amount of communication increase the contracting force
Updates to placement pull nodes together
Data items movable
Client locations fixed
Replicas treated as separate data items that interact frequently
→ Reduce latency
→ Reduce inter-DC traffic (if data items colocated)
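The contracting-force idea can be sketched as an iterative update in which each movable data item is pulled toward the nodes it communicates with. This is a simplified illustration: planar (x, y) coordinates, distance as a stand-in for latency, the pull formula, and the `kappa` parameter are assumptions made for this example; the slides only state that latency and amount of communication increase the force.

```python
import math


def improve_placement(positions, movable, edges, kappa=0.1, iterations=5):
    """Pull movable data items toward the nodes they communicate with.

    positions: dict node -> (x, y) in a planar space (an assumption)
    movable:   set of data-item nodes; all other nodes (clients) stay fixed
    edges:     dict (a, b) -> amount of communication between a and b
    kappa:     hypothetical constant controlling how strongly nodes contract
    """
    pos = dict(positions)
    for _ in range(iterations):
        for (a, b), weight in edges.items():
            for node, other in ((a, b), (b, a)):
                if node not in movable:
                    continue  # client locations are fixed
                (x, y), (ox, oy) = pos[node], pos[other]
                distance = math.hypot(ox - x, oy - y)
                # The pull grows with both distance and amount of communication.
                pull = 1.0 - 1.0 / (1.0 + kappa * distance * weight)
                pos[node] = (x + pull * (ox - x), y + pull * (oy - y))
    return pos
```

Replicas would appear in `edges` as separate data items connected by a heavy edge, which matches how the slide describes them.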
Data Placement
Phase 3: Collapse Data to Datacenters
Move data to nearest datacenter
If DC over specified capacity
Identify data objects with fewest accesses
Move them to the next closest DC
Iterations ≤ #DCs
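The collapse step can be sketched as follows. This is a minimal illustration under stated assumptions: positions are planar (x, y) tuples, capacity is counted as a number of items per datacenter, and the function name `collapse_to_datacenters` is hypothetical.

```python
import math


def collapse_to_datacenters(items, datacenters, capacity):
    """Assign each item to a datacenter while respecting a per-DC capacity.

    items:       dict item -> ((x, y), access_count)
    datacenters: dict name -> (x, y)
    capacity:    maximum number of items per datacenter (assumed unit)
    """
    # For every item, rank all datacenters from closest to farthest.
    ranked = {
        item: sorted(datacenters, key=lambda dc: math.dist(pos, datacenters[dc]))
        for item, (pos, _) in items.items()
    }
    choice = {item: 0 for item in items}  # index into each item's ranking
    for _ in range(len(datacenters)):     # at most #DCs iterations
        load = {}
        for item, k in choice.items():
            load.setdefault(ranked[item][k], []).append(item)
        moved = False
        for members in load.values():
            overflow = len(members) - capacity
            if overflow <= 0:
                continue
            members.sort(key=lambda item: items[item][1])  # fewest accesses first
            for item in members[:overflow]:
                if choice[item] + 1 < len(datacenters):
                    choice[item] += 1  # move to the next closest DC
                    moved = True
        if not moved:
            break
    return {item: ranked[item][k] for item, k in choice.items()}
```

Evicting the least accessed items first keeps the frequently used data at its closest datacenter, so the latency penalty of enforcing capacity stays small.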
Data Placement
Output: Migration Proposals
Application-specific migration
Supports diverse datacenter applications
Proposal record:
Entity: GUID
New datacenter
Average latency change per request
Ongoing bandwidth change per day
One-time migration bandwidth
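A proposal record and one possible application-side policy for acting on it could look like the sketch below. The units (milliseconds, bytes), the sign convention, and the `accept_proposals` policy are hypothetical; since migration is application-specific, each application would apply its own rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationProposal:
    """One migration proposal with the fields listed on the slide."""
    entity: str                      # GUID of the data item
    new_datacenter: str
    latency_change_ms: float         # average per request; negative means faster (assumed)
    bandwidth_change_per_day: float  # ongoing change, assumed bytes per day
    migration_bandwidth: float       # one-time cost, assumed bytes


def accept_proposals(proposals, max_migration_bytes):
    """Hypothetical policy: keep proposals that reduce latency and whose
    one-time migration cost stays within the given budget."""
    return [
        p for p in proposals
        if p.latency_change_ms < 0 and p.migration_bandwidth <= max_migration_bytes
    ]
```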
Evaluation
Test Environment
Month-long Live Mesh trace
Compute placement on week 1
Evaluate placement on weeks 2-4
12 datacenters as potential locations
Capacity limit: ≤ 10% of all data at each DC
Analytic evaluation using network coordinate system
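The train/evaluate split described above can be sketched as a simple partition of the trace by timestamp. The record shape `(timestamp_seconds, payload)` and the function name `split_trace` are assumptions for this example.

```python
WEEK_SECONDS = 7 * 24 * 3600


def split_trace(records, trace_start):
    """Split a trace into week 1 (for computing the placement) and the
    remaining weeks (for evaluating it).

    records: iterable of (timestamp_seconds, payload) tuples
    """
    placement_input, evaluation_input = [], []
    for timestamp, payload in records:
        if timestamp < trace_start + WEEK_SECONDS:
            placement_input.append((timestamp, payload))
        else:
            evaluation_input.append((timestamp, payload))
    return placement_input, evaluation_input
```

Evaluating on later weeks checks whether a placement computed from past accesses still serves future accesses well.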
Evaluation
Heuristics
commonIP: place data near the IP with the most frequent access; optimizes for latency
oneDC: place all data in one datacenter; optimizes for zero inter-DC traffic
hash: place data according to a hash function; optimizes for zero capacity skew
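The three baseline heuristics are simple enough to sketch directly. The function names, the use of SHA-256 for hashing, the choice of the first datacenter for oneDC, and the `nearest_dc` callable (mapping an IP address to its closest datacenter) are assumptions for illustration.

```python
import hashlib
from collections import Counter


def common_ip(access_ips, nearest_dc):
    """commonIP: place the item at the DC closest to its most frequent accessor.

    access_ips: list of client IPs, one entry per access
    nearest_dc: hypothetical callable mapping an IP to the closest datacenter
    """
    most_frequent_ip, _ = Counter(access_ips).most_common(1)[0]
    return nearest_dc(most_frequent_ip)


def one_dc(item_id, datacenters):
    """oneDC: every item goes to the same datacenter (here, the first one)."""
    return datacenters[0]


def hash_placement(item_id, datacenters):
    """hash: spread items evenly across datacenters by hashing their ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    return datacenters[int.from_bytes(digest[:8], "big") % len(datacenters)]
```

Each heuristic optimizes one metric and ignores the other two, which is what makes them useful points of comparison for Volley.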
Evaluation
Capacity Skew & Inter-DC Traffic
Evaluation
Latency
Volley performs better than commonIP and provides lower capacity skew and fewer inter-DC messages
Evaluation
Live System Latency
Live Mesh prototype
Frontend: allows clients to connect to any DC
Document Service: stores IP addresses of the clients
Publish-Subscribe Service: notifies about changes in the document service
Message Queue Service: buffers messages from the Publish-Subscribe Service
Live Mesh trace replayed from 109 nodes scattered around the world
[Figure: 20 VMs, 12 DCs, 109 client nodes]
Evaluation
Live System Latency
External sources of noise
Fewer client locations than in a real-world scenario
Connectivity of the simulated clients does not conform to Volley’s latency model
Evaluation
Impact of Iteration Count on Capacity Skew
Most objects do not move after phase 1
Capacity skew smoothed in phase 3
Evaluation
Impact of Iteration Count on Client Latency
Latency remains stable after a few iterations of phase 2
Almost no penalty from phase 3
Evaluation
Re-Computation
Volley should be re-run frequently
Stale placements increase request latency due to client mobility
Inter-datacenter traffic increases due to new objects that cannot be placed intelligently
Changing access patterns may require data movement
New clients need to be served
Evaluation
Migrated Objects
Percentage of objects moved in placement computed after week X compared to first week
Most old objects do not move
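The migrated-objects metric can be sketched as a comparison of two placements. Representing a placement as a dict from object ID to datacenter name, and restricting the comparison to objects present in both placements, are assumptions made for this example.

```python
def percent_moved(old_placement, new_placement):
    """Percentage of objects present in both placements whose datacenter changed.

    Both arguments map object ID -> datacenter name. Objects that exist only
    in the new placement are ignored, since they are new rather than migrated.
    """
    common = old_placement.keys() & new_placement.keys()
    if not common:
        return 0.0
    moved = sum(1 for obj in common if old_placement[obj] != new_placement[obj])
    return 100.0 * moved / len(common)
```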
Conclusion
Summary
Automatic recommendations for data placement under constraints
Placement can be controlled to take resource usage into account (Cost & Capacity Models) and to ensure replication (Constraints on Placement)
Application independence allowing for specialized migration mechanisms
Analysis of cloud services highlighted the trends that motivated Volley: shared data, data interdependencies and user mobility
Evaluation shows that Volley simultaneously reduces latency and operational costs
Improvement over state-of-the-art heuristic
Conclusion
Open Questions
Volley handles placement decisions within a cloud service
Extension to output recommendations to DC operators to upgrade their DCs or build new ones
Can we allow new objects to be registered such that they get a good initial placement?
Volley handles replicas as separate data items
Better alternative for modeling replicas?
Thanks for your attention