![Page 1: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/1.jpg)
Monitoring Latency Sensitive Enterprise Applications on the Cloud
Shankar Narayanan, Ashiwan Sivakumar
![Page 2: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/2.jpg)
2
Enterprise Applications (EA)
Stock Trader Benchmark Application
• Database (DB)
• Business Service (BS)
• Front End (FE)
• Configuration Service (CS)
• Order Processing Service (OS)
![Page 3: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/3.jpg)
3
EA as Services
[Figure: deployment diagram showing Users and replicated service instances (FE ×2, BS ×5, OS ×3, DB ×2), with Load Balancers and Service Endpoints labeled.]
![Page 4: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/4.jpg)
4
EA Characteristics
Notice: the dynamic and distributed nature of cloud deployments.
Reducing user-observed latency is the goal. Monitor this!
| EA property | Relevant cloud characteristic |
| --- | --- |
| Scalability | Dynamic deployment sizes |
| Availability | Geo-redundancy |
| Economics | Pay-as-you-use |
| Elasticity | Decoupled services |
| Low latency | Deploy closer to user groups |
| Utilization | Load balancing |
![Page 5: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/5.jpg)
5
Performance Variation: Time Series and CDF of DB Latency
- Data snapshot covers 4 hours across both days
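For readers who want to reproduce a CDF like the one on this slide from their own latency samples, here is a minimal sketch. The function name `empirical_cdf` and the sample values are made up for illustration and are not taken from the slides.

```python
def empirical_cdf(samples):
    """Pair the i-th smallest sample with the cumulative fraction i/n."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (i + 1) / n) for i, value in enumerate(ordered)]


# Made-up DB latencies in milliseconds; one outlier mimics a slow period.
latencies_ms = [12.0, 15.0, 11.0, 90.0, 14.0]
cdf = empirical_cdf(latencies_ms)

for value, fraction in cdf:
    print(f"{value:6.1f} ms  ->  {fraction:.2f}")
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the CDF curve; a long tail to the right indicates occasional slow requests.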
![Page 6: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/6.jpg)
6
Monitoring Framework – Design Goals
• Resilience: Less sensitive to cloud variability
• Scalability: Capable of scaling with component instances
• Portability: Easy to integrate with applications
• Flexibility: Multiple levels of measurement
  – User level latency
  – Component level isolation
• Efficiency: Fast and accurate measurements
![Page 7: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/7.jpg)
7
Why is Monitoring Hard?
• Dynamic environment – number of components changes
• Distributed deployment – needs a collection framework
• Variable request path – different choice of components
• Existing monitoring tools
  – Do not support service-oriented architectures
  – Are too detailed
  – Are not scalable
Remember: user-observed latency is our goal. Abstract away unnecessary details!
![Page 8: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/8.jpg)
8
Measuring End-points – Existing Tools
[Figure: request flow across Users, FE, BS, and DB with 13 numbered measurement points; message labels include HTTP Request, SOAP Response, HTTP Response, and MySQL Replies.]
Aggregate!
![Page 9: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/9.jpg)
9
Measurement Model
[Figure: timing diagram across components $C_i$, $C_{i+1}$, $C_{i+2}$, marking timestamps $T_{i-1,i}$, $T_{i,i+1}$, $T_{i+1,i+2}$, and $T_{i,i+2}$ together with their primed variants ($T'$, $T''$, ...).]
• $CL_i$ = component latency of the $i$-th component
• $LL_{i,i+1}$ = link latency across components $i$ and $i+1$
• $N$ = number of components that $C_i$ communicates with
• $n_j$ = number of calls made by $C_i$ to each of the $j$ components
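The formula that ties these quantities together did not survive in the transcript. As a rough illustration only, the sketch below assumes one plausible reading: a component's own latency is the time a request spends at that component minus the time it spends waiting on downstream calls. The `Call` class, the function name, the timestamps, and the assumption that calls do not overlap are all hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One outgoing call from a component: send and reply timestamps in ms."""
    sent_at: float
    replied_at: float


def component_latency(arrived_at, departed_at, calls):
    """Time spent inside the component, excluding waits on downstream calls.

    Assumes the downstream calls are sequential (they do not overlap).
    """
    waiting = sum(c.replied_at - c.sent_at for c in calls)
    return (departed_at - arrived_at) - waiting


# Hypothetical BS request: arrives at t=0 ms, replies at t=50 ms,
# and makes two DB calls that take 10 ms and 22 ms round trip.
db_calls = [Call(5.0, 15.0), Call(20.0, 42.0)]
cl_bs = component_latency(0.0, 50.0, db_calls)
print(cl_bs)  # 18.0
```

Under this reading, each round-trip wait would itself break down into link latency plus the downstream component's latency, which is where the per-link and per-call terms defined above would enter.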
![Page 10: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/10.jpg)
10
Monitoring Framework Architecture
[Figure: architecture diagram. Labels: Instrumented application component, Log server (local), Raw log storage (local), Aggregated log, Notification Q, Global collector. The first four labels appear twice, once per application component shown.]
![Page 11: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/11.jpg)
11
Outline
• Monitoring tool
  – Collection framework
  – Instrumentation framework
![Page 12: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/12.jpg)
12
The Collection Framework
• Each component writes to local storage
• Front-end sends “done” message to local queue
• Queues: decouple producer and consumer entities
• Storage: persistence, no limit on size
• Both: scalable, robust
Question: Why is this the right model? When in doubt, measure!
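The write-locally-then-notify pattern described on this slide can be sketched in a few lines. This is a toy, in-process illustration: plain dictionaries stand in for cloud storage, `queue.Queue` stands in for the cloud queue, and every name (`log_record`, `request_done`, `collect_once`, the request id) is hypothetical rather than part of the authors' framework.

```python
import queue
from collections import defaultdict

# Stand-ins (assumptions): one dict per component plays the role of local
# storage, and an in-process queue plays the role of the notification queue.
local_storage = {"FE": defaultdict(list), "BS": defaultdict(list)}
notification_q = queue.Queue()


def log_record(component, request_id, latency_ms):
    """A component writes its measurement to its own local storage."""
    local_storage[component][request_id].append(latency_ms)


def request_done(request_id):
    """The front end signals that the user request has completed."""
    notification_q.put(request_id)


def collect_once():
    """Collector: take one 'done' message and gather that request's records."""
    request_id = notification_q.get_nowait()
    records = {name: list(store[request_id]) for name, store in local_storage.items()}
    return request_id, records


log_record("BS", "req-1", 12.5)
log_record("FE", "req-1", 30.0)
request_done("req-1")
collected = collect_once()
print(collected)
```

Only the small "done" message travels through the queue; the bulk of the data stays in storage until the collector asks for it, which is the decoupling the slide argues for.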
![Page 13: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/13.jpg)
13
Alternative Model
• All components write to queue
• Collection framework de-queues
• Forms a P2P network to collate the data
![Page 14: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/14.jpg)
14
Experiments on Azure and EC2
• Experiments evaluating performance of storage and queues.
• Real cloud deployments (Microsoft Azure, Amazon AWS)
• Extensive measurements from all data centers
  – US (East/West/North/South)
  – Europe (West/Central)
  – Asia (East/South East)
![Page 15: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/15.jpg)
15
Performance of Storage and Queues
• Measurements made in all 12 datacenter regions (Azure and AWS)
• Experiment length: 24–26 hours
• Approx. 100,000 requests to storage and 16,000 requests to the queues
[Figure: two panels, Microsoft Azure and Amazon AWS, each with Write Q, Read Q, and Write Store series.]
![Page 16: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/16.jpg)
16
Outline
• Monitoring tool
  – Collection framework
  – Instrumentation framework
![Page 17: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/17.jpg)
17
Instrumentation Framework - Goals
• Minimize coding effort and intervention
• Measure latency at the granularity of user request
• Automate instrumentation as much as possible
• Generate minimal measurement parameters
![Page 18: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/18.jpg)
18
Comparison of Existing Tools
![Page 19: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/19.jpg)
19
Instrumentation Framework
[Figure: diagram relating the Original Application Component, Aspects, and the Instrumented Application Component, with these specification labels:]
• Specification for the application end-points (X-Trace: log events)
• Measurement metric specification (X-Trace: meta-data)
• Log format specifications
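To give a feel for what aspect-based instrumentation buys, here is a loose Python analogy: a decorator wraps an application end-point and records its latency without touching the function body. This is not the authors' tooling and does not use X-Trace; the decorator `measured`, the list `event_log`, and the end-point name are hypothetical stand-ins.

```python
import functools
import time

event_log = []  # stand-in for the local log server (assumption)


def measured(endpoint_name):
    """Wrap a function so each call appends a latency record to event_log."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the latency even if the wrapped call raises.
                elapsed_ms = (time.perf_counter() - start) * 1000.0
                event_log.append({"endpoint": endpoint_name, "latency_ms": elapsed_ms})
        return wrapper
    return decorator


@measured("BS.getQuote")  # hypothetical end-point name
def get_quote(symbol):
    return {"symbol": symbol, "price": 100.0}


result = get_quote("IBM")
print(result, event_log)
```

The application code stays unchanged apart from the one-line annotation, which mirrors the goal of keeping coding effort and intervention low.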
![Page 20: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/20.jpg)
20
Experiment Set-up
• Deployed two similar benchmark applications
  – DayTrader: Amazon AWS
  – StockTrader: Windows Azure (prior work)
• Deployed the collection framework on AWS and Azure.
• User sessions and request patterns from DaCapo benchmark suite.
• Instrumentation:
  – Automated using aspects: DayTrader (AWS)
  – Custom coded: DayTrader and StockTrader
![Page 21: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/21.jpg)
21
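Aggregation Benefit: DayTrader

| User request type | Storage writes without aggregation (FE) | Storage writes without aggregation (BS) | Storage writes with aggregation (FE) | Storage writes with aggregation (BS) |
| --- | --- | --- | --- | --- |
| Login | 3 | 5 | 1 | 1 |
| Portfolio | 10 | 10 | 1 | 1 |
| Update profile | 4 | 5 | 1 | 1 |
| Home | 2 | 2 | 1 | 1 |
| Buy | 1 | 7 | 1 | 1 |
| Sell | 1 | 8 | 1 | 1 |
| Account | 3 | 3 | 1 | 1 |
| Total | 24 | 40 | 7 | 7 |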
• User sessions: 20, 1 every 10 seconds
• Results shown for a random user from DaCapo
• Storage writes reduced by 78% in the above case
• Transactions benefit
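The 78% figure can be checked directly from the table. The short script below recomputes it from the per-request write counts; the variable names are illustrative only.

```python
# (FE, BS) storage writes per request type without aggregation, from the table.
writes_without = {
    "Login": (3, 5),
    "Portfolio": (10, 10),
    "Update profile": (4, 5),
    "Home": (2, 2),
    "Buy": (1, 7),
    "Sell": (1, 8),
    "Account": (3, 3),
}

total_without = sum(fe + bs for fe, bs in writes_without.values())

# With aggregation, FE and BS each issue one write per request type.
total_with = 2 * len(writes_without)

reduction = 1 - total_with / total_without
print(total_without, total_with, f"{reduction:.1%}")
```

The totals come out to 64 writes without aggregation and 14 with it, a reduction of about 78%.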
![Page 22: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/22.jpg)
22
Aggregation Benefit: MedRec Application Suite
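| Application | Storage writes without aggregation (FE) | Storage writes without aggregation (BS) | Storage writes with aggregation (FE) | Storage writes with aggregation (BS) |
| --- | --- | --- | --- | --- |
| MedRec App | 4 | 8 | 1 | 1 |
| Physician App | 8 | 15 | 1 | 1 |
| Admin App | 2 | 5 | 1 | 1 |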
• Storage writes reduced by at least 50% from FE, 80% from BS
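The "at least 50%" and "at least 80%" bounds follow from the smallest per-application reduction in each column. A quick check, using the table values and illustrative names:

```python
# (FE, BS) storage writes without aggregation, from the table.
medrec_suite = {
    "MedRec App": (4, 8),
    "Physician App": (8, 15),
    "Admin App": (2, 5),
}


def reduction(before, after=1):
    """Fraction of writes eliminated when `before` writes become `after`."""
    return 1 - after / before


fe_min = min(reduction(fe) for fe, _ in medrec_suite.values())
bs_min = min(reduction(bs) for _, bs in medrec_suite.values())
print(f"FE: at least {fe_min:.0%}, BS: at least {bs_min:.0%}")
```

In both columns the Admin App sets the lower bound (2 to 1 for FE, 5 to 1 for BS); the other applications see larger reductions.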
![Page 23: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/23.jpg)
23
Instrumentation Benefit
| Category | Handcrafted: code (# of files) | X-Trace with Aspect: code (# of files) |
| --- | --- | --- |
| same | 15250 (88) | 15250 (92) |
| modified | 593 (74) | 465 (70) |
| added | 878 (0) | 166 (2) |
| automatable | 0 (0) | 166 (2) |
• FE component code: automatable using aspects with X-Trace
• Cross component calls: X-Trace object passed as parameter
• New lines of code reduced by ~80%
• SLOC reduced by ~20%
• Aspects can be automated
![Page 24: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/24.jpg)
24
Future Work
• Scaling the framework
• Application scale to framework scale ratio
• Per datacenter? Per VM? Varies per cloud provider?
• Impact of these design decisions on the sensitivity of the framework
![Page 25: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/25.jpg)
25
Conclusions
• Architectural benefits:
  – Generic across application, # of components, access patterns
  – Scalable: decoupled entities
• Aggregation benefits:
  – N writes to storage become one write
  – Log server offloads work from application
• Instrumentation benefits:
  – Easy to integrate with application
  – New lines of code reduced by ~80%
  – SLOC reduced by ~20%
![Page 26: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/26.jpg)
26
Q & A
![Page 27: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/27.jpg)
27
Backup slides
![Page 28: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/28.jpg)
28
Azure Blob Read and Write Latency
Blob read/write latency is at least 30–40 msec
![Page 29: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/29.jpg)
29
Azure Queue Read and Write Latency
Queue read is costly; write is comparable to blob
![Page 30: Monitoring Latency Sensitive Enterprise Applications on the Cloud Shankar Narayanan Ashiwan Sivakumar](https://reader036.vdocument.in/reader036/viewer/2022062801/56649e5c5503460f94b54a95/html5/thumbnails/30.jpg)
30
SQL Azure Performance Issue Snapshot (6 Days)