Predicting performance of applications and infrastructures, Tania Lorido, 27th May 2011
![Page 1: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/1.jpg)
Predicting performance of applications and infrastructures
Tania Lorido
27th May 2011
![Page 2: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/2.jpg)
Problem definition
Objective: predicting utilization of resources (memory, CPU, ...) on different computing systems in order to determine application behavior.
To predict performance if the available resources change
To change available resources in elastic infrastructures
Three scenarios:
Benchmark traces on a simulator (INSEE): NAS Parallel Benchmarks
Real applications on real systems (data from U. of Florida)
Applications running in the cloud (Arsys)
![Page 3: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/3.jpg)
First scenario: INSEE
![Page 4: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/4.jpg)
What is INSEE?
Interconnection Network Simulation and Evaluation Environment
Input: Traces containing messages sent among nodes.
Output: Execution time, and many other network-related figures
![Page 5: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/5.jpg)
Objectives
Get a dataset by running several traces on the simulator
Create different models -> execution time prediction
Learn about ML techniques
![Page 6: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/6.jpg)
Input traces
NAS Parallel Benchmark suite
Scientific codes implemented in Fortran + MPI
Can run in systems of different sizes: tested with 16 or 64 tasks
Run on a real system (Kalimero-like cluster)
Captured the whole list of point-to-point messages sent between every pair of tasks.
![Page 7: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/7.jpg)
Topologies
2D mesh and 2D torus
![Page 8: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/8.jpg)
We have…
… a set of tasks: 16 or 64
… a set of nodes: 256 (16x16 torus)
How to assign tasks to nodes?
![Page 9: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/9.jpg)
Partitioning
Selecting a set of nodes
Three options: random, band & quadrant
An example: we need 4 nodes; topology: mesh
![Page 10: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/10.jpg)
Random
Band
Quadrant
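The figures that illustrated the three partitioning options did not survive in this transcript, so the following Python sketch rests on assumptions: "band" is taken to mean consecutive nodes in row-major order, and "quadrant" a compact square block in one corner of the network. All function names are hypothetical and are not taken from INSEE.

```python
import math
import random

def random_partition(width, height, n, rng):
    """Pick n distinct nodes anywhere in the width x height network."""
    all_nodes = [(x, y) for y in range(height) for x in range(width)]
    return rng.sample(all_nodes, n)

def band_partition(width, height, n):
    """Assumed meaning of 'band': the first n nodes in row-major order."""
    if n > width * height:
        raise ValueError("not enough nodes in the network")
    return [(i % width, i // width) for i in range(n)]

def quadrant_partition(width, height, n):
    """Assumed meaning of 'quadrant': a square block in one corner."""
    side = math.isqrt(n)
    if side * side != n or side > min(width, height):
        raise ValueError("n must be a perfect square that fits in the network")
    return [(x, y) for y in range(side) for x in range(side)]

# The slide's example: 4 nodes are needed on a 16x16 network.
print(random_partition(16, 16, 4, random.Random(0)))
print(band_partition(16, 16, 4))      # [(0, 0), (1, 0), (2, 0), (3, 0)]
print(quadrant_partition(16, 16, 4))  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

Under these assumptions a band keeps the selected nodes on one row, while a quadrant keeps them within a small square, which changes the distances that messages travel.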
![Page 11: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/11.jpg)
Mapping
Assigning each task to one of the nodes in the set
Two options: random & consecutive
Example… with band partitioning
![Page 12: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/12.jpg)
Random
Consecutive
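The two mapping options can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, and "consecutive" is assumed to mean that task i goes to the i-th node of the partition in the order the partition lists them.

```python
import random

def consecutive_mapping(nodes):
    """Task i goes to the i-th node of the partition, in the given order."""
    return {task: node for task, node in enumerate(nodes)}

def random_mapping(nodes, rng):
    """Tasks are assigned to the partition's nodes in a shuffled order."""
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    return {task: node for task, node in enumerate(shuffled)}

# Hypothetical 4-node band partition, given as (x, y) coordinates.
band = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(consecutive_mapping(band))
print(random_mapping(band, random.Random(1)))
```

Both mappings use exactly the same nodes; only the task-to-node assignment differs, so tasks that exchange many messages may end up close together or far apart.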
![Page 13: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/13.jpg)
Background noise
In a real environment, several applications compete for the network.
We emulate that with random messages sent among nodes: background noise
Different levels
![Page 14: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/14.jpg)
Predictive Variables
![Page 15: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/15.jpg)
Predictive Variables
![Page 16: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/16.jpg)
Experiment
A model for each trace type (7 types)
Class variable: execution time discretized in 3 bins, by width or by height (equal frequency)
Classifiers: KNN, Naive Bayes, J48 tree
10 repetitions of 5-fold cross-validation
Evaluation criterion: accuracy
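The two ways of discretizing the class variable can be illustrated with a short Python sketch. It assumes that "width" means equal-width bins and "height" means equal-frequency bins; the function names and the sample execution times are made up for illustration.

```python
def equal_width_bins(values, k):
    """Split the range [min, max] into k intervals of the same width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0] * len(values)
    # The maximum value falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

def equal_frequency_bins(values, k):
    """Give each bin (roughly) the same number of values, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * k // len(values)
    return labels

# Hypothetical execution times with one very slow run.
times = [1.0, 1.2, 1.5, 2.0, 2.2, 30.0]
print(equal_width_bins(times, 3))      # [0, 0, 0, 0, 0, 2]
print(equal_frequency_bins(times, 3))  # [0, 0, 1, 1, 2, 2]
```

With skewed values, equal-width binning can place almost every run in the same bin, so a classifier that always predicts that bin already scores a high accuracy. Note that this simplified equal-frequency version may separate tied values into different bins.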
![Page 17: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/17.jpg)
Results (I)
![Page 18: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/18.jpg)
Results (II)
![Page 19: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/19.jpg)
Interpretation of results
Quite good results (80-100% accuracy)
Background noise has almost no effect on the class (information gain = 0.00015)
… learning about ML techniques.
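Information gain measures how much knowing a variable reduces the entropy of the class. The Python sketch below computes it for discrete variables; the toy data and variable names are hypothetical and only show what a gain close to 0 looks like next to a gain of 1 bit.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature, labels):
    """Class entropy minus the weighted entropy after splitting on feature."""
    total = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: the class depends on topology, not on noise.
time_class = ["fast", "fast", "slow", "slow"]
noise = ["low", "high", "low", "high"]
topology = ["torus", "torus", "mesh", "mesh"]
print(information_gain(noise, time_class))     # 0.0
print(information_gain(topology, time_class))  # 1.0
```

A variable with a gain near 0, as reported for background noise, tells the classifier almost nothing about the class.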
![Page 20: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/20.jpg)
Second scenario: parallel application data from the U. of Florida
![Page 21: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/21.jpg)
What have they done?
Run a couple of real applications on real systems to obtain datasets
Apply several regression techniques to predict execution time and other parameters related to resource usage: KNN, LR, DT, SVM, …
Propose a new algorithm and compare it with the “classical” ones
![Page 22: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/22.jpg)
Objectives
Repeat the experiment – same results?
Discretize variables and apply classification techniques.
Multidimensional prediction
![Page 23: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/23.jpg)
Real applications
Bioinformatics applications:
BLAST: Basic Local Alignment Search Tool
RAxML: Randomized Axelerated Maximum Likelihood
![Page 24: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/24.jpg)
… running on real systems
![Page 25: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/25.jpg)
Datasets are available
BLAST: 6592 data points; two class variables: execution time (seconds) and output size (bytes)
RAxML: 487 data points; two class variables: execution time (seconds) and Resident Set Size, RSS (bytes)
![Page 26: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/26.jpg)
Predictive variables - RAxML
![Page 27: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/27.jpg)
Attribute selection
Different sets chosen by the authors
![Page 28: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/28.jpg)
Testing different classifiers…
![Page 29: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/29.jpg)
First experiment - Regression
10 repetitions of 10-fold cross-validation
Classifier evaluation:
Percentage error
where fi = forecast value, ai = actual value
Mean Percentage Error
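The formulas on this slide were images and are not in the transcript. A common definition that matches the symbols above (forecast f_i, actual a_i) is sketched below in Python. Taking the absolute value of the difference is an assumption, as are the function names; actual values are assumed to be positive, which holds for times and sizes.

```python
def percentage_errors(forecasts, actuals):
    """Per-point error as a percentage of the actual value (assumed absolute)."""
    return [100.0 * abs(f - a) / a for f, a in zip(forecasts, actuals)]

def mean_percentage_error(forecasts, actuals):
    """Average of the per-point percentage errors."""
    errors = percentage_errors(forecasts, actuals)
    return sum(errors) / len(errors)

# Hypothetical forecasts and measurements, in seconds.
forecast = [110.0, 90.0, 50.0]
actual = [100.0, 100.0, 40.0]
print(percentage_errors(forecast, actual))      # [10.0, 10.0, 25.0]
print(mean_percentage_error(forecast, actual))  # 15.0
```

Without the absolute value, over- and under-estimates would cancel out in the mean, which is why the absolute form is assumed here.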
![Page 30: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/30.jpg)
Results
![Page 31: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/31.jpg)
Second experiment – Classification
Output variable discretized in 4 bins, by width or by height (equal frequency)
Predictive variables discretized with the Fayyad-Irani method, which forms groups that try to minimize entropy
Same classifiers, except Linear Regression and SVM
Classifier evaluation criterion: Accuracy
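The idea behind entropy-based discretization of a predictive variable can be sketched in Python as a search for the cut point that leaves the lowest weighted class entropy. This shows a single step only: the full Fayyad-Irani method applies such cuts recursively and uses an MDL-based criterion to decide when to stop, which is not shown. Function names and data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def best_cut(values, labels):
    """Return (cut_point, weighted_entropy) of the best binary split, or None."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut is possible between two equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = len(left) / n * entropy(left) + len(right) / n * entropy(right)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[1]:
            best = (cut, score)
    return best

# Hypothetical attribute whose small and large values belong to different classes.
sizes = [1, 2, 3, 10, 11, 12]
classes = ["a", "a", "a", "b", "b", "b"]
cut, score = best_cut(sizes, classes)
print(cut)  # 6.5
```

Here the cut at 6.5 separates the two classes perfectly, so the weighted entropy after the split is zero.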
![Page 32: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/32.jpg)
Results
![Page 33: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/33.jpg)
Interpretation
Height-based discretization: 65 – 75% accuracy
Width-based discretization: 92 – 96% accuracy … BUT…
![Page 34: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/34.jpg)
Attribute selection
Information gain with respect to the class is 0 (or close to 0) for some variables
The previous attribute selection was based on the authors' criteria
So… we apply:
Attribute Evaluator: CfsSubsetEval
Search Method: BestFirst
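Correlation-based feature selection (CFS) scores a subset of attributes highly when they correlate with the class but not with each other. The Python sketch below computes the CFS merit and pairs it with a plain greedy forward search. That search is a simplification and not Weka's BestFirst, which can also backtrack. The correlation values and attribute names are invented, and correlations are assumed to lie in [0, 1].

```python
import math

def cfs_merit(subset, class_corr, feature_corr):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(class_corr[f] for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(feature_corr[frozenset(p)] for p in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_forward_selection(features, class_corr, feature_corr):
    """Add the attribute that most improves the merit until nothing improves."""
    selected, best = [], 0.0
    while True:
        scored = [(cfs_merit(selected + [f], class_corr, feature_corr), f)
                  for f in features if f not in selected]
        if not scored:
            return selected
        merit, feature = max(scored)
        if merit <= best:
            return selected
        selected.append(feature)
        best = merit

# Hypothetical correlations: "b" is informative but redundant with "a".
class_corr = {"a": 0.8, "b": 0.7, "c": 0.05}
feature_corr = {frozenset(("a", "b")): 0.9,
                frozenset(("a", "c")): 0.1,
                frozenset(("b", "c")): 0.1}
print(greedy_forward_selection(["a", "b", "c"], class_corr, feature_corr))  # ['a']
```

In this toy case only "a" is kept: "b" adds little because it is highly correlated with "a", and "c" is barely related to the class.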
And the results….
![Page 35: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/35.jpg)
![Page 36: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/36.jpg)
Conclusions
Regression experiment repeated with the same results
Width-based discretization discarded
“Same results” after attribute selection
And next…
Multidimensional prediction:
BLAST: Execution time & output size
RAxML: Execution time & memory size (RSS)
![Page 37: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/37.jpg)
Third scenario: prediction of resource demands in cloud computing
This is future work
![Page 38: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/38.jpg)
What does Arsys offer? (I)
Traditional application and web hosting
An IaaS cloud computing platform
![Page 39: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/39.jpg)
What does Arsys offer? (II)
A tool for the client to create and manage his own VMs: RAM, number of cores, disk space
Theoretically, no limits in resource usage
Resources can be changed dynamically: elasticity
![Page 40: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/40.jpg)
What do they want?
A tool that:
Monitors resource utilization by a user’s VM…
… and predicts future utilization to…
… proactively modify resource reservations…
… to optimize application performance…
… and cost
Initially we will focus on the prediction part
![Page 41: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/41.jpg)
Variables to predict (an example)
Used amount of RAM. MB.
Used amount of SWAP. MB.
Amount of free disk space. MB.
Disk performance. KB/s
Processor load. MHz
Processor use percentage.
Network bandwidth usage. Kb/s
![Page 42: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/42.jpg)
Approaches
1/0 predictions based on threshold: will a variable reach a certain value?
Interval-based predictions
Regression
Time series: prediction based on trends
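As an illustration of how a trend-based forecast can feed a 1/0 threshold prediction, the Python sketch below fits a straight line to equally spaced samples and extrapolates it. The slides describe this scenario as future work, so this is not the method that was eventually used; the function names, the RAM samples and the threshold are hypothetical.

```python
def linear_trend(samples):
    """Least-squares slope and intercept for equally spaced samples (t = 0, 1, ...)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a trend")
    mean_t = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(samples))
    den = sum((t - mean_t) ** 2 for t in range(n))
    slope = num / den
    return slope, mean_y - slope * mean_t

def will_exceed(samples, threshold, horizon):
    """1/0 prediction: does the extrapolated value reach the threshold?"""
    slope, intercept = linear_trend(samples)
    forecast = intercept + slope * (len(samples) - 1 + horizon)
    return forecast >= threshold, forecast

# Hypothetical used-RAM samples in MB, one per monitoring interval.
ram_mb = [500, 520, 540, 560, 580]
print(will_exceed(ram_mb, threshold=700, horizon=5))   # (False, 680.0)
print(will_exceed(ram_mb, threshold=700, horizon=10))  # (True, 780.0)
```

The same forecast value could also be mapped to an interval or used directly as a regression output, which corresponds to the other approaches listed above.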
![Page 43: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/43.jpg)
Questions?
![Page 44: Predicting performance of applications and infrastructures Tania Lorido 27th May 2011](https://reader036.vdocument.in/reader036/viewer/2022070408/56649e555503460f94b4c76e/html5/thumbnails/44.jpg)
Predicting performance of applications and infrastructures
Tania Lorido
27th May 2011