modeling and optimization in virtualized multi-tier oracle


Page 1: Modeling and Optimization in Virtualized Multi-tier Oracle

Virtualization Optimization

MODELING AND OPTIMIZATION IN VIRTUALIZED MULTI-TIER ORACLE ENVIRONMENTS

Boris Zibitsker, Alex Lupersolsky, BEZ Systems

ABSTRACT

Virtualization adds complexity to the planning and management of a multi-tier distributed environment. To avoid surprises, you need to know how server consolidation and virtualization will affect the performance of your applications. Analytical queueing network models can be used to evaluate capacity planning, performance tuning and workload management options, provide reasonable expectations and justify proactive performance management decisions. We will review how prediction results can be used to set realistic Service Level Objectives (SLOs), find the most effective solutions, set expectations, verify the results and organize proactive Service Level Management. We will discuss how performance prediction can be used to find the right candidates for virtualization, justify hardware and software upgrades for the application tier and the DBMS tier, optimize VM migration, predict the impact of new VMs and new workloads, and set the optimum level of concurrency, priorities and resource allocation for each workload to support critical workloads’ SLOs at minimum cost.

VIRTUALIZATION IN MULTI-TIER ORACLE ENVIRONMENT

Virtualization can reduce cost, but hypervisor overhead can negatively affect performance. Not all applications are good candidates for virtualization. For example, applications with a high I/O rate can suffer significant performance degradation after virtualization. Hypervisor overhead, which depends on the number of VMs and on workload parameters, can affect the performance of all applications. Workload and database size growth, implementation of new applications and the addition of new VMs can increase the hypervisor’s CPU, memory and I/O overhead and negatively affect application performance.

We will review how modeling and optimization technology can be used to evaluate options and justify strategic capacity planning, tactical performance management and operational workload management decisions, verify results and enable continuous proactive performance management.

Response time in the multi-tier environment includes service time and queueing time for CPU, I/O and interprocessor communication in application servers and DBMS servers, plus different types of delays caused by limited concurrency. Workloads have different resource utilization profiles.

Complexity of requests, volume of data and processing speed affect service time. The virtualization overhead of managing VMs increases CPU service time and elongates I/O response time because of hypervisor scheduling overhead.

Increase in number of users, implementation of new applications, and concurrency limitations increase contention for resources and affect queueing time.

1 Paper #415

Workloads are interdependent because they compete for the shared resources, and changing the priority of one workload can improve its response time, but it can increase queueing time and response time for others.

Let’s review the application of modeling and optimization using the simple configuration shown in Figure 1. The application servers have unbalanced usage of resources by Java EE applications. The Java EE applications generate SQL requests that access data from Oracle DBMS servers. We will predict the impact of virtualization, replacing the servers with VMs placed on one physical server. Each JVM running the application server software has a limited number of execution threads and a limited number of connections to the databases. The JVM thread pool size and connection pool size limit the number of requests that can access the DBMS concurrently.

Figure 1. An example of Server Consolidation and Virtualization. Two physical application servers are replaced by two VMs in one physical host.

In order to support workload growth and implementation of new applications, you can add a VM to an existing physical server or create a cluster of application servers and place the new VM in a new physical server. Decisions about migration of VMs between physical servers, change of a VM’s priority, tuning decisions, and change of concurrency should take into consideration that all components of the system are interdependent, and a change in one place can move a bottleneck and affect other workloads.

VIRTUALIZATION IS FOUNDATION OF CLOUD COMPUTING

Virtualization is the first step toward Cloud Computing. An individual VM can run on a desktop or can be moved to a private Cloud inside the Computer Center, where the VM will run on a so-called “software mainframe”: a shared, high availability, high performance computing platform based on distributed physical machines. For example, VMware VMotion can move a hosted operating environment from one physical machine to another. Workloads in this environment can then be moved from an in-house private Cloud to an Internet accessible Cloud provider. New capacity can be added to the VM runtime environment within the physical server when needed during peak processing, VMs can be moved to a bigger physical machine using VMotion, and finally, the total physical capability of the Cloud hosting environment can be increased as well.

The customer’s challenge is to decide what type of workloads should be placed on an in-house private cloud and which applications should be moved to the Internet accessible cloud provider, and which provider should be selected to support SLO with minimum cost.

The provider’s challenge is to manage concurrently different workloads, dynamically move VMs and reallocate resources depending on SLO and actual activity within VMs. New capacity can be added to the VM run time environment during high peak processing within physical server. VMs can be moved to the bigger physical machine. The total physical capability of the Cloud hosting environment can be increased as well.

Figure 2. Virtualization is a first step toward Cloud Computing. Within the data center there are traditional computing resources and private clouds. Applications can be moved back and forth to an external Cloud Host Environment. One of the challenges is to find candidates for virtualization. Another challenge is workload management and migration of VMs between physical hosts within the private cloud of the data center. And finally, a decision has to be made about which applications should be moved to an external cloud host environment, and when.

CHALLENGES OF PLANNING AND MANAGING VIRTUALIZED MULTI-TIER ORACLE ENVIRONMENTS

In a multi-tier distributed environment, change of the hardware, software parameters, tuning one of the applications or databases, dynamic reallocation of resources for VMs and movement of the VMs between physical servers can affect performance of all other workloads.

The goal of strategic capacity planning, tactical performance management and operational workload management decisions is to ensure satisfaction of SLO and SLA of major workloads with minimum cost. Multiple factors should be taken into consideration, including performance requirements for each workload, growing and changing business demand, total cost of ownership, consistency of service, flexibility, scalability, etc. There is a high risk that wrong decisions will affect performance of the business applications [1].

The role of modeling is to evaluate how specific decisions will affect performance, scalability and ability to support SLA of the individual workloads. Major performance requirements include response time and throughput. In a virtual environment we should take into consideration:

• How CPU service time for each workload will be affected by the CPU overhead of the hypervisor

• How I/O response time for each workload will be elongated by the additional time the hypervisor needs to schedule I/O

• How response time will be elongated by the increase of demand paging and swapping within each VM and physical server

• How resource reallocation between VMs can affect performance

Predicting the impact of the expected changes on the performance of different workloads includes the standard steps: data collection, workload characterization, workload forecasting and performance prediction [3]. Implementation of virtualization requires collection of additional measurement data characterizing the hypervisor overhead. Workload characterization should take into consideration how hypervisor overhead affects each workload. A performance prediction should take into consideration how planned workload growth, implementation of new VMs and other planned changes will affect hypervisor overhead in the future and how that overhead will affect the performance of individual workloads.

CHALLENGES OF WORKLOAD CHARACTERIZATION IN VIRTUALIZED MULTI-TIER ORACLE ENVIRONMENTS

Any virtualization solution incurs additional overhead, and in order to model virtualization, besides the standard OS, Application server and DBMS server performance measurement data, we need measurement data characterizing the overhead [6].

VMware measurement data, for example, include Total CPU usage of the physical server, CPU System, CPU Wait, CPU Ready, CPU Used, and CPU Idle. Disk utilization statistics include Disk Usage (kb/sec), # Disk Reads and # Writes per period and Disk Read and Write Rate in #Requests/sec, Physical Device Read Latency (ms), Kernel Disk Read Latency (ms), Disk Read, and Read Queue latency (the same for Write operations). Memory usage statistics include memory usage, heap size, and swapping rate.

These data provide detailed information about the usage of resources, including CPU utilization by each VM and by the hypervisor, and about I/O, storage and memory utilization. However, very important information about the number of application requests and their response times is not available. We cannot find information about the users generating requests at the VM level. This information can be obtained by extracting measurement data at the application level (JVM, DBMS, etc.) inside the VMs. JVM data obtained from JMX, Wily or other mid-tier performance monitors include the arrival rate of requests, response time and information about URLs, User Names and Program Names, which can be used to segregate activity within a multi-tier environment into different business workloads.

Hypervisor overhead depends on the number of VMs and workload profile. Each VM adds significant overhead to the machine depending on the Guest OS.

Every I/O operation generated by VM waits to be dispatched by hypervisor. It can significantly elongate I/O response time for I/O intensive workloads.

The workload characterization process should take into consideration how hypervisor overhead affects the profiles of the individual workloads:

• Additional CPU and I/O resource consumption by the host hypervisor increases contention for resources. We measure this overhead as the difference between host CPU utilization and CPU utilization by all VMs. We also calculate the difference between the total host I/O rate and the I/O rate from all VMs.

• Swapping creates additional I/O rate.

• I/O response time elongation due to hypervisor scheduling affects the performance of all workloads, especially workloads with high I/O rates. We measure it as the sum of the average time spent in the ESX Server VMKernel per I/O command and the average time spent in the ESX Server VMKernel queue per I/O command.
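The overhead measurements described above reduce to simple differences and sums. The following is a minimal sketch; the function names and the sample values are hypothetical and stand in for one measurement interval on a host with two VMs.

```python
def hypervisor_overhead(host_cpu_pct, vm_cpu_pcts, host_io_rate, vm_io_rates):
    """Overhead = host total minus the sum over all VMs (CPU in %, I/O in requests/sec)."""
    cpu_overhead = host_cpu_pct - sum(vm_cpu_pcts)
    io_overhead = host_io_rate - sum(vm_io_rates)
    return cpu_overhead, io_overhead


def io_elongation_ms(kernel_latency_ms, queue_latency_ms):
    """I/O response time elongation = kernel time + kernel queue time per I/O command."""
    return kernel_latency_ms + queue_latency_ms


# Hypothetical sample interval: host totals and per-VM values are made up.
cpu_ovhd, io_ovhd = hypervisor_overhead(62.0, [25.0, 30.0], 1500.0, [600.0, 820.0])
elongation = io_elongation_ms(kernel_latency_ms=0.5, queue_latency_ms=1.5)
print(cpu_ovhd, io_ovhd, elongation)
```

Collecting these three values per interval gives the time series that the regression analysis needs as input.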

In order to predict future hypervisor overhead, we can apply multivariate regression analysis to estimate how workload growth, increase in volume of data, implementation of new applications, server consolidation, hardware upgrades and tuning efforts will affect it. We can then modify the workload profiles for each modeling period in the analytical closed queueing network models to predict how the expected hypervisor overhead will affect the response time, throughput and resource consumption of each workload.

Hypervisor CPU overhead, I/O response time elongation due to hypervisor scheduling, and the increase in paging and swapping within VMs are modeled as linear functions of workload growth (number of parallel sessions, #ParSes), growth in the volume of data (I/O rate) and the number of VMs (#VMs):

CPU (Hypervisor Ovhd) = α*#ParSes + β*I/O + δ*#VMs + C

I/O RT Δ = αio*#ParSes + βio*I/O + δio*#VMs + Cio

MEM (I/O) = αM*#ParSes + βM*I/O + δM*#VMs + CM

Example of linear regression analysis applied to VMware measurement data showing the impact of workload growth on CPU Hypervisor Overhead

Figure 3. Linear Regression Analysis Showing Increase in Hypervisor CPU overhead as a result of the Workload Growth

The predicted overhead should be estimated in each prediction period using corresponding regression functions built over some time window (last month, for example).
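As an illustration of how such a regression function could be built over a time window, the sketch below fits the three coefficients and the constant of the CPU overhead equation by ordinary least squares. It is a sketch under stated assumptions: the solver is a plain normal-equations implementation chosen to stay dependency-free, the function names are hypothetical, and the sample rows are synthetic values generated from known coefficients only to show that the fit recovers them.

```python
def fit_linear(rows, targets):
    """Least-squares fit of target = a*sessions + b*io + d*vms + c (normal equations)."""
    X = [list(r) + [1.0] for r in rows]              # append the intercept column
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w


def predict(w, sessions, io_rate, n_vms):
    """Evaluate the fitted regression function for a future period."""
    return w[0] * sessions + w[1] * io_rate + w[2] * n_vms + w[3]


# Synthetic window: (parallel sessions, I/O rate in thousands/sec, number of VMs).
rows = [(10, 2.0, 2), (20, 1.0, 4), (30, 4.0, 1), (40, 3.0, 5), (50, 6.0, 2), (60, 5.0, 3)]
# Synthetic CPU overhead (%) generated from assumed coefficients 0.05, 0.4, 0.8 and constant 1.0.
targets = [0.05 * s + 0.4 * io + 0.8 * v + 1.0 for s, io, v in rows]

w = fit_linear(rows, targets)
forecast = predict(w, sessions=70, io_rate=7.0, n_vms=4)
print(w, forecast)
```

With real measurement data the fit will not be exact, so the residuals should be inspected before the function is used for prediction.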

Results of the workload characterization and workload forecasting are used as input for the model to predict the impact of planned changes on performance of individual workloads. Virtualization overhead depending on workload growth should be taken into consideration during the performance prediction process.

CHALLENGES OF MODELING VIRTUALIZED MULTI-TIER ORACLE ENVIRONMENTS

Closed queueing network models based on MVA algorithms can be applied to modeling Java EE applications in multi-tier distributed environments with Oracle DBMS.

The queueing network model of clusters of virtual servers with hundreds of VMs, each containing JVMs with a limited number of threads and limited connection pool sizes that affect the flow of requests between the application and DBMS servers, can be very complex. On the other hand, the structure of each individual server model is the same. We will review the use of a hierarchical modeling approach where the lower level tier is treated as an additional I/O device (like a disk) for the current tier.

We apply a central server, closed queueing network model to model each server. An iterative multi-class Mean Value Analysis algorithm has been modified to reflect intra-request parallelism and to take into consideration the software limit on the number of requests that can be processed concurrently by the server (MPL) and the limit on CPU utilization by each workload.
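For readers unfamiliar with Mean Value Analysis, the sketch below shows the textbook exact MVA recursion for a single-class closed network with a think time. It is not the modified multi-class algorithm described above (it has no MPL limit, CPU cap or intra-request parallelism); the function name and the service demands are hypothetical.

```python
def mva(demands, think_time, n_users):
    """Exact single-class MVA for a closed network.

    demands    -- service demand per request at each queueing device (seconds)
    think_time -- average think time between requests (seconds)
    n_users    -- number of users (sessions) circulating in the network
    Returns (response_time, throughput, queue_lengths).
    """
    queue = [0.0] * len(demands)
    resp, thru = 0.0, 0.0
    for n in range(1, n_users + 1):
        # An arriving request sees the queue left by a network with n-1 users.
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        resp = sum(resid)
        thru = n / (think_time + resp)          # interactive response time law
        queue = [thru * r for r in resid]       # Little's law per device
    return resp, thru, queue


# Hypothetical server: CPU demand 20 ms, I/O demand 30 ms, 1 s think time, 20 sessions.
resp, thru, queue = mva(demands=[0.02, 0.03], think_time=1.0, n_users=20)
utilization = [thru * d for d in [0.02, 0.03]]
print(resp, thru, utilization)
```

Because the recursion is cheap, a model of this kind can be re-evaluated many times, which is what makes the iterative and optimization steps discussed later practical.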

Modeling parameters are dynamically recalculated during the modeling iterations to reflect contention for memory between requests:

• Response time of DBMS server workloads is calculated during workload characterization based on measurement data extracted from SQLArea for SQL requests that belong to the corresponding workloads.

• Response time of the workload’s request to the application server includes the application server’s own response time and the total time the request spends in the DBMS server, calculated as the product of the average DBMS response time for the corresponding workload and the number of calls to the DBMS server per request to the application server.

• Measured response time of the Web server includes average response time of the Web server plus corresponding response time of the application server.

• Calibration results synchronize and coordinate all model parameters, correct inaccuracies of measurement data and workload characterization assumptions.

• To be able to calibrate each server independently, calibration should fix external parameters: response time, throughput, think time and the average number of processes generating requests. Only CPU utilization and I/O rate are redistributed between workloads while preserving total node metrics.

Figure 4. Iterative Modeling Algorithm Applied to Interdependent Queueing Network Models of Servers in a Multi-Tier Distributed Environment.

If one physical server hosts both VM/AS and VM/DBMS servers, the interconnection between VMs should be taken into consideration. Bottom-up effect incorporates next tier delay change for calling server workloads. Top-down effect reflects change of concurrency level (equivalent number of parallel sessions) and equivalent think time of workloads on the called server. Queueing network MVA algorithm of each server should take into consideration Regression Analysis coefficients to predict how different planned changes will affect response time (service, queuing and delays), throughput and resource utilization for each workload.

MODEL EVALUATION

Prediction of the performance characteristics is done iteratively for each planning period and each workload, starting with the DBMS server, then AS and WS according to the evaluation scenario, taking into consideration on every iteration the new values of the performance characteristics of the lower and upper tiers.

The lower tier affects the corresponding component of the server’s response time, and the upper tier affects equivalent think time, which is the sum of user think time and the time the request of this workload spends in other servers of the modeled system.

One common system model reflects all interdependences between servers.

For DBMS, assume:

• Think Time DBMS = Think Time User + WS own RT + AS own RT

• Predict RT DBMS for each Wkld and Other Characteristics

For AS, assume:

• RT AS = R cpu as + R i/o as + SUM Delays + R dbms

• Think Time AS = Think Time User + WS own RT
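The tier-by-tier iteration can be sketched as a fixed-point loop over two server models. This is a simplified illustration under several assumptions: only an AS tier and a DBMS tier are modeled (no WS, no extra delays), each tier is a single-class exact MVA model, DBMS demands are expressed per application request (calls per request times demand per call), and all names and numbers are hypothetical.

```python
def mva(demands, think_time, n_users):
    """Exact single-class MVA; returns (response_time, throughput)."""
    queue = [0.0] * len(demands)
    resp, thru = 0.0, 0.0
    for n in range(1, n_users + 1):
        resid = [d * (1.0 + q) for d, q in zip(demands, queue)]
        resp = sum(resid)
        thru = n / (think_time + resp)
        queue = [thru * r for r in resid]
    return resp, thru


def solve_two_tier(as_demands, db_demands, user_think, n_users, tol=1e-9, max_iter=1000):
    """Iterate AS and DBMS models until the DBMS response time stops changing."""
    rt_db = 0.0
    for _ in range(max_iter):
        # Bottom-up effect: time spent in the DBMS is a pure delay for the AS model.
        rt_as_own, _ = mva(as_demands, user_think + rt_db, n_users)
        # Top-down effect: AS own time extends the equivalent think time of the DBMS model.
        rt_db_new, thru = mva(db_demands, user_think + rt_as_own, n_users)
        if abs(rt_db_new - rt_db) < tol:
            return rt_as_own, rt_db_new, thru
        rt_db = rt_db_new
    raise RuntimeError("tier iteration did not converge")


# Hypothetical workload: 3 DBMS calls per application request.
calls_per_request = 3
db_demands = [calls_per_request * d for d in (0.004, 0.006)]   # DBMS CPU, DBMS I/O
rt_as_own, rt_db, thru = solve_two_tier([0.010, 0.005], db_demands, user_think=1.0, n_users=20)
rt_as_total = rt_as_own + rt_db     # RT AS = own CPU and I/O time + time in the DBMS
print(rt_as_own, rt_db, rt_as_total, thru)
```

The same loop extends to more tiers by adding each server's own response time to the equivalent think time of every other server.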

The described approach allows us to model heterogeneous hardware configurations that include servers, and nodes within the servers, with multiple CPUs and shared memory. Host channels, control units, disk channels and disks are modeled as separate servers with queues where requests wait for service.

Increased contention for memory resulting in paging and swapping is modeled by the change in I/O rate based on output of regression analysis.

The model allows us to evaluate how proposed changes of priorities, concurrency, and request parallelism will affect each workload’s service time, queueing time, delays and response time.

Fast model calculation allows us to use a numeric gradient-like optimization algorithm in the multi-dimensional space of the configuration parameters.

PREDICTING IMPACT OF THE WORKLOAD AND DATABASE SIZE GROWTH

The graph below shows the result of performance prediction showing the impact of the workload and database size increase in a multi-tier environment with the two application servers and two DBMS servers shown in Figure 1. As one can see in Figure 5, the response time of one of the workloads will not be affected, but the response time of the Admin Cluster workload will be significantly increased.

Predicted Impact of Workload and Database Growth on Response Time and Throughput without Virtualization

Figure 5. As a result of the expected workload growth the response time for the Physician Cluster and Admin Cluster workloads will grow significantly, but other workloads’ response times will not be significantly affected. Throughput for all workloads except DBora1-Catch All will be reduced by 10-30% in a year.

One of the areas where analytical queueing network models can provide value is selection of candidates for virtualization. Planning virtualization includes analysis of many alternatives, including selection of candidates for virtualization, justification for the size and number of the physical servers required to support selected VMs, making decisions about which VMs should be placed on which physical servers, etc. Let’s review performance prediction results based on measurement data collected on the system shown in Figure 1.

CPU of the Application server #1 is underutilized, while the utilization of Application server #2 CPU is almost 100% (Figure 6). We have unbalanced application servers, and one of the solutions is to evaluate the impact of virtualization: what if we place Application server #1 workloads into VM1 and workloads of Application server #2 into VM2 onto the physical application tier server #2?

Predicted Impact of Workload and Database Growth on CPU Utilization of Application Server #1 and Application Server #2 Without Virtualization

Figure 6. Application server #1 is underutilized, but Application server #2 is saturated. CPU utilization of Application server #1 will be growing from 7% to almost 12%, and Application server #2 CPU utilization will be saturated at the level of 100%.

PREDICTED IMPACT OF VIRTUALIZATION

Replacing the physical servers with VMs and placing the VMs on one host with double CPU capacity, as shown in Figure 7, allows the hypervisor to balance the utilization of resources, but the impact of workload growth on application performance is a concern. Analytical models take into consideration the workload profiles [5] and also the hypervisor overhead, which will increase with workload growth. Performance prediction results shown in Figure 7 illustrate that the response time and throughput of different workloads will be affected differently depending on the workloads’ profiles. Prediction results set realistic expectations, reduce the risk of surprises and provide the information needed to review different proactive performance management measures, including changes of workloads’ priorities, level of concurrency, resource allocation, etc.

Figure 7. Physician Cluster and Admin Cluster workloads’ response time and OE workload throughput will be very sensitive to workload growth, and CPU utilization of the host server will increase from 60% to 85% in a virtual environment with two VMs representing the workloads from Application servers 1 and 2.

PREDICTED IMPACT OF CHANGE OF WORKLOAD PRIORITY

Performance prediction results can be used to identify when the SLO for a critical workload is not satisfied and where the potential problem is (the application server or the DBMS server, CPU or I/O, service time or queueing time, delays caused by concurrency limitations, etc.). One of the possible tuning options is to increase the dispatching priority of the critical workload. Figure 8 illustrates the performance prediction results reflecting the impact of the proposed change of priority for one of the workloads. As a result, this workload will see an improvement in response time and throughput, but the change will negatively affect the performance of other workloads. The model takes into consideration that all workloads compete for resources, and an improvement in one place can create a bottleneck in another. Sometimes improved performance on an application server can create a bottleneck in a DBMS server and vice versa.

Figure 8. Increase of the priority for VM hosting Physician Cluster workload will improve response time for this workload, but other workloads, especially Admin Cluster workload, will be negatively affected. As a result of the change, throughput and CPU utilization by each workload will be affected as well.

PREDICTED IMPACT OF CONCURRENCY LIMITATION

Enforcement of a concurrency limitation for one of the workloads limits the number of concurrent requests for this workload and can have a very different impact on all other workloads. A reduction in the number of JVM threads in an application server can limit resource consumption for one workload, but increase consumption of resources by other workloads using a different JVM. It can move a bottleneck from the application server to the DBMS server. Performance prediction results in Figure 9 illustrate the change in response time, throughput and CPU utilization as a result of a change in the level of concurrency for the Physician Cluster workload.

Figure 9. Implementation of throttling and limitation of the level of concurrency for Physician Cluster will increase response time for the Physician Cluster workload, but will significantly improve the response time for all other workloads.

PREDICTED IMPACT OF CHANGE IN CPU CONSUMPTION LIMIT

Limitation of the CPU utilization by one of the workloads can have a similar impact. As shown in Figure 10, setting a limit on CPU consumption on the DBMS server for the OE workload will increase the response time and reduce throughput for the OE workload, but significantly improve performance for other workloads, especially DBora1. Modeling results take into consideration not only contention for resources of the application server, but also the impact on performance of the DBMS servers.

Figure 10. Setting a limit on CPU consumption on the DBMS server for the OE workload will increase the response time and reduce throughput for the OE workload, but significantly improve performance for other workloads, especially DBora1.

PREDICTED IMPACT OF THE HARDWARE UPGRADE

Figure 11. As a result of adding a new node to Oracle RAC system, response time improved and throughput increased. CPU time consumed increased because the throughput increased.

Performance prediction results allow evaluation of the impact of the proposed Oracle RAC hardware upgrade on each workload. Shared disk subsystems, variable degree of parallelism, contention for the interconnect, and memory limitations can affect Oracle RAC scalability. Potentially, it also can limit the ability of RAC to provide consistent service for dynamic environments with mixed workloads.

Additional nodes will allow redistribution of requests between nodes, so the arrival rate to each node will diminish, which will reduce average CPU utilization, and each node will process requests faster. Users will wait less time for the response and will be able to generate more requests. This is positive, because the system will be able to process more business transactions: system throughput will increase.

PREDICTED IMPACT OF ORACLE TUNING - New Index Creation Impact

According to the performance prediction results, after adding a new index, performance of one of the workloads will be improved, but several workloads that do not include SQL accessing tables with the new index will experience performance degradation.

Figure 12. Prediction results show that creating the new index is expected to improve response time by different amounts for different workloads. The OE and DBora1 workloads will benefit the most, while Admin Cluster and Physician Cluster will not. The modeling expectation provides a baseline for comparing actual results with expected results and for verifying that the goal of the change has been reached.

PREDICTED IMPACT OF ADDING NEW VM CONTAINING NEW APPLICATION DENTIST ON NEW VIRTUAL SERVER

Tuning will reduce contention for storage subsystems, and DBMS throughput will increase. Suddenly the maximum number of JVM threads will become a bottleneck. An increase in the number of JVM threads will increase the number of concurrent requests within the application server and the amount of heap memory used by all concurrent requests. Heap size is limited to 2 GB, so creation of an additional JVM will be required to support the increased number of concurrent requests. Creation of a JVM within the same application server will increase contention for the CPU, and adding a new physical application server will be required. Creation of a new JVM on a new application server will balance the application servers’ utilization and reduce the time requests spend within the application tier, but it will increase the arrival rate of requests to the DBMS tier and increase contention for the DBMS server again. The DBA can decide to increase the degree of parallelism or change the priority and resource allocation for one of the workloads, but this will affect different workloads differently. Some of them will benefit and some will not.

Modeling results show that adding a new VM containing a new application accessing data from the same DBMS server will increase the contention for the DBMS and affect performance of all workloads.
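The relationship between throughput, JVM threads and heap in the chain of effects above can be illustrated with simple arithmetic. The sketch below uses Little's law to estimate the number of concurrent requests and then the number of JVMs needed under the 2 GB heap limit; the per-request heap footprint and the throughput figures are hypothetical, and a real JVM also needs heap for data that is not tied to in-flight requests.

```python
import math


def jvms_required(throughput_rps, response_time_s, heap_per_request_mb, heap_limit_mb=2048):
    """Estimate threads and JVM count needed to carry a given request load."""
    concurrent = throughput_rps * response_time_s        # Little's law: N = X * R
    threads = math.ceil(concurrent)
    heap_needed_mb = threads * heap_per_request_mb
    jvms = max(1, math.ceil(heap_needed_mb / heap_limit_mb))
    return threads, jvms


# Hypothetical figures: tuning raises throughput from 300 to 450 requests/sec
# at the same 0.5 s response time in the application tier, 20 MB heap per request.
before = jvms_required(300, 0.5, 20)
after = jvms_required(450, 0.5, 20)
print(before, after)
```

In this made-up case the higher throughput needs more threads than one additional JVM can hold, which is the point at which CPU contention on the same server, or a new server, has to be evaluated.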

Figure 13. Predicted impact of adding a new VM containing a new application DENTIST on a new virtual server accessing data from the same DBMS server

OPTIMIZATION AND AUTOMATION OF THE PERFORMANCE MANAGEMENT DECISIONS

Each workload has unique performance, resource utilization and data usage profiles, and changing the hardware and software configurations, applications and database tuning can affect performance of workloads differently. Finding the best configuration and rules defining concurrency, priority, resource allocation and migration of VMs and JVMs between virtualization servers to support individual SLOs is a very difficult task.

Analytical models can be used to evaluate different options to justify workload management, performance tuning and capacity planning decisions [7,8,9].

To make the model independent of the number of servers in the system, we can build models hierarchically. Each server is modeled by a separate queueing model in which the servers it calls are included as additional data sources, while the servers that call it determine the equivalent number of users (sessions) and the equivalent think time. The whole model is solved iteratively, server by server, repeating the pass until convergence.
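The hierarchical solution can be illustrated with a minimal sketch. It assumes only two tiers (one application server and one DBMS server), a single workload, and a closed model with a fixed number of users; each tier is solved as a single-station model by exact Mean Value Analysis, with the other tier's response time added to the think time. The function names, the parameter values and the choice to keep the number of users the same in both sub-models are assumptions made for illustration, not the algorithm used to produce the results in this paper.

```python
def station_response(n_users, think_time, demand):
    """Exact MVA for one queueing station plus a pure delay (think time).

    Returns the response time at the station for n_users sessions.
    """
    queue = 0.0
    resp = demand
    for n in range(1, n_users + 1):
        resp = demand * (1.0 + queue)        # arriving request sees the queue left by n-1 users
        throughput = n / (think_time + resp)
        queue = throughput * resp            # Little's law
    return resp


def solve_two_tier(n_users, think_time, d_app, d_db, tol=1e-9, max_iter=1000):
    """Solve the two tiers one at a time and iterate until the values stop changing.

    Each tier sees the other tier's response time as extra think time.
    Returns (app response time, DBMS response time, iterations used).
    """
    r_app, r_db = d_app, d_db                # start from the no-contention values
    for iteration in range(1, max_iter + 1):
        new_app = station_response(n_users, think_time + r_db, d_app)
        new_db = station_response(n_users, think_time + new_app, d_db)
        change = abs(new_app - r_app) + abs(new_db - r_db)
        r_app, r_db = new_app, new_db
        if change < tol:
            return r_app, r_db, iteration
    raise RuntimeError("hierarchical model did not converge")


# Hypothetical inputs: 20 sessions, 5 s think time,
# 0.05 s service demand on the application server, 0.2 s on the DBMS server.
r_app, r_db, iterations = solve_two_tier(20, 5.0, 0.05, 0.2)
print(f"app {r_app:.3f} s, DBMS {r_db:.3f} s, converged in {iterations} iterations")
```

Because each sub-model contains one server only, adding a tier adds one more sub-model to the pass rather than enlarging a single monolithic model. Convergence can be slow when several tiers are close to saturation at the same time, which is why the sketch caps the number of iterations.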

Dynamically adjusted evaluation scenarios can be used to evaluate different software parameters and find the level of concurrency, priority and resource allocation for each workload that satisfies every workload's SLO while minimizing Total Cost of Ownership (TCO).

The algorithm starts by evaluating the impact of software parameter changes. If changing the software parameters is not sufficient to satisfy the SLOs, migration of VMs and JVMs to balance resource usage is evaluated. If that is still not enough, a hardware capacity increase on the overutilized servers is considered.
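The escalation order described above can be sketched as a simple search that tries the cheapest class of change first. The `predict` callable stands in for a run of the multi-tier model, and the change names and predicted response times below are invented for illustration only; they are not results from this paper.

```python
def first_sufficient_change(levels, slo_seconds, predict):
    """Return (level name, change) for the first change predicted to meet the SLO.

    levels  -- ordered list of (level name, list of candidate changes),
               cheapest class of change first
    predict -- callable mapping a change to its predicted response time in seconds
               (a stand-in for running the model)
    """
    for level_name, changes in levels:
        for change in changes:
            if predict(change) <= slo_seconds:
                return level_name, change
    return None  # no evaluated change satisfies the SLO


# Hypothetical predicted response times (seconds) for each candidate change.
predicted = {
    "raise OLTP priority": 3.1,
    "move VM2 to host B": 1.8,
    "add 4 CPUs to host A": 1.2,
}
levels = [
    ("software parameters", ["raise OLTP priority"]),
    ("VM/JVM migration", ["move VM2 to host B"]),
    ("hardware upgrade", ["add 4 CPUs to host A"]),
]

result = first_sufficient_change(levels, 2.0, predicted.get)
print(result)  # ('VM/JVM migration', 'move VM2 to host B')
```

With a 2-second SLO, the software change is rejected, the migration is accepted, and the hardware upgrade is never evaluated, which is the intent of ordering the levels by cost.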

One can describe such optimization as multi-criteria (each workload has its own SLO) and multi-dimensional (system and workload software parameters plus hardware parameters). It is not possible to optimize for any workload separately, because all workloads use the same physical resources and thus affect each other. There is not even an approximate analytical expression that links the variables to the goal functions. We have to run the multi-tier model for each combination of software and hardware parameters of all workloads and all servers to get the corresponding performance metrics and compare them with the SLOs.

The following steps make the search for the optimum solution more effective:

1. Select the workload whose SLO will be violated first.

2. Select the server corresponding to the greatest component of the workload’s system response time.

3. Look at the components of the workload’s request elapsed time on this server and take the corresponding action: if the request spends most of its time on (or waiting for) the CPU, increase the workload’s CPU limit or priority; if the request spends most of its time waiting for an execution thread, increase the number of available threads if the available memory allows (this applies to the application tier; the DBMS has similar concurrency limits); and so on.

Each such change affects all workloads, so after the model is rerun a different workload may become the first to violate its SLO, or a different server may become the bottleneck.
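One pass through steps 1 to 3 can be sketched as follows. The data layout (a `first_violation` period per workload and per-server elapsed-time components in seconds), the component names and the rule table are assumptions made for illustration; in practice these values would come from the model's prediction results.

```python
# Hypothetical rule table: dominant elapsed-time component -> tuning action.
ACTIONS = {
    "cpu": "increase the workload's CPU limit or priority",
    "thread_wait": "increase the number of threads if memory allows",
}


def next_tuning_action(workloads):
    """Pick the next change to evaluate, following steps 1-3.

    Each workload is a dict with:
      name            -- workload name
      first_violation -- first period in which the SLO is predicted to be
                         violated, or None if it is never violated
      servers         -- {server: {component: seconds}} elapsed-time breakdown
    """
    violating = [w for w in workloads if w["first_violation"] is not None]
    if not violating:
        return None  # every SLO is met, nothing to tune

    # Step 1: the workload whose SLO is violated first.
    worst = min(violating, key=lambda w: w["first_violation"])

    # Step 2: the server contributing most to that workload's response time.
    server, components = max(
        worst["servers"].items(), key=lambda item: sum(item[1].values())
    )

    # Step 3: the dominant component on that server selects the action.
    component = max(components, key=components.get)
    action = ACTIONS.get(component, "no rule defined for this component")
    return worst["name"], server, component, action


# Invented sample prediction for three workloads.
sample = [
    {"name": "OLTP", "first_violation": 3,
     "servers": {"app1": {"cpu": 0.2, "thread_wait": 0.9},
                 "db1": {"cpu": 0.4, "io": 0.3}}},
    {"name": "REPORTS", "first_violation": None,
     "servers": {"db1": {"cpu": 1.5, "io": 2.0}}},
    {"name": "BATCH", "first_violation": 5,
     "servers": {"db1": {"cpu": 3.0, "io": 1.0}}},
]
print(next_tuning_action(sample))
```

The function returns a single action because, as noted above, every change shifts the picture for all workloads; the model has to be rerun and the selection repeated before the next change is chosen.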

If some SLOs are still not satisfied after the workload controls have been adjusted, we turn to the server-level software parameters, which are available mostly for virtual servers: CPU share, memory share (which affects the swapping I/O rate), etc. Finally, if changing software parameters and moving VMs and JVMs cannot satisfy the SLOs, a hardware upgrade is evaluated.

SUMMARY

The complexity of virtualized multi-tier distributed Oracle environments makes it difficult to plan and manage dynamic environments effectively.

We presented a methodology and approach to modeling complex, multi-tier distributed environments with virtualization.

We demonstrated how modeling and optimization improve effectiveness and reduce the risk of performance surprises during planning and management of virtualized, multi-tier distributed Oracle environments.

We reviewed how to model the impact of workload growth and other changes on hypervisor overhead.

We demonstrated how performance prediction and optimization technology supports evaluating different options, setting realistic SLOs, finding virtualization candidates, predicting the impact of workload growth, new VMs and new applications, and justifying VM migration and hardware upgrades for application tier servers and Oracle DBMS servers. It also provides a framework for organizing a continuous, proactive performance management process.

REFERENCES

1. B. Zibitsker, IOUG 2008, Reducing Risk of Surprises in Changing Multi-tier Distributed Oracle RAC Environment

2. B. Zibitsker, DAMA 2007, Enterprise Data Management and Optimization

3. B. Zibitsker, CMG 2008, Hands on Workshop on Performance Prediction for Multi-tier Distributed Environments

4. J. Buzen, B. Zibitsker, CMG 2006, Challenges of Performance Prediction in Multi-tier Parallel Processing Environments

5. B. Zibitsker, G. Sigalov, A. Lupersolsky, Modeling and Proactive Performance Management of Multi-tier Distributed Environments, International conference "Mathematical methods for analysis and optimization of information and telecommunication networks" (Byelorussian Winter Workshop in Queueing Theory – 2007)

6. Mark Friedman and Stephen Marksamer, Measure IT, March 2007, A Realistic Assessment of the Performance of Windows Guest Virtual Machines

7. Jorge Nocedal, Stephen J. Wright, Numerical Optimization, ISBN 0-387-98793-2

8. Michael W. Trosset (Adjunct Associate Professor, Department of Computational & Applied Mathematics, Rice University, Houston, TX), Virginia Torczon (Assistant Professor, Department of Computer Science, College of William & Mary, Williamsburg, VA), Numerical Optimization Using Computer Experiments

9. Charbonneau, High Altitude Observatory, National Center for Atmospheric Research, Boulder, Colorado

10. Blog: bezsys.blogspot.com
