Web-application’s performance testing
Ilkka Myllylä
Reaaliprosessi Ltd
Agenda
Research results about web-application scalability
Performance requirements specification
Load testing tools
Load test preparation
Load test execution
Analysis
– What is a bottleneck and how to find it?
– Optimization and tuning
– What is the root cause and how to fix it?
Research results about web-application scalability
Is scalability a problem?
Research results (Newport Group Inc.): Scalability in production
– as required: 48%
– worse than requirements: 52% (on average 30% worse)
Schedule delays and cost overruns
– as required: 1 month and 70 000 euros
– worse: 2 months and 140 000 euros
What explains the bad results?
– the timing of performance testing is the most important factor
Performance testing timing
[Chart: share of projects with performance as required vs. worse, by when performance testing started – plan and design, beginning of implementation, end of implementation, after deployment, not tested]
Early start of testing is profitable
Efficiency
– tests with one and 2-5 users reveal on average 80% of bottlenecks
Costs
– late fixing costs many times more -> for example, the risk of a wrong architecture
When to test performance ?
Architecture validation
Component performance
New application system performance test
When major changes are made to the application
Realistic performance testing is not easy
Too optimistic results are a common problem
– "It worked in tests!" say 32% (Gartner Group)
Bad environment
– testing environment not equal to production
Bad design and implementation
– load testing tool not used the right way
Wrong tool for the job
– load test tool is not functional / does not include the features needed
Performance requirements specification
Performance requirements testability
Requirements should be exact enough to know how to test them
Usage information
– different users and their profiles?
– transaction amounts?
Response times
– detailed response time requirements?
Technical constraints
– user terminals, software, connection speed?
– technical architecture
– database size
Security and reliability constraints
Completeness of requirements
Sensibility check: are the customer's requirements / calculations right and sensible?
Completeness: what information is missing?
– checklist
System usage information
Goal: a real usage simulation that is close enough to reality
Business / use case scenarios are a good starting point
Close enough?
– not all functions are tested, but the most important and most used ones are
– 3-5 scenarios are usually enough for one application
Usage information: is history info available?
– yes: future scenarios?
– no: estimation and calculation with known facts
Transaction profile
Business scenario     Trans/h typical day   Trans/h peak day   Web server activity   DB activity   Risk
Log in                70                    210                Heavy                 Light         High
Create new account    10                    15                 Moderate              Moderate      Low
Create order          130                   180                Moderate              Moderate      Moderate
Update order          20                    30                 Moderate              Moderate      High
Ship order            40                    90                 Moderate              Heavy         High
User profiles
Business scenario / user   Order Entry Clerk   Order Clerk   Shipping Clerk   Overall
Concurrent sessions        15                  5             10               30
Log in                     15                  5             10               30
Create new account         5                   10            -                15
Create order               100                 30            -                130
Update order               10                  20            -                30
Ship order                 -                   -             90               90
Do the customer's requirements / calculations make sense?
Expectations can be too high
– "Response time for all functions should be below 2 s"
– "In the old system everything was very fast"
Different functions are not equal
– requirements should be set individually for important use cases and their functions
Costs should be involved
– less response time, more costs
– technically "challenging" requirements?
How long is the user willing to wait?
Not a simple thing
– different users have different requirements for the same functions
Research results (Netforecast Inc)
– two important factors: frequency and amount
– strong variation between applications and functions
– a satisfactory average is 10 s
Frequency of use
Frequency: how often does the user need to use the function?
– the more often a user needs a function, the less he/she is willing to wait
Example 1: Frequency
– use case: search customer info
– A. once an hour: response time requirement 5 s
– B. once a month: response time requirement 30 s
Amount of information
Amount: how much valuable information do we get as a result?
– the more information one gets, the more he/she is willing to wait
Example 2: Amount
– A. saving function for a few input fields: response time requirement 3 s
– B. search function for product information (100 fields): response time requirement 10 s
Response time requirements
Business scenario     Maximum acceptable 95%   Satisfactory 95%
Log in                10                       5
Create new account    10                       5
Create order          8                        3
Update order          8                        3
Ship order            15                       8
Performance requirements in contracts
Requirements for performance testing
– appendix about performance requirements
– a test engineer validates testability
– both customer and supplier benefit
Good for the customer
– no extra costs late in the development cycle
Good for the supplier
– no new "surprise" requirements late in the development cycle
What about performance testing tools?
– an expensive investment
Load testing tools
Types of performance testing
Concurrency
– one to several users concurrently in risk scenarios
– normally manual testing
Performance
– requirements for scalability?
– a load testing tool is needed
Load
– max number of users?
Reliability
– is long usage possible without degrading performance?
Load testing tools - markets
Mercury Interactive is the market leader (> 50%)
The big six have 90% of the market
Hundreds of small companies
Prices for major tools are quite high
Growing market
Load testing tools – options to get one
Buy a licence (+ consulting)
– usually priced by virtual user count
Buy a service
– load generation done externally
Rent a licence (+ consulting)
– needed only for a limited time and maybe for just this one project
Load testing tools – main functions
1. Recording and editing scripts
2. Scenario design and running
3. Analysis and reporting
Load testing tools - features
Lots of obligatory features
– script recording
– parametrizing
– scenario design
  - ramp-up and weighting of different scripts running concurrently
– online results / feedback
  - error detection
  - transaction-level response times
– http protocol support
Load testing tools - features
Lots of usually needed features
– distributed clients
– Unix/Linux clients
– multi-protocol support
– multi-speed support
– multi-browser support
– server monitoring
– content validation
– dynamic URLs supported
Features - Make or buy
Some features can be implemented manually
– server monitoring
– analysis and reporting
Usability
– the best tools are really easy to use
– others need lots of work and "programming experience"
Workarounds
– more features than promised, with clever tricks
Good tool combination
Separate load and monitoring tools
– even from different vendors?
– how about profiling?
Script reusability
– same scripts for functional and load testing
Load test preparation
Testing environment
Same as the production environment
– are other applications sharing the same resources (firewall etc.)?
Controllable
– no outside disturbance
"Basic" optimization made
What is basic optimization?
– server parameters are validated by the responsible persons and a list of values is given to the load testers
– database: SQL performance checked and the necessary indexes exist
Without basic optimization a load test is a waste of time
– just the first obvious bottleneck is found and no real information is gained
Load test cases
First each script separately with ramp-up usage
– easier to see straight away what the problem is
Real usage scenario with weighted scripts
– usually many test runs before goals are achieved
– time for repeated tests
– usual usage first, then special occasions
What-if scenarios
– one change at a time to see the influence of the changed factor
Risk-based testing
– testing from different locations and connection speeds
– hacker testing -> DoS attacks etc.
Script selection
User or process scripts?
– both are possible
Example: Petshop application – user-oriented
– create order - returning customer
– create order - new customer
– searching for a customer
Example: Petshop application – process-oriented
– registration
– create order
– search order
Script recording and editing
Script = program code (C, Perl etc.) to execute a test automatically
Basic recording
– execute the test case with recording on
– check and set recording options before starting
– generates a script
Editing
– parametrizing
– transactions
– think time changes
– checkpoints
– comments
Parametrizing
A recorded script includes hard-coded input values
– if we execute a load test with hard-coded values, the results are not realistic (too good or too bad)
Parametrizing = different input values for different virtual users
– all users of the system have different user information
– more realistic load
Test data generation
Parameter data with the right distribution
– generation of test data into text files which the load tools can use
A realistic amount of data in the databases
– backup and restore procedures
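As a hedged sketch, parameter data with the right distribution can be generated from the transaction profile; the weights below reuse the typical-day trans/h figures from the earlier table.

```python
import random

def generate_scenario_mix(n, weights, seed=1):
    """Draw n scenario labels so their frequencies follow the given
    weights (e.g. trans/h from the transaction profile)."""
    rng = random.Random(seed)  # fixed seed keeps the generated data repeatable
    names = list(weights)
    return rng.choices(names, weights=[weights[k] for k in names], k=n)

# Typical-day weights taken from the transaction profile table
mix = generate_scenario_mix(
    1000, {"log_in": 70, "create_new_account": 10, "create_order": 130,
           "update_order": 20, "ship_order": 40})
```

The resulting labels (and matching input rows) can then be written line by line into the text files the load tool reads.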
Transaction
Detailed response time information inside the script
Exact execution times and problem transactions can be seen
– a script with 10 transactions -> when response time increases, are all the transactions equally slow or just some?
Checkpoint
Functions in the script that check the correctness of results during execution
In some tools checkpoints can be set automatically
– others need manual implementation
They find errors that would otherwise not be seen
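A minimal Python sketch of both ideas (the helpers are illustrative, not a specific tool's functions): transaction markers time each named step, and a checkpoint fails the step when the expected content is missing.

```python
import time

def run_transaction(name, action, results):
    """Time one named transaction, like a tool's start/end transaction markers."""
    start = time.perf_counter()
    response = action()
    results.append((name, time.perf_counter() - start))
    return response

def checkpoint(response, expected_text):
    """Verify correctness of the result, not just its speed:
    an error page can come back fast and still be a failure."""
    if expected_text not in response:
        raise AssertionError(f"checkpoint failed: {expected_text!r} not in response")
    return True
```

Per-transaction timings show which of the steps slows down under load, while checkpoints catch the errors that response times alone would hide.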
Think time
Think time = the time a user spends looking and typing input before making the next request to the server
An important parameter when estimating usage
– less think time means more frequent requests and more load on the servers
Example
– 100 users logged into the system with 10 s average think time = 1 user makes 6 transactions/minute, so 100 users make 600 t/min = 10 tps
– if think time is 30 s, the load is about 3 tps
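The example's arithmetic as a small sketch (the formula is the slide's own rough approximation, which ignores response time):

```python
def estimated_tps(users, think_time_s):
    """Each user fires roughly one request per think-time cycle,
    so the total load is users / think time."""
    return users / think_time_s

# 100 users with 10 s think time -> 10 tps; with 30 s -> about 3.3 tps
```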
Comments
Making and testing scripts = software development
– comments for maintenance
– the tool's own naming is not always good -> changes are needed for readability
Script testing
Execute each script successfully on its own
– at least twice
– with checkpoints and parameters
Scenario creation and testing
Usage information and ramp-up of the different scripts in the same scenario
Designed counters available and working for the test
A test run with a couple of users
Ramp-up
Ramp-up
– the user amount increases little by little
– in real life amounts usually do not change immediately
– when the user amount increases little by little, it is easy to see how response time and utilization develop
– stabilize before the next level of load
Example: 1000 users use the system at the same time
– first 50 users, then 50 more every 10 minutes until response time is bad or errors start to increase
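The example's ramp-up can be written out as a schedule (a sketch; real tools express this in their scenario designer):

```python
def ramp_up_schedule(target_users, initial=50, step=50, interval_min=10):
    """Return (minute, active users) pairs: start with `initial` users
    and add `step` more every `interval_min` minutes up to the target."""
    schedule, users, minute = [], initial, 0
    while True:
        schedule.append((minute, users))
        if users >= target_users:
            break
        minute += interval_min
        users = min(users + step, target_users)
    return schedule
```

In practice the ramp is stopped early once response times degrade or errors start to increase.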
Collection of performance counters
The responsibility for collecting performance counters is usually divided between
– administrators
– developers
– testers -> load tool monitors
Load tool monitors should be tested
– it is not always as easy to get information from the servers as the vendor says
Load test execution
Reset and warm-up
Reset the situation
– old tests should not influence the results
Warm-up
– before the actual test, some usage needs to be done
Synchronizing people involved
The test manager gets a "ready" from all people involved
When the test ends, synchronize again to stop the monitors
Collection of results
Active online following
Counters
– following the online monitors
– response time and throughput
– client and server system counters (CPU, memory, disk)
Error messages
– if lots of errors occur, the test should be stopped
– errors often occur before the application runs out of system resources
Response time
The most important counter for performance
Response time = the time a user needs to wait before being able to continue
Industry standard for response time: 8 seconds
With response time, usage information is needed too
– the simultaneous user amount and what most of them are doing
Example
– 100 simultaneous sessions, 50% update and 50% search
– response time requirement 4 s for 95% of bill inserts and updates; for the other functions the requirement is 8 s
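Percentile requirements like "4 s for 95% of inserts" are checked against the measured samples; a nearest-rank percentile sketch:

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the value at or below which `pct`
    percent of the sorted samples fall."""
    ordered = sorted(samples)
    rank = -(-len(ordered) * pct // 100)  # ceiling division
    return ordered[int(rank) - 1]

def meets_requirement(samples, limit_s, pct=95):
    """True if the pct-percentile response time is within the limit."""
    return percentile(samples, pct) <= limit_s
```

Using a percentile rather than the average keeps a few extreme outliers from dominating the verdict while still ignoring the slowest 5% of requests.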
Throughput
Another important counter for validating scalability
The amount of transactions, events, bytes or hits per second
The usual counter is tps (= transactions per second)
Requirements can be stated as a throughput value
A bottleneck can be seen easily at the saturation point of throughput
Throughput and response time
Performance ”knee”
[Chart: Performance vs. users – response time plotted against 20-180 virtual users, showing the performance "knee" where response time turns sharply upward]
What is a bottleneck and how to find it?
What is a bottleneck and why is it important?
Any resource (software, hardware, network) which limits the speed of the application
– below requirements
– from good to even better (changing requirements)
A bottleneck is a result
– the reason should be analysed and fixed
– for example, disk I/O is the bottleneck and the fix is to distribute the log file and database files to different disks
"A chain is as strong as its weakest link"
– the application is only as fast as the worst bottleneck permits
How can we identify a bottleneck?
Using the application and measuring
– response time and throughput
– resource usage measuring
One user
– relative slowness and resource utilization (= not yet a bottleneck, but possible to see that a bigger amount of users will cause one)
Several users
– trends already possible to see (= 1 user 1 s, 5 users 3 s, 1000 users ? s)
Required amount of users
– the actual max usage scenario
Not so nice features of bottleneck
A real bottleneck influences the load of other resources
– "everything influences everything"
– when the disk is the bottleneck, the processor looks like one too (but is not)
– when the real bottleneck is fixed, the other problem will be solved too
– if we just increase processor power, it does not help
A real bottleneck "creates" other problems but hides them too
– the first bottleneck should be solved in order to see the next real one
Amount and finding of bottlenecks
One application usually has many bottlenecks
– many changes are needed in order to fix them all
One test finds only one bottleneck
– many iterations are needed in order to fix all bottlenecks
Most common bottlenecks in web-applications
[Chart: distribution of bottlenecks in web-applications by percentage – application server (about 40%), database server (about 30%), network (about 20%), web server (about 10%)]
Server counters and profiling
What counters and log/profile information do we need in order to see the bottleneck and the root cause?
Two levels of counters
– system counters, e.g. CPU utilization %
– application software counters, e.g. Oracle cache hit ratio %
Log/profile information
– detail-level resource usage information
Collecting system counters
Memory, CPU, network and disk counters can be collected
– with operating system dependent programs like Windows Performance Monitor or Unix's sar, top etc.
– with load testing programs like LoadRunner or QALoad
Collecting with load testing programs is easier and the information comes in an easy-to-analyze/report form
Counters for all four resources are needed
Interpreting system counters
The most important counters
– CPU – queue length tells if it is too busy
– disk – queue length tells if it is too busy
– network – queue length tells if it is too busy
– memory – hard page faults (disk) tell if it is too small
However, one counter is not enough
– to be sure, more counters are needed
– to see the root cause, more counters are needed
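A sketch of that rule of thumb (the threshold is illustrative): flag resources whose queue-length counters stay high on average, remembering that a flagged resource may only be a side effect of the real bottleneck.

```python
def suspect_bottlenecks(queue_samples, threshold=2.0):
    """Return the resources whose average queue length exceeds the
    threshold -- candidates only, since side effects of the real
    bottleneck can inflate other queues too."""
    return [resource
            for resource, values in queue_samples.items()
            if sum(values) / len(values) > threshold]
```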
Application counters
Collecting with load testing programs is easy and the information comes in an easy-to-analyze/report form
However, not all counters are available to load testing tools
– online monitors (WebSphere, WebLogic) can be used to complement the information
Different products have different counters
– a need to understand that particular product
Profiling tools
Collecting exact information at call level
– memory usage
– disk I/O usage
– response time
Collecting the information may influence the results quite a lot
– one solution is to make two test runs: one without logging/profiling and the other with them
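In Python, call-level response times can be collected with the standard-library profiler; the two-run idea is then simply to execute the workload once under the profiler and once without, and compare.

```python
import cProfile
import io
import pstats

def workload():
    """Stand-in for the code under test."""
    return sum(i * i for i in range(10_000))

def profile_report(func, top=5):
    """Run func under cProfile and return the top functions by
    cumulative time as a text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buf = io.StringIO()
    pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(top)
    return buf.getvalue()
```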
Example 1 : One clear bottleneck
One of the four system resources is busy
– easy to see the bottleneck
Example 2: More than one system resource looks bad
However, only one resource is the real bottleneck
– the others are "side effects" of the real bottleneck
Example 3: None of the system resources looks bad
Where is the bottleneck then?
– usually some application software works inefficiently internally, or an interface queue to external systems does not work efficiently
What is the root cause and how to fix it?
How to see the root cause?
Application-level information is usually needed and always good to have
– software code problems can be solved when we see which function is slow
Some root causes are easy to see, while others need sophisticated monitoring and profiling
Software implementation
Database server
– bad SQL from a performance point of view (works, but not efficiently)
– no indexes, or not good enough indexes used
Application server
– object references not freed -> too many objects in the heap
– bad methods used from a performance point of view
The idea is to decrease the load on the hardware resources
Efficiency
Inefficient use of the existing hardware resources
– parametrizing and configuring help
Capacity
A resource is too slow to handle events fast enough
More resources or reconfiguring the existing resources
– e.g. moving CPU capacity from the web server to the db server
Hard constraints and requirements
The client's complicated business logic requirements
– too many bytes needed in the user interface (slow network speed)
– too many different sources of information needed (synchronous)
– long transactions; a single function needs many chained updates
Security requirements
– too many requests to the web server -> encrypted network traffic
Online data needed
– many big updates needed immediately
Bad design
Application tiers
– distribution of tiers possible (= EJB vs. pure Servlet)
Technology
– too much information in the session object
Infrastructure
– incompatible versions from different vendors, from a performance point of view
– needed functionality not available (= distribution not supported)
Tuning
Tuning
– application
– server software
– operating system
– network
Usually a good choice
– fast to do
– small regression risks
Usually tuning is not enough
– changes are needed
Changing
Application code
Application software
Hardware
Network infrastructure
Tuning vs change
Tuning is not so risky
A change is not always possible
In practice both are valid and equally considered
Example : Tuning vs change
A sales system has an application server processor bottleneck
It could be removed with
– more processing power
– less processing needed -> application code change
If the application logic needs to be changed a lot
– more processing power is chosen
If the application logic needs to be changed only a little
– an application code change is chosen
If both are fast, easy and the costs are low
– both are chosen
Removing bottlenecks
Idea: remove the root causes of bottlenecks one by one
Rerun the same test to see the influence
Testing part of system
Sometimes it is difficult to see the bottleneck and the root cause
– more information is needed in order to understand the system better
– testing just one suspect at a time is usually possible but can take much effort
Testing only one part of the system at a time is the ultimate way
Top-down optimizing
When there is plenty of time
– not very fast, but effective
Idea: optimize one level at a time
-> level-by-level readiness
– no jumping between levels
Application code
Application software
Operating system
Hardware
Memory–cache-pool-area usage
Idea: the data or service that the application needs is already in memory as much as possible
System level
– big enough memory -> not much swapping needed
– a proxy server caches content
Application level
– big enough database connection pool -> new objects not needed
– big enough database sort area -> not much swapping needed
Connection and thread pools
Create many objects at startup
– a new user gets an object from the pool
– after use the object returns to the pool
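A minimal pool sketch in Python using a thread-safe queue (illustrative only; real servers ship their own database connection pool implementations):

```python
import queue

class ObjectPool:
    """Create the expensive objects once at startup; users borrow
    and return them instead of constructing new ones per request."""

    def __init__(self, factory, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(factory())

    def acquire(self, timeout=None):
        """Block until an object is free (this also bounds concurrency)."""
        return self._pool.get(timeout=timeout)

    def release(self, obj):
        self._pool.put(obj)
```

The fixed pool size is itself a tuning parameter: too small and requests queue for a connection, too large and the database server is overloaded.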
Synchronous and asynchronous traffic
If possible, actions could happen asynchronously (= no need to wait for the action to be ready)
Interfaces to other systems
Distributing load
Between servers
– load balancing
Inside a server
– CPUs
– disks
Between networks
– segments
Cut down features
Sometimes the only possibility is to cut down features and requirements
– the deadline is too near to do other optimizing
– the costs or risks of doing anything else are too big
Making recommendations for corrective actions
Interpreting the results usually needs several different persons
– however, understanding and criticality are needed
The results should be clear
– the usual "It is not our software but yours" conversations can be avoided if nobody can question the results and recommendations
– there is also a need to show where the problem is not!
Example : Internet portal
Application: many background systems produce data for this portal
Response time in the USA is 5 s, where connections are fast
In Asia every connection takes 2 s, and moving elements between the server and the client is slow too
Logic: 12 frames inside each other
Result: opening the first page takes 12 * 2 s + 30 s = 54 s
Requirement: 8 s
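The case's arithmetic as a one-line check (the 2 s per-frame latency and 30 s of transfer are the case's own figures):

```python
def page_load_time_s(frames, per_frame_latency_s, transfer_time_s):
    """Each nested frame costs its own round trip on top of the
    shared transfer time, so latency multiplies with the frame count."""
    return frames * per_frame_latency_s + transfer_time_s
```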
Corrective actions and ideas
Idea 1: faster connections
– not possible -> thousands of internet customers
Idea 2: content nearer to the customer
– pictures partly on the client workstations -> security regulations partly prevent this
– content on servers near the customers (Content Delivery Network) – helps some but not enough
Idea 3: packaging the data
– helps some but not enough
Idea 4: application logic change -> fewer frames
– lots of costs
– requirements achieved
Errors and lessons learned
Internet users with slow connections and in different geographical areas
– can be an important user group
– the technical design failed for this group
Performance testing late in the development cycle
– too late
– did not simulate real usage well enough
Pilot users saved a lot
– the system was not yet widely used when the problems were seen
A solution was found (as usual)
– but fixing took much time and money