CSC 2720 Building Web Applications: Improving Web Application Performance
Introduction
To improve the performance of a web application could mean
To reduce latency
  i.e., to reduce the time delay between sending a request and receiving the corresponding response, or between sending a request and receiving all the page components, such as images, that are needed to render the page.
To serve as many concurrent requests as possible without failing or exceeding a response time limit
  Response time – the time a server takes to serve a request
PHP Performance Optimization
Obtaining good performance is not merely writing fast PHP scripts. High performance PHP requires a good understanding of the underlying hardware, the operating system and supporting software such as the web server and database.
  Source: http://phplens.com/lens/php-book/optimizing-debugging-php.php
Often involves trade-offs among CPU, storage, bandwidth and other resource requirements
Factors that affect the performance
Running out of resources
  Processors
  Memory
  Storage
  Network bandwidth
  Maximum number of database connections
Poorly designed database schema and queries
Poorly written PHP code
Too much disk access
Locating the Bottlenecks – Profiling
To measure the behavior of a server-side script as it executes, particularly the frequency and duration of function calls (Ref: Wikipedia)
Helps you detect which parts of your code are worth your attention (e.g., functions that are called the most, functions that take a long time to run)
Can be used to analyze the performance of a database indirectly by measuring the functions that interact with the database (e.g., the mysqli_* functions)
Example of a profiling tool
  XDebug: http://www.xdebug.org/
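A profiler automates the measurement described above. As a rough illustration only, the sketch below times a single function call by hand. The helper name time_call is hypothetical and is not part of XDebug; a real profiler also records call counts and per-function totals.

```php
<?php
// Hypothetical helper (not part of XDebug): run $fn once and report how long it took.
function time_call(callable $fn, ...$args): array
{
    $start = microtime(true);              // wall-clock time in seconds, as a float
    $result = $fn(...$args);
    $elapsed = microtime(true) - $start;
    return [$result, $elapsed];
}

// Usage: measure how long sorting 10,000 numbers takes.
$numbers = range(10000, 1);
[$sorted, $seconds] = time_call(function (array $a): array {
    sort($a);
    return $a;
}, $numbers);
printf("sort() took %.6f seconds\n", $seconds);
```

Timing one call like this is noisy; repeating the measurement and comparing averages gives a steadier picture.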
Locating the Bottlenecks – Stress Tests
Tests a web application for its robustness, availability, and error handling capabilities under a heavy load, particularly to ensure the software doesn't crash in conditions of insufficient computational resources (such as memory or disk space), unusually high concurrency, or denial of service attacks. (Ref: Wikipedia)
Use in conjunction with other tools to find out
  The maximum # of requests/users a server can handle before failing or slowing down significantly
  The bottleneck (i.e., which resource runs out first?)
  The average time a page and its components take to fully load
Locating the Bottlenecks – Stress Tests
Microsoft Web Application Stress Tool
  Can emulate HTTP requests (with parameters) from clients
  Can generate a large number of requests within a short period of time (stress test)
  Can help you gather info such as the average and the highest latency experienced from accessing a URL
Locating the Bottlenecks – System Info
Show the usage and availability of various system resources (e.g., CPU, memory, virtual memory, I/O). The usage can be the overall usage or usage by individual processes.
Use in conjunction with stress test tools to locate the bottleneck of a server
Examples of tools
  "Task Manager" on Windows
  The command "top" on most Unix/Linux systems
Improvement Methods
Small changes that could make a big difference.
1. PHP accelerators (Opcode compiler + Opcode cache)
2. Content caching
   Utilizing client's and proxy's cache
   Cache output
3. Server-Side web proxy
4. Compression
5. Connection pooling
Improvement Methods
Fine tuning your web applications, servers, and OS
6. Reducing number of HTTP requests
7. Query optimization
8. Optimizing PHP code
9. Additional methods that make a page load faster
10. Tuning the Server (Apache)
1. PHP Accelerators
Typically made up of an opcode compiler and an opcode cache
  Opcode compiler – compiles PHP code into opcode
  Opcode cache – keeps frequently used compiled PHP scripts in memory
A PHP accelerator can help reduce the response time of PHP scripts significantly because
  Executing cached opcode is faster than parsing and compiling the PHP source on every request
  Loading opcode from memory is faster than loading PHP scripts from disk
List of PHP accelerators: http://en.wikipedia.org/wiki/PHP_accelerator
2.1 Content Caching – Utilizing Client and Proxy Caches
Request clients/proxies to cache reusable components (e.g., images, scripts and stylesheets) in order to avoid retransmitting the same components
[Figure: Clients A and B reach the server through a shared proxy; the server hosts index.php, x.jpg and y.css.]
To illustrate, suppose
• Clients A and B share a proxy server.
• The HTML page generated by index.php needs x.jpg and y.css.
• Only x.jpg and y.css are cacheable.
The 1st time client A requests index.php from the server, all three files need to be transferred from the server.
After the request, x.jpg and y.css are cached in both client A's cache and the proxy's cache.
If client B accesses index.php after client A has accessed the same file, then client B could load the page faster because the proxy server only needs to retrieve the HTML content from the server.
In practice, there could be more than one proxy server between the clients and the server.
In subsequent requests, client A only needs to download the HTML page generated by index.php.
2.1 Content Caching – Utilizing Client and Proxy Caches
Use the Expires or Cache-Control header fields to tell the clients and proxies how a component should be cached
  e.g., When should the component be considered expired? Is the component cacheable?
Recommendations
  For static components: implement a "Never expire" policy by setting a far-future Expires header.
  For dynamic components: use an appropriate Cache-Control header to help clients with conditional requests.
2.1 Content Caching – Utilizing Client and Proxy Caches
For example
To indicate that a component expires at a fixed time and date
  Expires: Sat, 11 Apr 2009 20:00:00 GMT
To indicate that a component expires in one hour (relative to the access time) and that the client must revalidate the content with the server when the component becomes stale
  Cache-Control: max-age=3600, must-revalidate
To set the caching policy for static components, you can configure the web server.
  To set up an expiration policy for different files with Apache, see module mod_expires.
  To set up default header values for different files with Apache, see module mod_headers.
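For dynamic components, the same two headers can be sent from a PHP script. The sketch below is one possible way to do it. The helper name cache_headers is hypothetical, and passing the current time as a parameter is only a choice that makes the function easy to check.

```php
<?php
// Hypothetical helper: build the two caching headers for a given lifetime.
function cache_headers(int $maxAgeSeconds, int $now): array
{
    return [
        // Relative lifetime, then revalidate with the server once stale.
        'Cache-Control: max-age=' . $maxAgeSeconds . ', must-revalidate',
        // Absolute expiry time in the HTTP date format, always in GMT.
        'Expires: ' . gmdate('D, d M Y H:i:s', $now + $maxAgeSeconds) . ' GMT',
    ];
}

// Usage: headers must be sent before any page output.
if (!headers_sent()) {
    foreach (cache_headers(3600, time()) as $line) {
        header($line);
    }
}
```

When both headers are present, HTTP/1.1 caches give Cache-Control: max-age precedence over Expires.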
2.1 Content Caching – Utilizing Client and Proxy Caches
References and Reading Materials
Caching Tutorial (contains specific info about how, and how not, to cache) http://www.mnot.net/cache_docs/
Working with cached pages in PHP http://www.badpenguin.org/docs/php-cache.html
HTTP conditional requests in PHP http://alexandre.alapetite.net/doc-alex/php-http-304/index.en.html
Use Server Cache Control to Improve Performance http://www.websiteoptimization.com/speed/tweak/cache/
2.2. Content Caching – Reusing Generated Output
If a script only generates new content periodically, cache the generated output to avoid executing code and querying the database for every request.
Examples of cacheable output
  List of high scores for an online game
  List of products on an e-commerce website in which the products are updated daily
PHP examples about output caching
  Caching output in PHP http://www.addedbytes.com/php/caching-output-in-php/
  Output Caching with PHP http://www.devshed.com/c/a/PHP/Output-Caching-with-PHP/
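A minimal sketch of the idea, assuming the generated output can safely be shared by all visitors for a fixed number of seconds. The function name cached_output, the cache file path and the sample markup are all hypothetical; the linked articles describe fuller approaches.

```php
<?php
// Hypothetical helper: return the cached output if it is fresh, else regenerate it.
function cached_output(string $cacheFile, int $ttlSeconds, callable $render): string
{
    clearstatcache(true, $cacheFile);        // avoid stale file metadata
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $ttlSeconds) {
        return file_get_contents($cacheFile); // cache hit: $render is not executed
    }
    ob_start();                               // capture everything $render echoes
    $render();
    $output = ob_get_clean();
    file_put_contents($cacheFile, $output, LOCK_EX);
    return $output;
}

// Usage: the high-score list is rebuilt at most once every 60 seconds.
$cacheFile = sys_get_temp_dir() . '/high_scores.cache.html'; // hypothetical path
echo cached_output($cacheFile, 60, function (): void {
    // Stand-in for the code that would query the database.
    echo "<ol><li>ann: 95</li><li>bob: 82</li></ol>\n";
});
```

The trade-off is staleness: visitors may see content up to the chosen lifetime old.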
3. Server-Side Web Proxy
Use a web proxy at the server side to relieve the web server from serving frequently requested static files
An example of a web proxy: Squid http://www.squid-cache.org/
4. Compression
Reduce the data size before transmitting
1. (Online) Use HTTP Compression – Compress textual data on the fly before sending it to a client
   Can typically reduce the size of textual data by 70%
2. (Offline) Use compression tools to reduce the file size of JavaScript, CSS, images, video, etc.
   The compression tools must not change the file format or the content of these files. Otherwise the files can no longer be referenced from HTML files.
   e.g., use optipng for PNGs, gifsicle for GIFs and jpegoptim for JPGs
4.1. Compression – HTTP Compression
A publicly defined way to compress textual content transferred from web servers to browsers
Compression is done at the server.
Built into HTTP 1.1 and is supported by most browsers
Drawback: Takes time and CPU cycles to compress
Ref: http://www.websiteoptimization.com/speed/tweak/compress/
Using HTTP Compression in PHP
Configure php.ini to enable automatic HTTP compression
  zlib runtime configuration (http://hk2.php.net/manual/en/zlib.configuration.php)
Perform HTTP compression in PHP scripts programmatically
  Examples of using ob_gzhandler (http://hk2.php.net/ob_gzhandler)
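To get a feel for the saving, the sketch below compresses a repetitive piece of markup with gzencode() from the zlib extension and prints both sizes. The sample markup is made up, and the actual ratio depends on the content. The guarded ob_start('ob_gzhandler') call at the end shows how the programmatic option is typically switched on in a page.

```php
<?php
// Repetitive markup, typical of generated HTML, compresses well.
$html = str_repeat("<li class=\"item\">Hello, world</li>\n", 200);

// gzencode() produces gzip data, one of the encodings used by HTTP compression.
$gzipped = gzencode($html, 6);

printf("original: %d bytes, gzipped: %d bytes\n", strlen($html), strlen($gzipped));

// In a real page (not on the command line), one call before any output lets PHP
// compress the response when the browser's Accept-Encoding header allows it.
// It is skipped if php.ini already enables zlib.output_compression.
if (PHP_SAPI !== 'cli' && !ini_get('zlib.output_compression')) {
    ob_start('ob_gzhandler');
}
```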
5. Connection Pooling (Why?)
A database connection incurs overhead – it requires resources to create the connection, authenticate it, maintain it, and then release it when it is no longer required.
The overhead is particularly high for Web-based applications.
  A server-side script typically opens a connection, performs a few queries, and then closes the connection.
  Often, more effort is spent connecting and disconnecting than is spent during the interactions themselves.
Ref: IBM WebSphere App. Server – What is Connection Pooling?
5. Connection Pooling
A connection pool is a cache of open database connections.
When a script needs to establish a connection to the database, a connection is selected from the pool if one is available. Otherwise a new connection is created.
When a script closes the connection, the connection is not actually closed but returned to the pool so that the connection can be reused by other scripts.
Note: Implementing a connection pool is not easy. Usually we just use it if it is available.
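The acquire/release behavior described above can be sketched in a few lines. This is an illustration only: the class name SimplePool is hypothetical, a stdClass object stands in for a real database connection, and a real pool must also live outside a single PHP request and handle broken connections, size limits and concurrent access.

```php
<?php
// Hypothetical, illustration-only pool.
final class SimplePool
{
    /** @var array<int, object> idle connections ready for reuse */
    private array $idle = [];

    /** @var callable creates a new connection when the pool is empty */
    private $factory;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function acquire(): object
    {
        // Reuse an idle connection if one is available; otherwise create one.
        return array_pop($this->idle) ?? ($this->factory)();
    }

    public function release(object $connection): void
    {
        // "Closing" only returns the connection to the pool.
        $this->idle[] = $connection;
    }
}

// Usage: the second acquire() reuses the connection released by the first user.
$created = 0;
$pool = new SimplePool(function () use (&$created): object {
    $created++;
    return new stdClass();   // stand-in for a real database connection
});
$first = $pool->acquire();
$pool->release($first);
$second = $pool->acquire();
printf("connections created: %d\n", $created);
```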
5.1. Connection Pooling in PHP
PHP's Persistent Connection
  Use mysql_pconnect() to open a persistent database connection (the old mysql extension was removed in PHP 7.0).
  The MySQL Improved (mysqli) extension had no equivalent before PHP 5.3; from 5.3 on, prefixing the host name with "p:" requests a persistent connection.
  Must be used with care because changes made to the database state, such as setting autocommit to "off", will affect the next script that uses the connection.
Other connection pooling solutions:
  SQL Relay: http://sqlrelay.sourceforge.net/index.html
  Apache Module mod_dbd: http://httpd.apache.org/docs/2.1/mod/mod_dbd.html
6. Reducing # of HTTP Requests (Why?)
A large portion of the total response time to create a fully rendered page is spent on downloading the page components like images, stylesheets, JavaScript, etc.
Some browsers only allow at most two concurrent requests per server. That means the page components have to take turns to load.
Ref: http://developer.yahoo.com/performance/rules.html
6. Reducing # of HTTP Requests
Reducing # of components → reducing # of HTTP requests → page loads faster
Methods to reduce the # of page components
  Combine multiple stylesheets into one
  Combine multiple scripts into one
  CSS Sprites – Tile multiple images into one image and then use CSS to clip the needed image from the combined image.
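Combining stylesheets or scripts can be done once at deployment time. The sketch below shows one simple way; the function name combine_files and the file names in the usage comment are hypothetical. It only concatenates files and does not minify them.

```php
<?php
// Hypothetical helper: concatenate several text assets into one string,
// labelling each part with the name of the file it came from.
function combine_files(array $paths): string
{
    $combined = '';
    foreach ($paths as $path) {
        $content = file_get_contents($path);
        if ($content === false) {
            throw new RuntimeException("Cannot read $path");
        }
        $combined .= '/* ' . basename($path) . " */\n" . rtrim($content) . "\n";
    }
    return $combined;
}

// Usage (hypothetical file names), run once when the site is deployed:
// file_put_contents('all.css', combine_files(['reset.css', 'layout.css', 'theme.css']));
```

The order of the list matters: later stylesheet rules override earlier ones, and scripts may depend on code defined in earlier files.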
7. Query Optimization
Tune DB Schema
  The first three normal forms help ensure data integrity
  Denormalization – a process that attempts to optimize the performance of a database by adding redundant data or by grouping data (but makes maintaining data integrity difficult).
Query only what you really need
  e.g., instead of using "SELECT *", select only the columns you need and use LIMIT to limit the number of rows retrieved from the DB
Make use of indexes to improve the performance of data retrieval
Take a database course …
Ref: How to Optimize Queries (Theory and Practice) http://www.serverwatch.com/tutorials/article.php/2175621
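The "query only what you need" advice can be sketched as follows. Everything here is an assumption made for illustration: the scores table, its columns and the index name are invented, and an in-memory SQLite database (through PDO) stands in for MySQL so that the example needs no server. The demo part runs only if the pdo_sqlite extension is loaded.

```php
<?php
// Fetch only the two columns the page shows, and only the rows it shows.
function top_scores(PDO $db, int $limit): array
{
    $stmt = $db->prepare(
        'SELECT player, score FROM scores ORDER BY score DESC LIMIT :n'
    );
    $stmt->bindValue(':n', $limit, PDO::PARAM_INT);
    $stmt->execute();
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

if (extension_loaded('pdo_sqlite')) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    // Hypothetical schema; the notes column is never needed by the page.
    $db->exec('CREATE TABLE scores (id INTEGER PRIMARY KEY, player TEXT, score INTEGER, notes TEXT)');
    // An index on the column used for sorting helps the database find the top rows.
    $db->exec('CREATE INDEX idx_scores_score ON scores (score)');
    $insert = $db->prepare('INSERT INTO scores (player, score, notes) VALUES (?, ?, ?)');
    foreach ([['ann', 70], ['bob', 95], ['cy', 82]] as [$player, $score]) {
        $insert->execute([$player, $score, 'not used by the page']);
    }
    print_r(top_scores($db, 2));
}
```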
8. Optimizing PHP Code
Make use of the output buffer
See PHP Output Buffering Control: http://hk2.php.net/ob_start
<?php
ob_start(); // Start output buffering.
            // All output is kept in memory instead
            // of being sent to the client.
?>
<html>
<head><title>Foo</title></head>
<body>
<?php echo "Blah Blah Blah"; ?>
</body>
</html>
<?php
ob_end_flush(); // Flush everything in the output buffer
                // to the client at once.
?>
8. Optimizing PHP Code
echo $str1, " ", $str2, " ", $str3;
executes faster than
echo $str1 . " " . $str2 . " " . $str3;
Note: This only works with echo, a language construct (not a function) that accepts several comma-separated arguments.
If you have CPU-intensive tasks to perform, consider implementing them as C extensions.
Use the predefined functions (as opposed to writing your own functions) whenever possible
8. Optimizing PHP Code
Instead of writing
  for ($i = 0; $i < count($array); $i++)
    …
use a variable to store the array size and rewrite the loop as
  $array_size = count($array);
  for ($i = 0; $i < $array_size; $i++)
    …
More tips about optimizing PHP code can be found at http://reinholdweber.com/?p=3
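The two loop forms can be compared with a small script like the one below. The function names are hypothetical. count() on a PHP array is a constant-time operation, so the saving comes only from avoiding the repeated call, and the measured difference varies by PHP version and machine.

```php
<?php
// Sum an array, re-evaluating count() on every iteration.
function sum_count_each_time(array $array): int
{
    $sum = 0;
    for ($i = 0; $i < count($array); $i++) {
        $sum += $array[$i];
    }
    return $sum;
}

// Same loop with the array size computed once.
function sum_count_once(array $array): int
{
    $sum = 0;
    $array_size = count($array);
    for ($i = 0; $i < $array_size; $i++) {
        $sum += $array[$i];
    }
    return $sum;
}

// Time both versions on the same data.
$data = range(1, 10000);
foreach (['sum_count_each_time', 'sum_count_once'] as $fn) {
    $start = microtime(true);
    $total = $fn($data);
    printf("%-20s total=%d in %.6f s\n", $fn, $total, microtime(true) - $start);
}
```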
9. Additional Methods that Make a Page Load Faster
Post-load Components
  Load the less important components in the background
Preload Components
  Anticipate what components are needed in the future and pre-load them (i.e., utilize the browser's idle time)
Split Components Across Domains
  Maximize parallel downloads (a browser may only issue a few HTTP requests in parallel to the same server)
  Make sure you're using not more than 2-4 domains because of the DNS lookup penalty.
Ref: Best Practices for Speeding Up Your Web Site (http://developer.yahoo.com/performance/rules.html)
10. Tuning the Server (Apache)
SendBufferSize – Size of output buffer
MaxClients – Maximum # of clients
StartServers – The number of child processes to create at start up
MinSpareServers, MaxSpareServers – The number of idle child processes to keep alive
Keep-alive – tells the server to reuse the same socket connection for multiple HTTP requests to reduce the overhead of frequent connects
Source: http://phplens.com/lens/php-book/optimizing-debugging-php.php
Scaling
To improve the performance by introducing more machines (to host more servers)
Server Clusters
Database Replication
  Improve performance or availability of the whole database system
  MySQL Replication (http://dev.mysql.com/doc/refman/5.0/en/replication.html)
References
Best Practices for Speeding Up Your Web Site http://developer.yahoo.com/performance/rules.html
Performance Research, Part 1: What the 80/20 Rule Tells Us about Reducing HTTP Requests http://yuiblog.com/blog/2006/11/28/performance-research-part-1/
Performance Research, Part 2: Browser Cache Usage - Exposed! http://yuiblog.com/blog/2007/01/04/performance-research-part-2/
Practical PHP Performance http://www.developertutorials.com/tutorials/php/practical-php-performance-8-02-07/page3.html