-
8/9/2019 Programming for Performance
1/24
Programming for Performance
Alosh Bennett
-
What is performance?
Can be expressed in terms of
Computational power
Memory footprint
Is perceived as
Responsiveness of the system
Throughput
Scalability
Startup time
Other factors
Reliability of the system
Availability
Overall feel of the system, its quickness and responsiveness.
-
How to build a performing system?
What are you building?
Web Application
Application should load fast
Faster request turn-arounds
Minimal net traffic
Reduce number of server calls
Asynchronous updates
Database Application
Fast updates and queries
Schema design
Indexing
Normalizing
Caching results
Connection pooling
There is no generic formula to build a performing application.
-
Building an Application - stages
Requirements Gathering
Scope of the application
What it is and what it is not.
Real world estimation of usage
Application Architecture
Blueprint of the project
Identify the technologies
Components of the application
Interaction
Pseudocode
Logic of individual components
Algorithms and Data Structures
Coding
Coding standards
Good practices
When should we start thinking about performance?
-
Requirements Gathering
Scope of the project
Music player with media library
Manage up to 5000 songs in the library.
Play up to 5 songs simultaneously.
Performance Benchmark
Requires 1.8 GHz processor, 50 MB RAM, 20 MB hard disk space
Startup under 5 seconds
What the application is not
Site to show top events of the day.
It will not show real-time data
Data would be fetched only once every few hours
Real world usage scenario
Web application - How many concurrent users?
Text editor - How big a file can it handle?
Do one thing and do it well.
-
Real-Time update vs Periodic update
[Figure: two diagrams.
Real-time update: Master Server, Crawlers (x3), Real-Time Updates, Client Requests.
Periodic update: Master Server, Crawlers (x3), Real-Time Updates, Clone Servers (x3), Periodic Synch.]
-
Application Architecture - Right tool for the right job
Blog
JSP or PHP
Online Transaction site
Industry-grade server: Apache, GlassFish, WebLogic
J2EE or similar framework
Database: MySQL, Oracle, PostgreSQL
Rich content website
JavaFX, Flex, HTML5
Avoid applets
Task Automation
Scripting languages: Python, Perl, shell
Avoid Java, C
Concurrent processing with multiple processors
Scala over Java
Mathematical modeling, computation intensive
Functional programming over Object Oriented
XML parsing in Java
DOM Parser
SAX Parser
StAX Parser
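The choice among the three XML parsers can be illustrated with a minimal StAX sketch. It streams through the document and counts elements without building a tree in memory, which is where it differs from DOM. The class name `StaxElementCounter` and its method are hypothetical names chosen for this illustration; only the standard `javax.xml.stream` API is used.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxElementCounter {
    // Counts start tags with the given local name while streaming through the XML.
    static int countElements(String xml, String name) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            try {
                while (reader.hasNext()) {
                    // next() pulls one event at a time; nothing is kept in memory.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(name)) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML", e);
        }
    }
}
```

A pull parser like this tends to suit large documents that are read once; DOM remains convenient when the whole tree has to be navigated or modified.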
-
Application Architecture - Harness the processing power
Single threaded vs Multi threaded
Picasa tool to upload pictures
Single upload takes 5 seconds
Single thread
10 pictures take 50 seconds
Multithreaded (5 workers)
10 pictures take 10 seconds
Identifying parallel tasks
Localize - Break the application into independent units
Parallelize - Execute the units in parallel
Picasa tool to resize and upload
Resize takes 5 seconds, upload takes 5 seconds
Single composite task (5 resize&upload workers)
10 pictures in 20 seconds
Two independent tasks (5 resize workers, 5 upload workers)
10 pictures in 15 seconds
20 pictures in 25 seconds
Many hands make light work
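The upload timings above follow from simple arithmetic: with w workers and n pictures that each take t seconds, the total is roughly ceil(n / w) * t, so 10 pictures on 5 workers take 2 * 5 = 10 seconds. A minimal sketch of the multithreaded upload with a fixed thread pool follows. `ParallelUpload` and its `upload` method are hypothetical stand-ins, and the upload is simulated with a sleep rather than a real network call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelUpload {
    // Hypothetical stand-in for a real upload; sleeps to simulate network time.
    static String upload(String picture, long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return picture + ":uploaded";
    }

    // Uploads all pictures using a fixed number of worker threads.
    static List<String> uploadAll(List<String> pictures, int workers, long millisPerUpload) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String picture : pictures) {
                Callable<String> task = () -> upload(picture, millisPerUpload);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // waits for that upload to finish
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Results come back in submission order because the futures are collected in that order, even though the uploads themselves overlap.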
-
Image Resize and Upload - Parallel tasks
-
Application Architecture - Don't repeat the effort
Effective use of caching
Cache results that are costly to re-compute
Used effectively, improves the performance
Eg. Currency conversion rates in a Forex calculator
Avoid excessive caching
Monitor cache hit/miss ratio
Avoid caching user information in an online app
Take care of synchronization
Apache JCS, Oracle Coherence
Pool costly resources
Re-use resources that are costly to build
After use, check them into the pool instead of discarding
Connection
Costly to establish
Re-usable across users
Clean the resource before checking into the pool
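The caching advice above, including watching the hit/miss ratio, could be sketched as a small bounded cache. `MonitoredCache` is a hypothetical class written for this illustration and is not the API of Apache JCS or Oracle Coherence. It assumes a least-recently-used eviction policy, and the loader stands for the costly computation, such as fetching a conversion rate.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class MonitoredCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;
    private long hits;
    private long misses;

    MonitoredCache(int maxEntries, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered by last use, so the eldest
        // entry is the least recently used one and is dropped when over capacity.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Synchronized so that several threads can share one cache safely.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.apply(key); // the costly computation happens only on a miss
        entries.put(key, value);
        return value;
    }

    synchronized double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

A persistently low hit ratio suggests the cache costs memory and synchronization without saving much work.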
-
Application Architecture - Keep a watch on the traffic
Multi-threaded model
Threads work on the same data
The data is not transferred between workers
Ideal when the job at hand involves huge data
Eg.
Windows registry
All processes work on the same registry
Different processes read/update different parts of the registry
Data driven model
Data is transferred into the workers' queues
Independent chunks of data
Data size is small
Eg.
Order acceptance system
Order sent to worker to see item availability
Sent to next worker to process payment
Sent to next worker to confirm order
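The order acceptance example can be sketched as a chain of workers connected by queues. This is a simplified illustration with one thread per stage: `OrderPipeline`, the stage tags and the `DONE` sentinel are all hypothetical, and each stage only tags the order instead of doing real availability, payment or confirmation work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class OrderPipeline {
    private static final String DONE = "__done__"; // sentinel that shuts each stage down

    // Starts a worker that moves orders from `in` to `out`, tagging each with its stage.
    static Thread stage(String tag, BlockingQueue<String> in, BlockingQueue<String> out) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String order = in.take();
                    if (order.equals(DONE)) {
                        out.put(DONE); // pass the shutdown signal downstream
                        return;
                    }
                    out.put(order + ">" + tag);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        return worker;
    }

    // Sends every order through availability, payment and confirmation stages.
    static List<String> process(List<String> orders) {
        BlockingQueue<String> accepted = new LinkedBlockingQueue<>();
        BlockingQueue<String> available = new LinkedBlockingQueue<>();
        BlockingQueue<String> paid = new LinkedBlockingQueue<>();
        BlockingQueue<String> confirmed = new LinkedBlockingQueue<>();
        stage("available", accepted, available);
        stage("paid", available, paid);
        stage("confirmed", paid, confirmed);
        List<String> results = new ArrayList<>();
        try {
            for (String order : orders) {
                accepted.put(order);
            }
            accepted.put(DONE);
            for (String o = confirmed.take(); !o.equals(DONE); o = confirmed.take()) {
                results.add(o);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }
}
```

Only the small order record travels between workers, which is what keeps the data driven model cheap when the chunks are independent.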
-
Application Architecture - Buffer data bottlenecks
Input/Output
File systems and other IO are slow
Not good at reading/writing a byte at a time
Read a chunk of data and pass it to application one byte at a time
Network
Sending/fetching data across network is slow and unreliable
Take YouTube for example
Player doesn't fetch a frame at a time and show it to the user
Keep reading over the network whether the video is playing or paused
Write the frames into a buffer
Player reads from the buffer and plays the video
Always buffer slow and unreliable peripherals
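The "read a chunk, hand it out a byte at a time" idea is what `java.io.BufferedInputStream` does. The sketch below counts how many reads reach the underlying source with and without a buffer. `BufferDemo` and `CountingStream` are hypothetical helper names, and an in-memory byte array stands in for a slow file or socket.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class BufferDemo {
    // Counts how many read calls reach the underlying (slow) source.
    static class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Consumes the data one byte at a time and reports the number of source reads.
    static int sourceReads(byte[] data, boolean buffered) {
        CountingStream source = new CountingStream(new ByteArrayInputStream(data));
        try (InputStream in = buffered ? new BufferedInputStream(source, 8192) : source) {
            while (in.read() != -1) {
                // the application still sees single bytes
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }
}
```

For 10,000 bytes the unbuffered loop touches the source once per byte, while the buffered one needs only a handful of chunk reads.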
-
Application Architecture - Bulk Action
Bulk action is always cheaper than repeating it for each set of data
Common overhead is spread across the dataset
Uploading photos to Picasa
Authenticate user credentials and login
Establish a connection
Upload a picture
Close the connection
For uploading 10 pictures, you wouldn't repeat the four steps 10 times
Bulk upload
Authenticate user credentials and login
Establish a connection
Upload first picture
Upload second picture
Upload last picture
Close the connection
Always ask for the bulk discount.
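The saving can be made concrete by counting steps. In the sketch below, `Session` is a hypothetical object that only counts the operations it is asked to perform; it is not a real Picasa client. Uploading 10 pictures one by one costs 4 * 10 = 40 steps, while the bulk version costs 2 + 10 + 1 = 13.

```java
import java.util.List;

class BulkUpload {
    // Hypothetical session that only counts the steps it performs.
    static class Session {
        int steps;

        void login() { steps++; }
        void connect() { steps++; }
        void upload(String picture) { steps++; }
        void close() { steps++; }
    }

    // Repeats login, connect, upload and close for every picture.
    static int oneByOne(List<String> pictures) {
        Session session = new Session();
        for (String picture : pictures) {
            session.login();
            session.connect();
            session.upload(picture);
            session.close();
        }
        return session.steps;
    }

    // Pays the login, connect and close overhead once for the whole batch.
    static int bulk(List<String> pictures) {
        Session session = new Session();
        session.login();
        session.connect();
        for (String picture : pictures) {
            session.upload(picture);
        }
        session.close();
        return session.steps;
    }
}
```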
-
Pseudocode and coding
Select the correct algorithm
The factor that can cause the most dramatic change in performance
How to compute the sum of all integers between m and n?
Fast Inverse Square Root
Newton's Method
x1 = x0 - f(x0) / f'(x0)
An algorithm specific to the problem performs better than a generic algorithm
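Both examples on this slide can be sketched briefly. The sum of the integers from m to n is an arithmetic series, (n - m + 1)(m + n) / 2, which replaces a loop with a single expression. Newton's method applied to f(x) = x^2 - a gives a square root iteration. The class and method names are hypothetical; overflow for very large ranges is ignored, and `sqrtNewton` assumes a non-negative input.

```java
class AlgorithmChoice {
    // O(n - m): adds every integer in [m, n].
    static long sumLoop(long m, long n) {
        long sum = 0;
        for (long i = m; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // O(1): arithmetic series, (number of terms) * (first + last) / 2.
    static long sumFormula(long m, long n) {
        if (m > n) {
            return 0;
        }
        return (n - m + 1) * (m + n) / 2;
    }

    // Newton's method on f(x) = x^2 - a, so x1 = x0 - (x0^2 - a) / (2 * x0).
    // Assumes a >= 0.
    static double sqrtNewton(double a, int iterations) {
        if (a == 0) {
            return 0;
        }
        double x = a > 1 ? a : 1; // starting guess at or above the root
        for (int i = 0; i < iterations; i++) {
            x = x - (x * x - a) / (2 * x);
        }
        return x;
    }
}
```

For m = 1 and n = 100 both sum methods return 5050, but the formula does the same amount of work however far apart m and n are.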
-
Comparison of common sorting algorithms
Avoid using bubble and selection sorts
Use insertion sort when the dataset is small
Java's library sorts are based on quicksort (for primitives) and merge sort (for objects)
Use Arrays.sort() method
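A minimal usage sketch of `Arrays.sort()` for a primitive array and for an object array with a comparator follows. `SortDemo` and its methods are hypothetical names; the inputs are copied first so the caller's arrays stay untouched.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortDemo {
    // Sorts a copy of a primitive array in ascending order.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.sort(copy);
        return copy;
    }

    // Sorts a copy of an object array by string length; the object sort is stable,
    // so equal-length words keep their original relative order.
    static String[] byLength(String[] words) {
        String[] copy = Arrays.copyOf(words, words.length);
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }
}
```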
-
Data structures
Arrays
Easy to loop
Random access of elements
Insert and delete in the middle is difficult
Good at searching when sorted: O(log n) with binary search
Linked Lists
Easy to loop
Random access of elements is not possible
Insert and delete in the middle is easy
Bad at search: O(n)
Binary Search Trees
Easy to loop
Inserting an element is O(log n) on average, degrading to O(n) when unbalanced
Searching is O(log n) on average, degrading to O(n) when unbalanced
Self-balancing structures like Red-Black trees have search and insert times of O(log n)
-
Java Collection framework
Has a collection of useful data structures
Collection
Set - Collection of elements without duplicates
HashSet
O(1) retrieval
TreeSet
Sorted set
O(log n) retrieval
LinkedHashSet
Items maintained in the order of insert
List - Collection of elements with duplicates possible
ArrayList
Array based, random access of elements is easy
LinkedList
Insert and delete at an already-located position is a constant time operation
Map - Key-value pairs
HashMap
Very good at insert, delete and retrieval
TreeMap
Supports traversal in the sorted order of keys
LinkedHashMap
The traversal is in the order of insert into the map
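The ordering differences between the implementations are easiest to see by loading the same items into each one. The sketch below does this for the three Set types; `SetOrderDemo` is a hypothetical class name, and the Map variants behave analogously for their keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class SetOrderDemo {
    // TreeSet drops duplicates and iterates in sorted order.
    static List<String> inSortedOrder(List<String> items) {
        return new ArrayList<>(new TreeSet<>(items));
    }

    // LinkedHashSet drops duplicates and iterates in first-insertion order.
    static List<String> inInsertionOrder(List<String> items) {
        return new ArrayList<>(new LinkedHashSet<>(items));
    }

    // HashSet drops duplicates but promises no particular iteration order.
    static int distinctCount(List<String> items) {
        return new HashSet<>(items).size();
    }
}
```

Given the items pear, apple, pear, fig, the sorted view is apple, fig, pear and the insertion-order view is pear, apple, fig.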
-
Java Collection framework contd.
The structures are not thread safe, in order to eliminate the synchronization overhead
Methods to get read-only versions of the collection
Methods to get synchronized versions of the collection
Other historical collections
Arrays
Vectors
Resizable arrays
Synchronized
Hashtable
Older version of Map
Synchronized
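The read-only and synchronized versions mentioned above are obtained through wrapper methods on `java.util.Collections`. A minimal sketch, with `WrapperDemo` and its method names as hypothetical illustrations:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class WrapperDemo {
    // A read-only view rejects modification attempts.
    static boolean readOnlyRejectsWrites() {
        List<String> songs = new ArrayList<>();
        songs.add("track-1");
        List<String> readOnly = Collections.unmodifiableList(songs);
        try {
            readOnly.add("track-2");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    // A synchronized wrapper lets several threads add to one list safely.
    static int concurrentAdds(int threads, int addsPerThread) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) {
                    shared.add(i);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.size();
    }
}
```

The wrapper adds the locking only where it is asked for, so single-threaded code keeps using the plain, unsynchronized collection.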
-
Keeping memory consumption low
As the memory heap fills up, garbage collection becomes more frequent
If there is no more memory to recover, the application crashes
Reduce number of objects created
Any operation on an immutable object could result in another object creation
Eg. Strings
Reduce the scope of the objects
Scrutinize class level objects
Don't store references in long-lived objects
Avoid loading unnecessary classes
Avoid static linking to rarely used heavy libraries
Use java -verbose:class to see the classes loaded
Collapse smaller classes and anonymous classes into a single class
Multi-threaded application instead of multiple launches of the same application
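The String example can be sketched as follows. Because `String` is immutable, each `+=` in a loop produces a new object, whereas a `StringBuilder` reuses one mutable buffer. `StringBuilding` and its methods are hypothetical names for this illustration.

```java
class StringBuilding {
    // Each += creates a new String, because String is immutable.
    static String joinWithConcat(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // One mutable buffer is reused for every append.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }
}
```

Both methods return the same text; the difference is the number of short-lived objects left for the garbage collector.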
-
Reduce data traffic
Serialization with caution
Serialization is a great way to persist and recover states
Loading serialized state could be faster than recreating the state
Control the fields you want to persist by using the transient keyword
Use of XML
XML is a great tool to exchange information in a platform neutral manner
XML takes considerable bandwidth on the wire
Avoid unnecessary conversion of object to XML and back
Use the right parser
DOM vs SAX vs StAX
Evaluate other formats like JSON
Logging
Excessive logging is trouble
Never log to System.err or System.out
Use logging frameworks
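For the serialization point, fields marked `transient` are left out of the serialized form, which keeps bulky or recomputable data off the wire. A minimal round-trip sketch follows; `PlayerState` and its fields are hypothetical, and the bytes are kept in memory rather than written to a file or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationDemo {
    static class PlayerState implements Serializable {
        private static final long serialVersionUID = 1L;

        String currentTrack;           // persisted
        transient byte[] decodedAudio; // skipped when the object is serialized

        PlayerState(String currentTrack, byte[] decodedAudio) {
            this.currentTrack = currentTrack;
            this.decodedAudio = decodedAudio;
        }
    }

    // Serializes the state to bytes and reads it back.
    static PlayerState roundTrip(PlayerState state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(state);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PlayerState) in.readObject();
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

After the round trip the track name is restored, while the transient field comes back as null and has to be rebuilt on demand.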
-
Responsive Application
Start up quickly with only the required resources
In a media player application, start the application by fetching only the track lists
Lazy loading of costly resources
Fetch album art in the background
Always let the user know
Be interactive
While the album art is loading, display a message
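The album art example could be sketched with a `CompletableFuture`: the fetch starts in the background on first use, and the UI shows a message until the result arrives. `LazyAlbumArt`, the loader and the placeholder text are hypothetical, and the art is represented by a plain string for brevity.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

class LazyAlbumArt {
    private final Supplier<String> loader;
    private CompletableFuture<String> pending;

    LazyAlbumArt(Supplier<String> loader) {
        this.loader = loader;
    }

    // Starts the background fetch on the first call; later calls reuse it.
    synchronized CompletableFuture<String> load() {
        if (pending == null) {
            pending = CompletableFuture.supplyAsync(loader);
        }
        return pending;
    }

    // Returns the art if it has arrived, otherwise a message for the user.
    String display() {
        return load().getNow("Loading album art...");
    }
}
```

Nothing is fetched at start-up, and the caller never blocks: `display()` always returns immediately with either the art or the message.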
-
Debugging tools
Benchmarking
Measurement of memory, time and CPU usage of the application
Compare benchmarks of different approaches
Profiling
Profiling tells more about your code execution paths
What methods are called often?
What methods are using the largest percentage of time?
What methods are calling the most-used methods?
What methods are allocating a lot of memory?
Profiling tools
VisualVM
NetBeans Profiler
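Before reaching for a profiler, a rough benchmark of time and memory can be taken with `System.nanoTime()` and `Runtime`. The sketch below is only a coarse illustration: `RoughBenchmark` is a hypothetical helper, and JIT warm-up and garbage collection make such numbers noisy, so they are best used to compare approaches rather than as absolute figures.

```java
class RoughBenchmark {
    // Returns the fastest of `runs` timings in nanoseconds, after `warmups` untimed runs.
    static long bestNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // let the JIT compile the hot path before timing
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    // Approximate heap currently in use, in bytes.
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }
}
```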
-
Graceful Degradation
How should your application behave when the load is too much to handle?
System should never become completely useless
System should never crash
The application could refuse to take new requests and display a message
In certain cases, it's possible to degrade the quality of the results and still keep up the response time
Eg. Search engines
Voice transmission over the network
In places where accuracy is crucial, this is not possible
Scientific modeling
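Refusing new requests under overload can be sketched with a `Semaphore` that caps the number of requests in flight. `LoadShedder`, the limit and the busy message are hypothetical; a real system would choose the limit from measurement.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class LoadShedder {
    private final Semaphore slots;

    LoadShedder(int maxConcurrent) {
        this.slots = new Semaphore(maxConcurrent);
    }

    // Runs the request if a slot is free; otherwise answers at once with a message.
    String handle(Supplier<String> request) {
        if (!slots.tryAcquire()) {
            return "Server busy, please retry later";
        }
        try {
            return request.get();
        } finally {
            slots.release(); // free the slot even if the request fails
        }
    }
}
```

Requests over the limit get a quick, clear refusal instead of queueing until the whole system slows down or crashes.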
-
References
http://java.sun.com/docs/books/performance/1st_edition/html/JPTOC.fm.html
http://www.javaperformancetuning.com/tips/index.shtml
http://en.wikipedia.org/wiki/Sorting_algorithm
http://java.sun.com/developer/onlineTraining/collections/Collection.html
Thank You