programming for performance

Upload: aloshbennett

Post on 29-May-2018

  • 8/9/2019 Programming for Performance

    1/24

    Programming for Performance

    Alosh Bennett

    What is performance?

    Can be expressed in terms of

    Computational Power

    Memory footprint

    Is perceived as

    Responsiveness of the system

    Throughput

    Scalability

    Startup time

    Other factors

    Reliability of the system

    Availability

    Overall feel of the system, its quickness and responsiveness.

    How to build a performing system?

    What are you building?

    Web Application

    Application should load fast

    Faster request turn-arounds

    Minimal net traffic

    Reduce number of server calls

    Asynchronous updates

    Database Application

    Fast updates and queries

    Schema design

    Indexing

    Normalizing

    Caching results

    Connection pooling

    There is no generic formula to build a performing application.

    Building an Application - stages

    Requirements Gathering

    Scope of the application

    What it is and what it is not.

    Real world estimation of usage

    Application Architecture

    Blueprint of the project

    Identify the technologies

    Components of the application

    Interaction

    Pseudocode

    Logic of individual components

    Algorithms and Data Structures

    Coding

    Coding standards

    Good practices

    When should we start thinking about performance?

    Requirements Gathering

    Scope of the project

    Music player with media library

    Manage up to 5000 songs in the library.

    Play up to 5 songs simultaneously.

    Performance Benchmark

    Requires 1.8 GHz processor, 50 MB RAM, 20 MB hard disk space

    Startup under 5 seconds

    What the application is not

    Site to show top events of the day.

    It will not show real-time data

    Data would be fetched only once every few hours

    Real world usage scenario

    Web application: How many concurrent users?

    Text editor: How big a file can it handle?

    Do one thing and do it well.

    Real-Time update vs Periodic update

    [Figure: Real-time update: Master Server, three Crawlers (real-time updates), Client requests. Periodic update: Master Server, three Crawlers (real-time updates), three Clone Servers (periodic synch).]

    Application Architecture - Right tool for the right job

    Blog

    JSP or PHP

    Online Transaction site

    Industry-grade server: Apache, GlassFish, WebLogic

    J2EE or similar framework

    Database: MySQL, Oracle, PostgreSQL

    Rich content website

    JavaFX, Flex, HTML5

    Avoid applets

    Task Automation

    Scripting languages: Python, Perl, shell

    Avoid Java, C

    Concurrent processing with multiple processors

    Scala over Java

    Mathematical modeling - computation intensive

    Functional programming over Object Oriented

    XML parsing in Java

    DOM Parser

    SAX Parser

    StAX Parser

    Application Architecture - Harness the processing power

    Single-threaded vs multi-threaded

    Picasa tool to upload pictures

    Single upload takes 5 seconds

    Single thread

    10 pictures take 50 seconds

    Multithreaded (5 workers)

    10 pictures take 10 seconds

    Identifying parallel tasks

    Localize - Break the application into independent units

    Parallelize - Execute the units in parallel

    Picasa tool to resize and upload

    Resize takes 5 seconds, upload takes 5 seconds

    Single composite task (5 workers, each resizing and then uploading)

    10 pictures in 20 seconds

    Two independent tasks (5 resize workers, 5 upload workers)

    10 pictures in 15 seconds

    20 pictures in 25 seconds

    Many hands make light work
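
    The timings on this slide can be reproduced with a small scheduling model. The sketch below is illustrative only: the class and method names are made up, every worker is assumed to be free at time zero, and each picture is assumed to take exactly the stated number of seconds per step.

```java
import java.util.PriorityQueue;

final class UploadTimeline {
    /** One pool of workers; each worker resizes and then uploads a picture. */
    static int compositeSeconds(int pictures, int workers, int resizeSec, int uploadSec) {
        PriorityQueue<Integer> freeAt = idleWorkers(workers);
        int finish = 0;
        for (int i = 0; i < pictures; i++) {
            // The earliest-free worker takes the next picture through both steps.
            int done = freeAt.poll() + resizeSec + uploadSec;
            freeAt.add(done);
            finish = Math.max(finish, done);
        }
        return finish;
    }

    /** Two pools: resize workers hand each picture over to upload workers. */
    static int pipelinedSeconds(int pictures, int resizeWorkers, int uploadWorkers,
                                int resizeSec, int uploadSec) {
        PriorityQueue<Integer> resizeFreeAt = idleWorkers(resizeWorkers);
        PriorityQueue<Integer> uploadFreeAt = idleWorkers(uploadWorkers);
        int finish = 0;
        for (int i = 0; i < pictures; i++) {
            int resized = resizeFreeAt.poll() + resizeSec;
            resizeFreeAt.add(resized);
            // An upload starts once the picture is resized AND an uploader is free.
            int uploaded = Math.max(resized, uploadFreeAt.poll()) + uploadSec;
            uploadFreeAt.add(uploaded);
            finish = Math.max(finish, uploaded);
        }
        return finish;
    }

    private static PriorityQueue<Integer> idleWorkers(int count) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int i = 0; i < count; i++) queue.add(0); // every worker is free at t = 0
        return queue;
    }

    public static void main(String[] args) {
        System.out.println(compositeSeconds(10, 5, 5, 5));    // 20
        System.out.println(pipelinedSeconds(10, 5, 5, 5, 5)); // 15
        System.out.println(pipelinedSeconds(20, 5, 5, 5, 5)); // 25
    }
}
```

    Note that the pipelined variant uses ten workers in total, so part of its gain comes from the extra workers and part from overlapping the two steps.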

    [Figure: Image Resize and Upload - Parallel tasks]

    Application Architecture - Don't repeat the effort

    Effective use of caching

    Cache results that are costly to re-compute

    Used effectively, caching improves performance

    E.g. Currency conversion rates in a Forex calculator

    Avoid excessive caching

    Monitor cache hit/miss ratio

    Avoid caching user information in an online app

    Take care of synchronization

    Apache JCS, Oracle Coherence
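
    A minimal sketch of the caching points above, using only the JDK rather than Apache JCS or Oracle Coherence. `RateCache` and its `loader` function are hypothetical names; the loader stands in for whatever costly lookup produces a conversion rate. The hit and miss counters are there so the hit/miss ratio can be monitored.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Function;

final class RateCache {
    private final Map<String, Double> rates = new ConcurrentHashMap<>();
    private final Function<String, Double> loader; // hypothetical costly lookup
    private final AtomicLong hits = new AtomicLong();
    private final AtomicLong misses = new AtomicLong();

    RateCache(Function<String, Double> loader) {
        this.loader = loader;
    }

    /** Returns the cached rate, computing and storing it on first use. */
    double rate(String currencyPair) {
        Double cached = rates.get(currencyPair);
        if (cached != null) {
            hits.incrementAndGet();
            return cached;
        }
        misses.incrementAndGet();
        // computeIfAbsent runs the loader at most once per key, even if
        // several threads miss on the same key at the same moment.
        return rates.computeIfAbsent(currencyPair, loader);
    }

    long hits() { return hits.get(); }

    long misses() { return misses.get(); }
}
```

    This sketch never evicts or refreshes entries; a real rate cache would also need an expiry policy, which is left out here.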

    Pool costly resources

    Re-use resources that are costly to build

    After use, check them into the pool instead of discarding

    Connection

    Costly to establish

    Re-usable across users

    Clean the resource before checking into the pool
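
    The check-out / clean / check-in cycle described above might look like the following sketch. `SimplePool` is a hypothetical class, not a real connection-pool API; the `factory` builds the costly resource and the `cleaner` resets it before it goes back into the pool.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class SimplePool<T> {
    private final BlockingQueue<T> idle;
    private final Consumer<T> cleaner;

    SimplePool(int size, Supplier<T> factory, Consumer<T> cleaner) {
        this.idle = new ArrayBlockingQueue<>(size);
        this.cleaner = cleaner;
        // Pay the construction cost once, up front.
        for (int i = 0; i < size; i++) idle.add(factory.get());
    }

    /** Blocks until a resource is free, then hands it to the caller. */
    T checkOut() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
    }

    /** Cleans the resource and returns it to the pool instead of discarding it. */
    void checkIn(T resource) {
        cleaner.accept(resource);
        idle.add(resource); // throws if more resources are returned than were created
    }
}
```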

    Application Architecture - Keep a watch on the traffic

    Multi-threaded model

    Threads work on the same data

    The data is not transferred between workers

    Ideal when the job at hand involves huge data

    E.g.

    Windows registry

    All processes work on the same registry

    Different processes read/update different parts of the registry.

    Data driven model

    Data is transferred into the workers' queues

    Independent chunks of data

    Data size is small

    E.g.

    Order acceptance system

    Order sent to worker to see item availability

    Sent to next worker to process payment

    Sent to next worker to confirm order
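
    The order acceptance example can be sketched as three workers connected by queues. This is an assumption-laden illustration: `OrderPipeline`, the `END` marker and the string suffixes are invented, and each stage only tags the order instead of doing real work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.UnaryOperator;

final class OrderPipeline {
    private static final String END = "\u0000END"; // marker: no more orders

    /** Starts a worker that takes from `in`, applies `step`, and puts to `out`. */
    private static void stage(BlockingQueue<String> in, BlockingQueue<String> out,
                              UnaryOperator<String> step) {
        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    String order = in.take();
                    if (order.equals(END)) {
                        out.put(END); // pass the marker on and stop
                        return;
                    }
                    out.put(step.apply(order));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Sends every order through availability, payment and confirmation. */
    static List<String> process(List<String> orders) {
        BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
        BlockingQueue<String> available = new LinkedBlockingQueue<>();
        BlockingQueue<String> paid = new LinkedBlockingQueue<>();
        BlockingQueue<String> confirmed = new LinkedBlockingQueue<>();

        stage(incoming, available, order -> order + ":available");
        stage(available, paid, order -> order + ":paid");
        stage(paid, confirmed, order -> order + ":confirmed");

        incoming.addAll(orders);
        incoming.add(END);

        List<String> done = new ArrayList<>();
        try {
            for (String o = confirmed.take(); !o.equals(END); o = confirmed.take()) {
                done.add(o);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done;
    }
}
```

    Only the small order record moves between workers, which is why this model suits small, independent chunks of data.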

    Application Architecture - Buffer data bottlenecks

    Input/Output

    File systems and other IO are slow

    Not good at reading/writing a byte at a time

    Read a chunk of data and pass it to application one byte at a time

    Network

    Sending/fetching data across network is slow and unreliable

    Take YouTube for example

    Player doesn't fetch a frame at a time and show it to the user

    Keep reading over the network whether the video is playing or paused

    Write the frames into a buffer

    Player reads from the buffer and plays the video

    Always buffer slow and unreliable peripherals
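
    To make the "read a chunk, hand out a byte at a time" idea concrete, the sketch below counts how many read calls reach the underlying source with and without `java.io.BufferedInputStream`. `BufferDemo` and `CountingStream` are illustrative names, and an in-memory array stands in for the slow device.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class BufferDemo {
    /** Counts how many read calls reach the underlying (slow) source. */
    static final class CountingStream extends FilterInputStream {
        int calls;

        CountingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    /** Reads totalBytes one byte at a time and reports the calls made to the source. */
    static int callsToSource(int totalBytes, boolean buffered) {
        CountingStream source = new CountingStream(new ByteArrayInputStream(new byte[totalBytes]));
        try (InputStream in = buffered ? new BufferedInputStream(source, 8192) : source) {
            while (in.read() != -1) {
                // the application still consumes one byte at a time
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return source.calls;
    }
}
```

    With the buffer in place the application code is unchanged, but the source is asked for data in 8 KB chunks instead of once per byte.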

    Application Architecture - Bulk Action

    A bulk action is cheaper than repeating the action for each item of data

    Common overhead is spread across the dataset

    Uploading photos to Picasa

    Authenticate user credentials and login

    Establish a connection

    Upload a picture

    Close the connection

    For uploading 10 pictures, you wouldn't repeat the four steps 10 times

    Bulk upload

    Authenticate user credentials and login

    Establish a connection

    Upload first picture

    Upload second picture

    Upload last picture

    Close the connection

    Always ask for the bulk discount.

    Pseudocode and coding

    Select the correct algorithm

    The factor that can cause the most dramatic change in performance

    How to compute the sum of all integers between m and n?

    Fast Inverse Square Root

    Newton's Method

    x1 = x0 - f(x0)/f'(x0)

    Algorithms specific to the problem perform better than generic algorithms
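
    Both examples on this slide can be sketched briefly. The sum of integers from m to n has a closed form, (m + n)(n - m + 1)/2, which replaces a loop. For the inverse square root, applying the Newton step to f(y) = 1/y^2 - x gives y1 = y0 * (1.5 - 0.5 * x * y0^2). The class and method names are hypothetical, and the bit-level initial guess of the well-known "fast inverse square root" is not reproduced; the caller supplies a guess, assumed to lie between 0 and sqrt(3/x).

```java
final class SpecificAlgorithms {
    /** Generic approach: one addition per integer, O(n - m) work. */
    static long sumByLoop(long m, long n) {
        long sum = 0;
        for (long i = m; i <= n; i++) sum += i;
        return sum;
    }

    /** Problem-specific approach: arithmetic series in O(1). Assumes m <= n. */
    static long sumByFormula(long m, long n) {
        // Exactly one of (m + n) and (n - m + 1) is even, so the division is exact.
        return (m + n) * (n - m + 1) / 2;
    }

    /**
     * Newton's method on f(y) = 1/y^2 - x, whose positive root is 1/sqrt(x).
     * Each step uses only multiplications and one subtraction.
     */
    static double inverseSqrt(double x, double guess, int iterations) {
        double y = guess;
        for (int i = 0; i < iterations; i++) {
            y = y * (1.5 - 0.5 * x * y * y);
        }
        return y;
    }

    public static void main(String[] args) {
        System.out.println(sumByLoop(1, 100));    // 5050
        System.out.println(sumByFormula(1, 100)); // 5050
        System.out.println(inverseSqrt(4.0, 0.4, 8));
    }
}
```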

    Comparison of common sorting algorithms

    Avoid using bubble and selection sorts

    Use insertion sort when the dataset is small

    Merge, heap and quick sorts are the algorithms used in Java

    Use Arrays.sort() method
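
    As a small illustration of the advice above, here is a hand-written insertion sort next to the library call. `SortDemo` is a hypothetical name; in practice `Arrays.sort()` is the default choice, and the insertion sort is shown only because it is simple and adequate for small inputs.

```java
import java.util.Arrays;

final class SortDemo {
    /** In-place insertion sort: O(n^2) in general, but cheap for small arrays. */
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            // Shift larger elements one slot to the right to make room for key.
            while (j >= 0 && a[j] > key) {
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }

    /** Library sort on a copy, leaving the input untouched. */
    static int[] sortedCopy(int[] a) {
        int[] copy = a.clone();
        Arrays.sort(copy);
        return copy;
    }
}
```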

    Data structures

    Arrays

    Easy to loop

    Random access of elements

    Insert and delete in the middle is difficult

    Good at searching when sorted: O(log n)

    Linked Lists

    Easy to loop

    Random access of elements is not possible

    Insert and delete in the middle is easy

    Bad at search: O(n)

    Binary Search Trees

    Easy to loop

    Inserting an element is O(log n) on average, degrading to O(n) when unbalanced

    Searching is O(log n) on average, degrading to O(n) when unbalanced

    Self-balancing structures like Red-Black trees have search and insert times of O(log n)
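
    The gap between a logarithmic and a linear search shows up clearly when the number of elements inspected is counted. The sketch below is illustrative (`ProbeCount` is a made-up name): binary search needs a sorted array with random access, while a linked list can only be walked from the front.

```java
final class ProbeCount {
    /** Binary search on a sorted array; returns how many elements were inspected. */
    static int binarySearchProbes(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        int probes = 0;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1; // halves the remaining range each round
            probes++;
            if (sorted[mid] == key) return probes;
            if (sorted[mid] < key) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return probes;
    }

    /** Front-to-back scan, the only option for a linked list. */
    static int linearSearchProbes(Iterable<Integer> items, int key) {
        int probes = 0;
        for (int item : items) {
            probes++;
            if (item == key) return probes;
        }
        return probes;
    }
}
```

    For 1024 sorted elements the binary search inspects at most 11 of them, while the scan may have to inspect all 1024.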

    Java Collection framework

    Has a collection of useful data structures

    Collection

    Set: collection of elements without duplicates

    HashSet

    O(1) retrieval

    TreeSet

    Sorted set

    O(log n) retrieval

    LinkedHashSet

    Items maintained in the order of insert

    List: collection of elements with duplicates possible

    ArrayList

    Array based, random access of elements is easy

    LinkedList

    Insert and delete are constant-time operations once the position is reached (e.g. at either end)

    Maps: Key-Value pairs

    HashMap

    Very good at insert, delete and retrieval

    TreeMap

    Supports traversal in the sorted order of keys

    LinkedHashMap

    The traversal is in the order of insert into the map
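
    The ordering differences between the set implementations listed above can be seen with a few lines. `SetOrderDemo` is a hypothetical helper; the same pattern applies to `HashMap`, `TreeMap` and `LinkedHashMap` with respect to their keys.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SetOrderDemo {
    /** TreeSet drops duplicates and iterates in sorted order. */
    static List<String> viaTreeSet(List<String> items) {
        Set<String> sorted = new TreeSet<>(items);
        return new ArrayList<>(sorted);
    }

    /** LinkedHashSet drops duplicates and iterates in insertion order. */
    static List<String> viaLinkedHashSet(List<String> items) {
        Set<String> ordered = new LinkedHashSet<>(items);
        return new ArrayList<>(ordered);
    }

    /** HashSet drops duplicates; its iteration order is unspecified. */
    static int distinctCount(List<String> items) {
        return new HashSet<>(items).size();
    }
}
```

    For the input pear, apple, pear, fig the TreeSet yields apple, fig, pear, and the LinkedHashSet yields pear, apple, fig.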

    Java Collection framework contd.

    The structures are not thread safe, in order to eliminate the synchronization overhead

    Methods to get read-only versions of the collection

    Methods to get synchronized versions of the collection

    Other historical collections

    Arrays

    Vectors

    Resizable arrays

    Synchronized

    Hashtable

    Older version of Map

    Synchronized
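
    The read-only and synchronized versions mentioned on this slide come from wrapper methods in `java.util.Collections`. A minimal sketch, with `WrapperDemo` as an illustrative name:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class WrapperDemo {
    /** A read-only view rejects modification attempts. */
    static boolean rejectsWrites(List<String> source) {
        List<String> readOnly = Collections.unmodifiableList(source);
        try {
            readOnly.add("x");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    /** Several threads append to one synchronized list; returns the final size. */
    static int concurrentAdds(int threads, int addsPerThread) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < addsPerThread; i++) shared.add(i);
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return shared.size();
    }
}
```

    With a plain `ArrayList` in place of the synchronized wrapper, the concurrent adds could lose elements or throw, which is the overhead-versus-safety trade-off the slide describes.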

    Keeping memory consumption low

    As the memory heap fills up, garbage collections become more frequent

    If there is no more memory to recover, the application crashes

    Reduce the number of objects created

    Any operation on an immutable object could result in another object being created

    E.g. Strings

    Reduce the scope of the objects

    Scrutinize class-level objects

    Don't store references in long-lived objects

    Avoid loading unnecessary classes

    Avoid static linking to rarely used heavy libraries

    Use java -verbose:class to see the classes loaded

    Collapse smaller classes and anonymous classes into a single class

    Multi-threaded application instead of multiple launches of the same application
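
    The point about immutable objects is easiest to see with Strings. In the sketch below (names are illustrative), the first method creates a new String on every pass through the loop, while the second appends into one mutable `StringBuilder` and creates a single String at the end.

```java
final class JoinDemo {
    /** Each += builds a new String, because Strings are immutable. */
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    /** One mutable buffer is reused; only toString() creates the final String. */
    static String joinWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }
}
```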

    Reduce data traffic

    Serialization with caution

    Serialization is a great way to persist and recover states

    Loading serialized state could be faster than recreating the state

    Control the fields you want to persist by using the transient keyword
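
    In Java's default serialization, a field marked `transient` is left out of the persisted state and comes back with its default value. A minimal sketch, in which the `Session` class and its fields are made up for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class Session implements Serializable {
    private static final long serialVersionUID = 1L;

    final String user;          // persisted
    transient String authToken; // skipped by default serialization

    Session(String user, String authToken) {
        this.user = user;
        this.authToken = authToken;
    }

    /** Serializes to memory and reads the object back, as a save/load would. */
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

    After a round trip, `user` is restored and `authToken` is null, so less data is written and sensitive or recomputable fields stay out of the stream.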

    Use of XML

    XML is a great tool to exchange information in a platform neutral manner

    XML takes considerable bandwidth on the wire

    Avoid unnecessary conversion of object to XML and back

    Use the right parser

    DOM vs SAX vs StAX

    Evaluate other formats like JSON

    Logging

    Excessive logging is trouble

    Never log to System.err or System.out

    Use logging frameworks

    Responsive Application

    Start-up quick with only the required resources

    In a media player application, start the application by fetching only the track lists

    Lazy loading of costly resources

    Fetch album art in the background

    Always let the user know - be interactive

    While the album art is loading, display a message
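
    One way to sketch the album art example is with `java.util.concurrent.CompletableFuture`: the costly load starts in the background, and the view shows a message until the result is ready. `AlbumArtView` and the `costlyLoader` parameter are hypothetical, and a String stands in for the image.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

final class AlbumArtView {
    static final String PLACEHOLDER = "Loading album art...";

    private final CompletableFuture<String> art;

    /** Starts the costly load in the background and returns immediately. */
    AlbumArtView(Supplier<String> costlyLoader) {
        this.art = CompletableFuture.supplyAsync(costlyLoader);
    }

    /** Never blocks: shows the art if it is ready, otherwise a message. */
    String display() {
        return art.getNow(PLACEHOLDER);
    }

    /** Blocks until the background load has finished (e.g. for tests). */
    void awaitLoaded() {
        art.join();
    }
}
```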

    Debugging tools

    Benchmarking

    Measurement of memory, time and CPU usage of the application

    Compare benchmarks of different approaches

    Profiling

    Profiling tells more about your code execution paths

    What methods are called often?

    What methods are using the largest percentage of time?

    What methods are calling the most-used methods?

    What methods are allocating a lot of memory?

    Profiling tools

    VisualVM

    NetBeans Profiler
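
    For quick comparisons of two approaches, a rough timing helper such as the sketch below can be enough. `Stopwatch` is a hypothetical name, and the numbers it produces are only indicative: JIT compilation and garbage collection distort simple timings, so the warm-up runs reduce that effect without removing it.

```java
final class Stopwatch {
    /** Runs the task a few times unmeasured, then returns the average time per run. */
    static long averageNanos(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run(); // let the JIT compile the hot path before measuring
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }
}
```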

    Graceful Degradation

    How should your application behave when the load is too much to handle?

    System should never become completely useless

    System should never crash

    The application could refuse to take new requests and display a message

    In certain cases, it's possible to degrade the quality of the results and still keep up the response time

    E.g. Search engines

    Voice transmission over the network

    In places where accuracy is crucial, this is not possible

    Scientific modeling
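
    Refusing new requests under overload can be sketched with a `java.util.concurrent.Semaphore` that caps the number of requests in flight. `LoadShedder` and the busy message are illustrative assumptions, not a prescribed design.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

final class LoadShedder {
    static final String BUSY = "Server busy, please retry shortly";

    private final Semaphore slots;

    LoadShedder(int maxConcurrentRequests) {
        this.slots = new Semaphore(maxConcurrentRequests);
    }

    /** Runs the request if a slot is free; otherwise answers at once with a message. */
    String handle(Supplier<String> request) {
        if (!slots.tryAcquire()) {
            return BUSY; // refuse instead of queueing without bound
        }
        try {
            return request.get();
        } finally {
            slots.release();
        }
    }
}
```

    Requests that are accepted keep their normal response time, and rejected callers get an immediate answer rather than a timeout.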

    References

    http://java.sun.com/docs/books/performance/1st_edition/html/JPTOC.fm.html

    http://www.javaperformancetuning.com/tips/index.shtml

    http://en.wikipedia.org/wiki/Sorting_algorithm

    http://java.sun.com/developer/onlineTraining/collections/Collection.html

    Thank You