#IDUG
Best Practices for Boosting Java Application Performance and Availability on IBM DB2
Pallavi Priyadarshini
Architect, JCC – DB2 Connect, IBM
Agenda
DB2 JDBC driver architecture and ecosystem
API details
Best Practices (for Type 2 and Type 4)
– Connection Management
– SQL Execution
– DB2 Dynamic Statement Cache
– Batching and Multi-row operations
– Miscellaneous performance tips
Driver Background, Architecture and Integration
JDBC
Java Database Connectivity
– API for the Java programming language that defines how a client may access a database
– Provides methods for querying and updating data in a database
– JDBC is oriented towards relational databases
– Portability of applications across databases and platforms
The API is defined in the classes and interfaces that make up the Java packages:
– Java package “java.sql”
• JDBC core API
– Java package “javax.sql” - extension providing
• DataSource interface
• Connection pooling
• Distributed transactions
• Rowsets
Universal driver - JCC (Java Common Client)
Abstract JDBC processor – common architecture independent of platform, connectivity type or JDBC/SQLJ access
Cross-platform
– Provides Java connectivity to all DB2 servers (Linux, Unix, and Windows, z/OS, iSeries) and Informix
Different connectivity paths (local and remote)
– Type 4 distributed – Pure Java standalone remote client (DRDA)
– Type 2 local – Fastest path when executing on the same machine as the DB2 engine. Avoids costly construction/parsing of the DRDA command stream
• Type 2 z/OS – Native RRSAF-based; Java and native C
• Type 2 LUW – Native CCC-based; Java and native C
One driver is loaded and type 2 and type 4 connections can be obtained (type 2 requires proper native libraries)
Supports JDBC 3 under JRE 1.4 (db2jcc.jar), and JDBC 4 under JRE 1.6 (db2jcc4.jar)
Provides behavioral consistency, API consistency and code reuse between connectivity types and platforms (minimize divergence)
Universal Driver Architecture
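[Figure: Universal Driver architecture. JDBC applications and SQLJ applications (through the SQLJ runtime) call the abstract machine (am)/requestor, a Java common layer. Below it, the T4 subsystem builds/parses DRDA requests/replies and reaches DB2 and Informix via DRDA over TCP/IP; the T2u subsystem calls native LUW APIs (CCC) to reach a local UDB; the T2z subsystem calls into native z/OS APIs (RRS) to reach DB2 for z/OS.]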
Driver Technologies
Full function JDBC driver, much more than a JDBC driver
High availability: Workload balancing, client reroute (failover)
Scalability: Extensive multi-threading
Performance: Connection reuse/efficient socket management, DRDA chaining/batching, deferred prepares, statement caching
Security: Encryption/authentication (SSL, Kerberos, DB2 security)
Monitoring: Extended diagnostics, API for system monitoring
Robust application functionality: Internationalization, data conversion, robust result set, parameter metadata, multiple connection options, distributed transactions
New data types: BIGINT, TIMESTAMP, XML/BLOB/CLOB with multiple streaming options
Integration
Ships with IBM data servers, which drive most requirements
– DB2 on z/OS. Includes both T4 and T2 z/OS drivers
– DB2 on LUW. Includes both T4 and T2 LUW drivers
– DB2 on System i
– Informix connectivity (Type 4)
Integration with IBM products
– WebSphere products are the primary consumer and drive a lot of requirements
– Many other internal groups like Cognos, Tivoli, Sterling Commerce, Content Manager
– DB2 tooling like RAD
– Optim tools portfolio like Data Studio/OPM/OCM/pureQuery/OQWT
Certification/testing by non-IBM products: SAP, WebLogic, JBoss
High-profile customers – Used by top banks, retailers, insurance companies
Connection Management
Client configurations – Distributed Connectivity
[Figure: Distributed connectivity. Java-based clients (JDBC/SQLJ/pureQuery) use direct connect (Type 4) over DRDA. Non-Java-based clients (CLI, Ruby, Perl, .NET) use the DB2 Connect Client over DRDA, either by direct connect or through a DB2 Connect Server. The targets are DB2 z/OS members of a DB2 data sharing group with a coupling facility (CF).]
DB2 Connect Clients no longer require DB2 Connect Server
DB2 Connect Clients are Sysplex enabled with improved availability
Based on DRDA - Distributed Relational Database Architecture
– Supports direct connectivity to DRDA servers over TCP/IP
– Eliminates intermediate servers required by other driver types
Getting a connection - DriverManager
Driver name “com.ibm.db2.jcc.DB2Driver”
JDBC 1 java.sql.DriverManager API – The actual driver type is determined at runtime from the connection URL format:
• Type 2 - “jdbc:db2:database”
• Type 4 - “jdbc:db2://host:port/database” (default port number 446)
Disadvantage - Reduces portability because the application must identify a specific JDBC driver class name and driver URL.
    Class.forName("com.ibm.db2.jcc.DB2Driver");
    Connection con = DriverManager.getConnection("jdbc:db2://localhost:50000/sample", "username", "password");
Getting a connection - DataSource
JDBC 2 - javax.sql.DataSource API
Preferred approach
Increases application portability by using a logical name
– Logical name is mapped to a DataSource object via a JNDI naming service
– Logical name -> driver info, DB name, IP, port, user, password, etc.
Application servers can be used to configure and manage data sources
If any DataSource information changes, only the properties of the DataSource object are modified. E.g., if the data source is moved to a different server, the server property can be changed.
No change to application code necessary.
Connection obtained through a DataSource object is identical to a connection obtained through the DriverManager facility.
– DataSource also has more flexibility in specifying Properties
Getting a connection (DataSource examples)
• Non-WAS datasource example
    DB2SimpleDataSource ds = new DB2SimpleDataSource(); // com.ibm.db2.jcc.DB2SimpleDataSource
    ds.setServerName(xxx);
    ds.setPortNumber(xx);
    ds.setDatabaseName(xxx);
    ...
    Connection con = ds.getConnection();
• WAS datasource example - Application requests DataSource object by logical name
    Context ctx = new InitialContext(); // javax.naming.Context, javax.naming.InitialContext
    DataSource ds = (DataSource) ctx.lookup("jdbc/sampledb");
    Connection con = ds.getConnection();
Properties
JDBC API defines a set of properties to identify and describe a DataSource implementation.
May be specified in any of three ways
– JDBC 1 connection properties passed as a java.util.Properties object (consisting of key/value pairs) as argument to DB2Driver.connect() or java.sql.DriverManager.getConnection()
– As part of the database URL itself
• jdbc:db2://server[:port]/databaseName[:propertyKey=value;...]
• jdbc:db2://localhost:50000/TESTDB:user=foo;password=bar;
– Using setXXX methods
• Most properties are defined in the abstract com.ibm.db2.jcc.DB2BaseDataSource class
Examples of properties - progressiveStreaming, deferPrepares, loginTimeout, keepDynamic, enableSysplexWLB, currentExplainMode, cursorSensitivity, maxTransportObjects, traceFile, traceLevel, currentSchema, currentSQLID, dumpPool
Driver configuration tuning via external properties (not requiring application or DataSource modification) is available
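The URL forms on this slide can be assembled mechanically. The following is a minimal sketch: the class `Db2Url` and its methods are hypothetical helpers for illustration, not part of the driver. It only builds strings in the shapes shown above and does not open a connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a driver class): formats Type 2 and Type 4 URLs.
final class Db2Url {
    private Db2Url() {}

    // Type 2 (local) form: jdbc:db2:database
    static String type2(String database) {
        return "jdbc:db2:" + database;
    }

    // Type 4 (remote) form: jdbc:db2://server:port/databaseName[:key=value;...]
    static String type4(String host, int port, String database, Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:db2://")
                .append(host).append(':').append(port).append('/').append(database);
        if (!props.isEmpty()) {
            url.append(':');
            // Each property is written as key=value followed by a semicolon.
            for (Map.Entry<String, String> e : props.entrySet()) {
                url.append(e.getKey()).append('=').append(e.getValue()).append(';');
            }
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>(); // keeps insertion order
        props.put("user", "foo");
        props.put("password", "bar");
        System.out.println(type4("localhost", 50000, "TESTDB", props));
    }
}
```

Credentials in a URL are shown only because the slide does so; passing them through a DataSource or a Properties object keeps them out of logs that print the URL.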
Global Properties
Global driver properties can be provided through Java system properties or a properties file.
If both are set, Java system properties override any property setting from the properties file.
The file name of the properties file can be provided through the system property "db2.jcc.propertiesFile". If that property is not set, first "DB2JccConfiguration.properties" and then "com/ibm/db2/jcc/DB2JccConfiguration.properties" is assumed as the file name.
The global property names must be prefixed with one of the following:
– db2.jcc. (e.g. db2.jcc.traceFile)
– db2.jcc.default. (e.g. db2.jcc.default.traceFile)
– db2.jcc.override. (e.g. db2.jcc.override.traceFile)
Properties prefixed with “db2.jcc.” or “db2.jcc.default.” are used only if they are not set at the connection or DataSource level.
To override properties set at the connection or DataSource level, use the global driver properties prefixed with “db2.jcc.override.”
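The precedence rules above can be modelled as a lookup chain. This is a sketch of the rules as stated on the slide, not the driver's implementation: `GlobalPropertyResolver` is a hypothetical class, and the order in which it checks “db2.jcc.default.” before “db2.jcc.” is an assumption, since the slide treats both prefixes alike.

```java
import java.util.Properties;

// Hypothetical model of the precedence rules; not a driver class.
final class GlobalPropertyResolver {
    private GlobalPropertyResolver() {}

    static String resolve(String name, Properties connectionLevel,
                          Properties fileProps, Properties systemProps) {
        // 1. An override always wins, even over connection/DataSource settings.
        String override = global("db2.jcc.override." + name, fileProps, systemProps);
        if (override != null) return override;
        // 2. Otherwise a value set on the connection or DataSource is used.
        String local = connectionLevel.getProperty(name);
        if (local != null) return local;
        // 3. Global defaults apply only when nothing was set locally.
        //    (Checking "default." before the bare prefix is an assumption.)
        String dflt = global("db2.jcc.default." + name, fileProps, systemProps);
        return dflt != null ? dflt : global("db2.jcc." + name, fileProps, systemProps);
    }

    // Java system properties take precedence over the properties file.
    private static String global(String key, Properties fileProps, Properties systemProps) {
        String fromSystem = systemProps.getProperty(key);
        return fromSystem != null ? fromSystem : fileProps.getProperty(key);
    }
}
```

In a real application the last argument would typically be `System.getProperties()`; it is a parameter here so the rules can be exercised without touching JVM-wide state.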
Getting a Connection
All DB resources hang off the Connection object, so it must be managed carefully
Expensive operation - The average connection object is 1-2 MB in size and contains a great deal of information on the connection context
Creating and terminating connections is a time-consuming operation in both the driver and DB2, and can slow down applications
A client obtaining a physical database connection requires multiple network requests to
– Handshake on the DRDA protocol
– Validate client user credentials
– Establish code page, packet size, etc.
Getting connections efficiently
Get a connection object one at a time, as needed
Reuse a connection object for multiple Statement objects when possible
Close connections promptly, don't leave connection cleanup to garbage collection
Add another close in the finally block to cover the case where an exception occurs before the connection close statement
Apply close to other JDBC objects too
– Statement
– PreparedStatement
– CallableStatement
– ResultSet
Clean up example
    public void test() throws SQLException {
        Connection con = null;
        ResultSet rs = null;
        Statement stmt = null;
        try {
            con = ds.getConnection();
            stmt = con.createStatement();
            rs = stmt.executeQuery(...
            // If an exception occurred here, then the next 3 closes are not executed.
            rs.close(); // Explicit close in same method
            stmt.close();
            con.close();
        } ...
        finally {
            // Close again in finally block, in case there was an exception before explicit rs.close()
            try { if (rs != null) rs.close(); } catch (Exception e) {}
            try { if (stmt != null) stmt.close(); } catch (Exception e) {}
            try { if (con != null) con.close(); } catch (Exception e) {}
        }
    }
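On Java 7 and later, the same guarantee can be written more compactly with try-with-resources, because java.sql.Connection, Statement and ResultSet all extend AutoCloseable. The sketch below uses a stand-in `Tracked` resource (a hypothetical class, so the example runs without a database) to show that resources are closed in reverse order of declaration even when the body throws.

```java
import java.util.ArrayList;
import java.util.List;

final class CloseOrderDemo {
    // Stand-in for a JDBC object; records its name when closed.
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        @Override
        public void close() {
            log.add(name);
        }
    }

    // Returns the order in which the three resources were closed.
    static List<String> run(boolean failMidway) {
        List<String> closed = new ArrayList<>();
        try (Tracked con = new Tracked("con", closed);
             Tracked stmt = new Tracked("stmt", closed);
             Tracked rs = new Tracked("rs", closed)) {
            if (failMidway) {
                throw new IllegalStateException("simulated failure while reading rows");
            }
        } catch (IllegalStateException e) {
            // All three resources have already been closed when control reaches here.
        }
        return closed;
    }

    public static void main(String[] args) {
        System.out.println(run(true));
    }
}
```

With real JDBC objects the shape is the same: declare the Connection, Statement and ResultSet in the try header and drop the explicit close calls and the finally block.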
Connection pooling
A Connection Pool is a set of client-established physical database connections that can be reused by the application
Part of DataSource support – Connection object pool maintained by an application server like WebSphere
Allows the cost of establishing connections to be spread over multiple requests and can significantly improve performance
Saves creating/destroying Connection objects, which is relatively expensive
– Without connection pooling
• Each Connection represents a physical connection, and close terminates the physical connection
– With connection pooling
• Connection is a logical representation
• Cost of creating each physical database connection is incurred only once
• Subsequent getConnection from the pool is fast
• Completely transparent to the application
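The core idea of a pool, paying the creation cost once and then handing the same object out again, fits in a few lines. This is a toy sketch under stated assumptions: `SimplePool` is a hypothetical class, generic over any resource type, with no size limit, validation or timeout handling, all of which a real application server pool provides.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Toy pool: "close" becomes release(), which keeps the resource for reuse.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse an idle resource if one exists; otherwise pay the creation cost.
    synchronized T borrow() {
        T pooled = idle.pollFirst();
        if (pooled != null) {
            return pooled;
        }
        created++;
        return factory.get();
    }

    // Logical close: the physical resource stays open and goes back to the pool.
    synchronized void release(T resource) {
        idle.addFirst(resource);
    }

    // Number of times the expensive factory was actually invoked.
    synchronized int physicalCreations() {
        return created;
    }
}
```

In application code none of this is visible: the DataSource returns a logical Connection, and calling close on it plays the role of `release` here.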
Connection pooling
[Figure: Connection pooling in an application server. Inside the JVM, the application obtains logical connections (1, 2, 3) from the DB connection pool managed by the JCA connection manager and resource adapter. The DB2 Universal Driver (JDBC/SQLJ) maps them onto pooled connections (Transport 1, Transport 2) to a DB2 data sharing group (CF), with disconnect at commit/rollback.]
SQL Execution
Creating a Statement
JDBC connections support creating and executing statements
– Update statements such as CREATE, INSERT, UPDATE and DELETE
• Returns an update count indicating how many rows were affected in the database
– Query statements such as SELECT
• Returns a JDBC result set, which has metadata describing the names and types of columns
– Stored procedures
JDBC represents statements using one of the following classes:
– Statement – the statement is sent to the database server each and every time.
– PreparedStatement – the statement is cached and the execution path is predetermined on the database server, allowing it to be executed multiple times in an efficient manner.
– CallableStatement – used for executing stored procedures on the database.
Prepared Statement Cache or Statement Pool
Client side optimization
Pool of PreparedStatement and CallableStatement objects not active in a Connection
– Application requesting a PreparedStatement/CallableStatement may get it from the pool
– When the application closes the object, it is returned to the pool
Reduced overhead of Java object deletion and re-creation
Also saves on description cost
The statement cache exists for the life of an open connection. When the connection is closed, the driver deletes the statement cache and closes all pooled statements.
Least recently used (LRU) algorithm evicts statements if the cache is full
No impact to application
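The least-recently-used policy mentioned above can be illustrated with the JDK's LinkedHashMap in access order. This is a sketch of the eviction policy only: `LruStatementCache` is a hypothetical class keyed by SQL text, and it is not how the driver implements its pool. A real pool would also close the evicted statement.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU cache keyed by SQL text; V stands in for a pooled statement.
final class LruStatementCache<V> extends LinkedHashMap<String, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruStatementCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the
    // least recently accessed entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > capacity;
    }
}
```

For example, with a capacity of 2, inserting statements A and B, reading A, and then inserting C evicts B, because A was touched more recently.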
Statement or PreparedStatement
PreparedStatement over Statement
– 2 DB calls are needed for a fetch – one for describing the columns, and the other for the data
– PreparedStatement makes description calls at construction time, Statement makes them on every execution.
– PreparedStatement enables Statement Pooling
– Use Statement when SQL is not executed often
Close explicitly before closing connections
Reuse these objects when possible
– One Statement object can be used to execute multiple SQL statements instead of creating multiple Statement objects
Statement or PreparedStatement
Statement example
    Statement stmt = con.createStatement();
    stmt.executeUpdate("INSERT INTO EMPLOYEE VALUES( 'John', 123 )");
    stmt.executeUpdate("INSERT INTO EMPLOYEE VALUES( 'Mary', 425 )");
PreparedStatement example
    PreparedStatement ps = con.prepareStatement("INSERT INTO EMPLOYEE VALUES(?, ?)");
    ps.setString(1, "John");
    ps.setInt(2, 123);
    ps.executeUpdate();
    ps.setString(1, "Mary");
    ps.setInt(2, 425);
    ps.executeUpdate();
DB2 Dynamic Statement Caching
DB2 SQL – Static vs. Dynamic
Access path is the set of instructions that DB2 uses to locate the data requested by an SQL statement
DB2 Optimizer component:
– Generates alternative access paths
– Calculates cost for each access path
– Chooses cheapest access path
Select * from T where A='1' and B='2'
– CREATE INDEX IDX_A on T(A)
– CREATE INDEX IDX_B on T(B)
– Alternative access paths
• Path 1: Use no indexes
• Path 2: Use IDX_A only
• Path 3: Use IDX_B only
• Path 4: Use both indexes
Access path selection is expensive
Static SQL – Prepared at bind time, execution is fast
Dynamic SQL
– Prepared at run time
– Complete SQL statement string unknown before execution
– Popular, as distributed queries from non-mainframe platforms use dynamic SQL (e.g. JDBC)
Dynamic Statement Cache
Feature to improve performance of dynamic SQL
Allows applications (multiple threads) to reuse and share prepared statements. The cache pool is shared by different threads, plans and packages (“global cache”).
Prepared statement is saved in an in-memory cache
Subsequent prepares of the same statement load from the cache
PREPARE flavors
– Full prepare – Statement is not found in the cache and must be prepared from scratch
– Short prepare – Statement is found in the cache and can be reused; no need to regenerate the access path
Good cache hit rate produces significant performance benefits
A Full Prepare can be 10-100X more CPU than a Short Prepare!
Cache match criteria
SQL statement text – code SQL statements with the same string
Prepare attributes (if for a cursor select)
Bind options - CURRENTDATA, DYNAMICRULES, ISOLATION, SQLRULES, QUALIFIER
Authorization ID
Special registers - CURRENT DEGREE, PATH, SCHEMA, RULES, PRECISION, REFRESH AGE etc.
Use parameter markers (?) instead of literals to improve cache hit rate
Cached statements reside in the cache until:
– DROP or ALTER of any object
– Authorization revoked
– LRU – Least Recently Used
– RUNSTATS
SELECT * from T where empid = 1
SELECT * from T where empid = 1 (2 entries in cache)
– SELECT * from T where empid = 1
– SELECT * from T where empid = 2 (2 entries in cache)
– VS.
– SELECT * from T where empid = ? (one entry in cache)
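The effect of parameter markers on the hit rate can be simulated with a cache keyed on the exact statement text. This is a deliberately simplified sketch: `PrepareCacheSim` is a hypothetical class that matches on text alone, whereas the real cache also compares the prepare attributes, bind options, authorization ID and special registers listed above.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified model: one cache entry per distinct SQL string.
final class PrepareCacheSim {
    private final Set<String> cachedTexts = new HashSet<>();
    private int fullPrepares;
    private int shortPrepares;

    // A miss costs a full prepare and adds an entry; a hit is a short prepare.
    void prepare(String sqlText) {
        if (cachedTexts.add(sqlText)) {
            fullPrepares++;
        } else {
            shortPrepares++;
        }
    }

    int entries() { return cachedTexts.size(); }
    int fullPrepares() { return fullPrepares; }
    int shortPrepares() { return shortPrepares; }
}
```

Preparing the two literal statements yields two entries and two full prepares; preparing the parameter-marker statement twice yields one entry, one full prepare and one short prepare.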
EXPLAIN
EXPLAIN facility can be used to provide information on access plans
– Populates various explain tables with details on access path information of statements
Use the Dynamic SQL Stmt section of statistics to monitor the Global Cache Hit Ratio % to determine if the cache size needs to be increased.
Literal replacement
Introduced for improved cache hit for statements with literals
Automatically replace literal with ampersand (&)
& treated as a parameter marker (?)
Set Connection or DataSource Property statementConcentrator to STATEMENT_CONCENTRATOR_WITH_LITERALS (2)
E.g.
– Select * from T where productId = '1'
– Select * from T where productId = '2'
Results in 1 cache entry:
– Select * from T where productId = &
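To make the idea concrete, the sketch below rewrites literals to `&` with two regular expressions so that both statements collapse to the same text. It is a toy illustration only: `LiteralNormalizer` is a hypothetical class, the patterns are assumptions that cover simple quoted strings and unsigned numbers, and the actual replacement is performed by DB2 once the statementConcentrator property is set, not by application code like this.

```java
// Toy illustration of literal replacement; not the DB2 algorithm.
final class LiteralNormalizer {
    private LiteralNormalizer() {}

    static String normalize(String sql) {
        // Replace quoted string literals first ('' inside a literal is an escaped quote).
        String withoutStrings = sql.replaceAll("'(?:[^']|'')*'", "&");
        // Then replace standalone numeric literals; digits inside names such as T1 are kept.
        return withoutStrings.replaceAll("\\b\\d+(?:\\.\\d+)?\\b", "&");
    }

    public static void main(String[] args) {
        System.out.println(normalize("Select * from T where productId = '1'"));
        System.out.println(normalize("Select * from T where productId = '2'"));
    }
}
```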
Why SQLJ?
Static SQL performance for Java applications
– Consistent performance
– Performance advantage over JDBC
Static SQL authorization model
– Provides Java with a stronger authorization model
Monitoring/Manageability
– Static SQL packages for accounting/monitoring
– Static SQL locks in the access path, so that access path changes do not occur without a conscious choice
Measurements with the IRWW workload comparing JDBC vs SQLJ with the T2 driver

          Throughput (ETR)   Normalized Throughput (ITR)   z/OS CPU Utilization   CL.1 CPU time
JDBC T2   2636.83            3773.37                       69.88                  0.000672
SQLJ T2   2694.80 (+2.20%)   5174.35 (+37.13%)             52.08 (-25.47%)        0.000457 (-32.00%)
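The percentages in the table are relative changes against the JDBC row, (SQLJ - JDBC) / JDBC. A small sketch for checking them follows; `RelativeChange` is a hypothetical helper. Recomputed from the rounded figures shown, the CL.1 CPU time change comes out at about -31.99% rather than -32.00%, presumably because the slide used unrounded measurements.

```java
import java.util.Locale;

final class RelativeChange {
    private RelativeChange() {}

    // Relative change of `measured` against `baseline`, in percent.
    static double percent(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // Values taken from the JDBC T2 (baseline) and SQLJ T2 rows of the table.
        System.out.printf(Locale.ROOT, "ETR      %+.2f%%%n", percent(2636.83, 2694.80));
        System.out.printf(Locale.ROOT, "ITR      %+.2f%%%n", percent(3773.37, 5174.35));
        System.out.printf(Locale.ROOT, "CPU util %+.2f%%%n", percent(69.88, 52.08));
        System.out.printf(Locale.ROOT, "CL.1 CPU %+.2f%%%n", percent(0.000672, 0.000457));
    }
}
```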
Multi-Row and Batch operations
Batching INSERTs
DB2 – Execute one multi-row INSERT instead of many individual INSERTs
– E.g.
    INSERT INTO EMPLOYEE (EMPID) VALUES (:HV) FOR :NUM-ROWS ROWS
    - HV is a host variable array
    - NUM-ROWS is the number of rows to be inserted
JDBC API – enableMultiRowInsertSupport=true (default) Connection or DataSource property
– E.g.
    PreparedStatement ps = conn.prepareStatement("INSERT INTO EMPLOYEE VALUES (?)");
    ps.setInt(1, 1); // value for the first row
    ps.addBatch();
    ps.setInt(1, 2); // value for the second row
    ps.addBatch();
    int[] returnCodes = ps.executeBatch();
Batching SELECTs
Not supported by the JDBC spec – executeBatch with SELECT throws an exception
DB2 driver extension – executeDB2QueryBatch
    PreparedStatement ps = conn.prepareStatement("SELECT * from T1 where C1 = ?");
    ps.setInt(1, 1);
    ps.addBatch();
    ps.setInt(1, 2);
    ps.addBatch();
    ((com.ibm.db2.jcc.DB2PreparedStatement) ps).executeDB2QueryBatch();
    while (ps.getMoreResults()) {
        ResultSet rs = ps.getResultSet();
        while (rs.next()) { }
        rs.close();
    }
    ps.close();
Multi-Row Fetch
Returns many rows in one SQL (ROWSET)
enableRowsetSupport = 1 (YES) for Type 2 or Type 4
Scrollable cursors for Type 4
Scrollable or forward-only cursor for Type 2
    ((com.ibm.db2.jcc.DB2BaseDataSource) ds).setEnableRowsetSupport(com.ibm.db2.jcc.DB2BaseDataSource.YES);
    java.sql.Statement s = con.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
    s.setFetchSize(1000); // Set hint for fetch size – number of rows per block
    String selectString = "select id, name from T";
    ResultSet rs = s.executeQuery(selectString);
    while (rs.next()) {}
Block fetch
DRDA block fetch for forward only, read only cursor
Returns multiple rows in query block reply message
– Query block size can be tuned to 256K
– Large block size improves performance, but requires more storage on the client to buffer
You can expect significant performance improvement for applications with the following queries
– Queries that return LOBs
– Queries that return more than 1 row
These measurements show the following results:
• Limited block fetch increases throughput by 169%
• Limited block fetch reduces class 1 CPU time by 63%
Transaction
Autocommit – default is TRUE
Every SQL statement is followed by an implicit commit, which means disk I/O
Set Autocommit to FALSE
Code COMMIT at frequent intervals to control when disk I/O happens
Example 1 – many disk I/Os
    con.setAutoCommit(true);
    Statement stmt = con.createStatement();
    stmt.execute("insert..."); // Server issues an implicit COMMIT
    stmt.execute("insert..."); // Server issues an implicit COMMIT
Example 2
    con.setAutoCommit(false);
    Statement stmt = con.createStatement();
    stmt.execute("insert...");
    stmt.execute("insert...");
    con.commit(); // commit is a method of Connection, not Statement
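The advice to commit at intervals can be sketched as a loop that commits every N rows. This is a minimal sketch using only standard java.sql calls; the class `PeriodicCommit`, the method name and the `commitEvery` parameter are hypothetical, and the right interval depends on the workload and locking behaviour.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class PeriodicCommit {
    private PeriodicCommit() {}

    // Inserts one row per value and commits after every `commitEvery` rows.
    // Returns the number of commits issued.
    static int insertAll(Connection con, String insertSql, int[] values, int commitEvery)
            throws SQLException {
        boolean previous = con.getAutoCommit();
        con.setAutoCommit(false); // take control of transaction boundaries
        int commits = 0;
        try (PreparedStatement ps = con.prepareStatement(insertSql)) {
            int pending = 0;
            for (int value : values) {
                ps.setInt(1, value);
                ps.executeUpdate();
                if (++pending == commitEvery) {
                    con.commit();
                    commits++;
                    pending = 0;
                }
            }
            if (pending > 0) { // commit the final partial group
                con.commit();
                commits++;
            }
        } catch (SQLException e) {
            con.rollback(); // discard the uncommitted group
            throw e;
        } finally {
            con.setAutoCommit(previous); // restore the caller's setting
        }
        return commits;
    }
}
```

With five rows and an interval of two, this issues three commits instead of the five implicit commits that autocommit would cause.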
Common Java performance problems and tuning
Java application
– Mismatch of Java and DB2 data types
• Use standard mapping between Java/JDBC types and SQL types in getXXX and setXXX (e.g. java.sql.Timestamp to SQL TIMESTAMP)
– Retrieval of unused columns (select *)
JDBC
– JDBC resources not closed (cursors, statements, connections)
– No usage of parameter markers (use them, e.g. select c1, c2 FROM t1 WHERE c3=?)
– Usage of Statement instead of PreparedStatement objects
Environment
– JVM heap too small
– Old system levels (JDK, JDBC/SQLJ driver)
– DB2 statement cache not tuned well
Use stored procedures
Avoid
– Held cursors: cursors are held across commit by default, which can cause unwanted contention/issues
– DGTTs that are not dropped/released
– Held LOB locators
Only prepare what will be executed - No deferred prepares under certain conditions
Useful links
http://www.bwdb2ug.org/PDF/Dynamic_Statement_Cache_in_a_Nutshell.pdf
https://www.ibm.com/developerworks/community/blogs/22586cb0-8817-4d2c-ae74-0ddcc2a409bc/entry/december_17_2012_6_07_am3?lang=en
http://www.ibm.com/systems/z/specialtyengines/