1
EM407Sound Application Development with SQL Anywhere
Mark CulpSoftware Development ManageriAnywhere [email protected]
2
Goals of this presentation
To help you develop applications that are robust, well designed, have good performance, and can scale with your database and number of
users To highlight some differences between 7.0
(and prior) and 8.0 that can affect can affect your application’s design
3
Contents
General considerations Schema design tips Application development tipsTechnical overview of Isolation levels Join semantics Cursor support in SQL Anywhere – 7.0 vs. 8.0
4
When should you think about performance and scalability?
During the design and planning stages Capacity planning Improving performance for deployed
databases
Earlier is better!
5
Common areas for performance problems
Physical database organization Database file characteristics Indexing considerations
Schema design Server characteristics
CPU, disk activity Network characteristics Application design
Query complexity Trigger design Locking Workstation processing (CPU, disk)
6
Schema Design
7
Schema design
Define your tables Normalize your data Entity/Relationship (ER) modeling
Define appropriate primary keys for all tables Helps in replication environments (reduces amount
of data in log file) Define appropriate foreign key relationships
Relationships needed to generate efficient join strategies
Define appropriate indexes Don't need to create indexes for PKs or FKs Don't over-do it ! Only define ones that are useful
8
Schema design:primary keys
Data administration issues: Ensure your application has complete control
over key assignment and usage Usually a very bad idea to update a primary
key (especially in a replication environment) It is exceedingly difficult to “hide” primary
keys from users (or your customers) Key formats are very difficult to change after
deployment
9
Schema design:primary key generation
Common problem: large, composite primary keys that are difficult to search efficiently
Both retrieval and update performance can suffer Consider surrogate primary keys; change existing
keys into unique constraints or secondary indexes Integer representation is the most efficient, for
both storage and indexing Permits the use of autoincrement PK column;
hence the server does all the work Global autoincrement can generate unique PK
values in a replicated system (now in UltraLite!)
10
Schema design:PK generation - autoincrement
Autoincrement scales very well; useful in many situations; highly recommended
Some disadvantages of autoincrement: Often wish to differentiate keys of different business
objects May desire randomized key generation for some
applications Alphanumeric values can aid in data consistency
during data entry Autoincrement cannot support self-checking
identifiers Think about these tradeoffs when deciding on identifier
data types
11
Schema design:PK generation – manual
Two other options: Manually assign key ranges
Cumbersome, but can work in some business environments
Most prone to data entry errors consider self-checking or alphanumeric
identifiers to reduce data entry problems Example: Canadian postal codes
e.g. N2L 3X2
12
Schema design:PK generation – key generation table
Create a separate “key generation” table, with one row per business object
To add a new key: Initiate a new connection Compute the next key using the existing one
as its basis Update the table, COMMIT immediately
Several disadvantages: logging, locking, possible contention
Avoid design that serializes lengthy transactions
13
Application Design
14
Application design:coding SQL statements
How you write a SQL statement does matter
Watch for join conditions involving user-defined functions, expressions, or type conversion
User defined functions that have queries in them tie the hands of the optimizer and can be inefficient
15
Application design:coding SQL statements
Simplify the query’s syntax if at all possible Select list aliases are useful to identify common
subexpressions (including subqueries) e.g. Select (X+10)/2 as quotient
Eliminate unnecessary predicates, DISTINCT processing, etc.
Rewrite the WHERE clause to contain as many conjunctive conditions as possible
Critical for 7.0 8.0 now automatically rewrites where clause
Don't replace outer joins with a subselect
16
Application design:programming considerations
PREPARE/DESCRIBE once during initialization (or on first use)
Remember to drop statements at termination - de-allocate statements with SQLFreeStmt()
Bind columns whenever possible use SQLBindCol() instead of SQLGetData()
Avoid COMMITing after every statement This is the default behavior for both JDBC and ODBC
Avoid DDL in applications (including TRUNCATE TABLE) to avoid implicit COMMITs or CHECKPOINTs
17
Application design:locking & blocking
Use cursor type appropriate to application function to reduce unnecessary locks
Use SQLSetStmtOption() to set cursor attributes
SQL_CONCURRENCY to read only SQL_CURSOR_TYPE to dynamic or forward-only
Use the BLOCKING option to specify whether or not an application blocks on a locking conflict, or receives an error
18
Application design:open … with hold
OPEN ... WITH HOLD only where appropriate All locks (except on the current row) are
released upon COMMIT; no guarantees about the state of the other rows
Semantics are unclear if ROLLBACK was issued Consider setting the option
ANSI_CLOSE_CURSORS_ON_ROLLBACK to force the closure of all cursors on a ROLLBACK statement
Usually there is little point to using WITH HOLD cursors at isolation levels 2 or 3
19
Application design:estimating result set size
Avoid doing so if at all possible Results will not be consistent in the face of
concurrent updates At OPEN, SQLCA (sqlerrd[2]) contains an
estimate of the result set size from the optimizer
Use SQLRowCount() in ODBC If positive, estimate is accurate at the time the
query was executed (i.e. single table scan) If negative, estimate is from the optimizer
20
Application design:estimating result set size
Use the ROW_COUNTS option to return an accurate result
For DYNAMIC cursors, query is executed twice Result may still change due to concurrent updates
Consider SCROLL or INSENSITIVE cursors instead Result is computed only once INSENSITIVE: result set size is fixed at OPEN SCROLL: perform a FETCH ABSOLUTE n where n is
“large enough” to force materialization of the entire result
sqlerrd[2] contains (n – result set size)
21
Application design:use of prefetch
Prefetch is designed to reduce communication in a client-server environment by transferring sets of rows to the client in advance of a FETCH request
Prefetch is ON by default To disable outright: use the
DisableMultiRowFetch connection parameter or set the Prefetch option to OFF
8.0 turns off prefetch on cursors declared sensitive
22
Application design:use of prefetch
Avoid the use of prefetch for updateable cursors at isolation level 1
In 7.0, cursor stability is not guaranteed, as the client and the server are positioned on different rows
In 8.0, new options to guarantee cursor stability
In ESQL, use BLOCK n to limit the number of rows prefetched for each FETCH request
If n is 0, prefetch is disabled
23
Application design:use of prefetch
Two new connection string parameters in 7.0.1: PreFetchRows and PreFetchBuffer
can specify prefetch row limit and prefetch buffer size, respectively, on a per-connection basis
Prefetch may decrease performance if: Application requires fewer rows than the
prefetched amount Application performs FETCH ABSOLUTE or scrolls
randomly through the rowset
24
Application design:bulk fetches
For relatively large result sets, use wide (multi-row) fetches
each request to the server obtains several rows
6.x, 7.x, and 8.0 prefetching automatically improves bulk fetch performance
Application gives up cursor stability if prefetching is used at isolation level 1(7.0 and earlier)
25
Application design:bulk inserts
Use LOAD TABLE where appropriate Use wide (multi-row) inserts COMMIT at regular intervals to limit size of
rollback log 7.x & 8.0 uses large I/O’s to grow database
and transaction log
26
Application design:improving performance
Objective is to reduce the work the engine is being asked to perform
Reduce number of requests sent to the server Use procedures for multi-part transactions Consider populating and then join to a temporary
table if needing to select multiple specific rows SELECT list should only contain attributes
required by the application Use declared temporary tables instead of
permanent tables to avoid logging, locking
27
Application design:improving performance
Java in the database Tradeoff how Java objects are fetched
often more flexible: fetch entire objects often more efficient: fetch individual object values
Minimize the need to deserialize Java columns use indexes to reduce the scan factor of tables
containing Java objects For OLTP workloads, minimize the number of rows that
need to be processed ensure that indexes can be used for ORDER BY,
GROUP BY and DISTINCT
28
Application design:improving scalability
Be aware of the length of time your application will hold a lock on any specific row or table
Request-level logging tool can assist in pinpointing performance problems
PERL script to analyze log output Can identify the most expensive SQL queries on a
server-wide basis Use TRANTEST to simulate production workload Use DBTRAN to review the transaction log to
identify unnecessary updates, places to use temporary tables
29
Application design:use of compression (new in 8.0)
Communication compression may improve between client and server over a modem or WAN:
Enable using Compress=YES in client connection string, or –pc server command line switch
Packets are compressed before encryption Compressed data can be less than 10% of original size,
but depends completely on data and the application Consider increasing packet size to achieve greater
compression and less number of packets Compression requires additional ~46K per connection You must analyze your application's throughput
30
Some Technical Details
31
Isolation levels
Isolation levels only affect behavior of read requests from other connections/transactions; writes are always locked
Isolation levels for read requests: 0 (default) - no locking; a latch ensures that the entire
row is consistent when retrieved from the disk page 1,2 - lock rows in the query’s result, but with level 1
the lock is held only while the cursor is on that row 3 - lock every row read and every insertion point
crossed during query execution Add/remove foreign key row requires a read lock on the
primary row
32
Isolation levels: recommendations
Use the isolation level that offers your application the best trade-off of consistency with concurrency
NB. nothing is guaranteed at level 0 (“dirty read”)
For isolation level 3, ensure the server can exploit the use of indexes to limit the amount of locking performed
Specify ISOLATION LEVEL on a cursor basis instead of modifying the option setting
33
Join semantics
ASA supports: INNER LEFT OUTER RIGHT OUTER CROSS (Cartesian product) FULL OUTER (new to 8.0)
Variants: KEY: generates FK-PK equijoin condition NATURAL: generates equijoin condition over
columns with identical names
34
Join semantics: some details
7.x release introduces restrictions on permitted syntax Restrictions on outer references in an ON condition New option: Extended_join_syntax
By default, table expressions are left-deep nested R Join S Join T = (R Join S) Join T
“Star” joins are specified by using the same table reference (correlation name) more than once
Select *
From R Left Outer Join S On (R.x = S.x),
R Left Outer Join T On (R.x = T.x and T.y = 25)
35
Join semantics: some details
Semantics of KEY joins are complex Verify that your query is producing the intended
result See documentation for details
36
Join semantics: new to 8.0
Support for FULL OUTER JOIN In addition to naïve nested loop join, 8.0
introduces new join operators nested block, hash, sort merge if no ORDER BY by clause specified, order of
results may vary depending on chosen plan; so don't make any assumptions
dbisql's query plan viewer can be used to visually display plan chosen for complex joins
See AM33: ASA Internals
37
Cursors
ESQL cursor types: no scroll, dynamic scroll (default), scroll,
insensitive sensitive (new to 8.0)
ODBC cursor types static, dynamic, keyset, mixed, forward-only
(default)
38
Cursors semantics
Cursor semantics are dependent on: Membership sensitivity Value sensitivity Scrollability (forward only or scrollable) Updatability (read-only or updateable)
39
Cursors – membership sensitivity
Membership sensitivity: Insensitive: result rows are fixed at open; no
changes Repeatable: result rows will not change once
fetched Sensitive: result rows will change with respect
to concurrent inserts, deletes, and updates Asensitive: result rows may or may not change
depending on update activity and chosen plan
40
Cursors – some definitions
Value-sensitivity (VS): Insensitive: data values will not change once
fetched Sensitive: data values will change with respect
to concurrent updates Asensitive: data values may or may not
change depending on update activity and chosen plan
41
Cursor combinations
Row Membership
Asensitive Sensitive Repeatable
Insensitive
Values
A ODBC unspec.,ESQL dynamic
n/a ESQL scroll n/a
S n/a ODBC dynamic,ESQL sens.
n/a n/a
I n/a ODBC keyset n/a ODBC static,ESQL insens.
Cursor type can be altered by server to be more restrictive than what was requested
42
iAnywhere Solutions Highlights
• Ask the Experts - about Mobile & Wireless Solutions-Mezzanine Level Room 15B Mon./Tues. 11:30 am - 3:30 pm; Wed. 11:30 - 1:30;Thurs. 9 am - 12 noon-Exhibit Hall - Demo Center (truck) exhibit hall hours
• SIG (Special Interest Group)- Tuesday 5:30pm Mobile & Wireless SDCC, Upper level, Room 11
• Keynote - Enabling m-Business SolutionsWednesday 1:30 pm - 3:00 pm
• iAnywhere Solutions Developer Community-Excellent resource for commonly asked questions, newsgroups, bugfixes, newsletters, event listings - visit www.ianywhere.com/developer