Download - Performance Tuning for Developers and DBA
-
8/10/2019 Performance Tuning for Developers and DBA
Oct 5th 2009 4pm
Platform: z/OS
Kurt Struyf
Competence Partners
Session: E03
Practical SQL performance tuning,
for developers and DBA
-
Agenda
One SQL, one access path
Index, stage1, stage2
Sort impact
SQL examples of suboptimal coding and
its improvements
Access path fields in the plan_table
Other CPU saving techniques
-
Static SQL: One SQL = One access path
SELECT ... FROM ... WHERE NAME BETWEEN :HV1-LOW AND :HV1-HIGH
AND FIRSTNAME BETWEEN :HV2-LOW AND :HV2-HIGH
AND BIRTHDATE BETWEEN :HV3-LOW AND :HV3-HIGH
AND ZIPCODE BETWEEN :HV4-LOW AND :HV4-HIGH
Our table has 4 indexes:
IX1 on NAME
IX2 on FIRSTNAME
IX3 on BIRTHDATE
IX4 on ZIPCODE
AT BIND TIME
DB2 chooses IX3
AT RUN TIME
User only fills out a value for ZIPCODE
AT RUN TIME
DB2 uses IX3, which doesn't filter anything
ONE SQL = ONE access path
DB2 determines for each SQL statement the best way to resolve the query. The result
of this calculation is the access path. For a static SQL statement, this access path
is chosen at bind time. As a general rule, once DB2 has chosen it, DB2 won't change
this access strategy at execution time, even if other access path choices would have
been better at run time. This somewhat simplifies the truth, but it is accurate in
most cases.
The example on the slide illustrates this with a very simple static query.
Our table has 4 indexes. Our select has a BETWEEN on all 4 columns. If the user fills
out nothing for a column, its host variables hold the low value and the high value. If
the user provides a value, both host variables for that column hold the provided value.
At bind time, DB2 chooses IX3 as the best possible access path, given the parameters
known at that time.
If at execution time our user doesn't fill out a value for BIRTHDATE but does provide
a value for ZIPCODE, DB2 doesn't change its access path to IX4 but uses IX3, which
doesn't filter anything.
We'll explain more later; the purpose here is just to show that DB2 chooses one
access path and sticks to it. This access path can be cheap or expensive, but DB2
estimates that, within the parameters known at bind time, it is the cheapest.
-
Matching Columns
Matching columns is an indication of how well an index is used:
- more matching columns = better index use
- always start with the first index column
- on = and on one IN you can continue
Example: Index on (Name, Clientno, Salarycode)
Predicate                                      Matching
---------------------------------------------  --------
1. Name = 'Smith' AND Clientno = 20
   AND Salarycode = 56                         3
2. Name = 'Smith' OR Clientno > 20
   AND Salarycode = 56                         0
3. Name IN ('Smith', 'Doe') AND Clientno > 20
   AND Salarycode = 56                         2
4. Name IN ('Smith', 'Doe')
   AND Salarycode > 0                          1
5. Name <> 'Smith' AND Clientno = 56           0
6. Clientno = 56 AND Salarycode = 0            0
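One way to check the matching columns for predicates like these is to EXPLAIN them and read MATCHCOLS from the plan_table. The sketch below is only an illustration: the table CLIENT, the index name IX_CLIENT and the QUERYNO values are assumptions, and it presumes a PLAN_TABLE exists under the current SQLID. What DB2 actually reports depends on statistics and version.

```sql
-- Hypothetical objects: CLIENT and IX_CLIENT are illustration names only.
CREATE INDEX IX_CLIENT
    ON CLIENT (NAME, CLIENTNO, SALARYCODE);

-- Predicate 1 of the slide: equal predicates on all three index columns.
EXPLAIN PLAN SET QUERYNO = 9001 FOR
SELECT NAME, CLIENTNO, SALARYCODE
  FROM CLIENT
 WHERE NAME = 'Smith'
   AND CLIENTNO = 20
   AND SALARYCODE = 56;

-- Predicate 6 of the slide: no predicate on the first index column.
EXPLAIN PLAN SET QUERYNO = 9002 FOR
SELECT NAME, CLIENTNO, SALARYCODE
  FROM CLIENT
 WHERE CLIENTNO = 56
   AND SALARYCODE = 0;

-- Compare how the index is used in both cases.
SELECT QUERYNO, ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
  FROM PLAN_TABLE
 WHERE QUERYNO IN (9001, 9002)
 ORDER BY QUERYNO, QBLOCKNO, PLANNO;
```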
-
Stage 1
Keep it positive and simple, but no index!
= : equal to
> : larger than
>= : larger than or equal to
-
Stage 2
All the rest!!
All functions such as
SUBSTR
CONCAT
CHAR()
Mismatching data types
Colchar_6 = '1234567'
Host variable checking
AND :HV1 = 5
Decryption
Current date between col1 and col2
Sorting
(Diagram: inside DB2, the RDS handles stage 2; the DM handles stage 1 and the indexes.)
All functions require by definition more processing power than the data
manager is capable of providing, and so they are resolved in stage 2.
These functions also include any arithmetic on a column, such as adding or
subtracting.
Mismatching data types are a bit more complex. As a general rule of thumb, when
the data type of the host variable doesn't match the data type of the column, the
predicate is stage 2. This cuts it a bit short: it is more correct to say that the
predicate is stage 2 if the host variable is bigger than the column data type.
Many exceptions exist, but it is best to use the correct data type.
Host variable checking is done in stage 2; it should NEVER be done in SQL and
should always be done in COBOL.
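As an illustration of the note on functions, the sketch below rewrites two predicates so that the column stands alone on one side of the comparison. The table CLIENT and its columns are hypothetical, NAME is assumed to be a fixed-length CHAR column, and whether a given predicate ends up in stage 1 or stage 2 depends on the DB2 version, so the comments are assumptions to verify with EXPLAIN.

```sql
-- Function on the column: resolved in stage 2 according to the slide.
SELECT NAME, CLIENTNO
  FROM CLIENT
 WHERE SUBSTR(NAME, 1, 2) = 'SM';

-- Same rows, without a function on the column.
SELECT NAME, CLIENTNO
  FROM CLIENT
 WHERE NAME LIKE 'SM%';

-- Arithmetic on the column: also resolved in stage 2.
SELECT NAME, CLIENTNO
  FROM CLIENT
 WHERE SALARYCODE + 1 = 57;

-- Same rows, with the arithmetic moved to the constant side.
SELECT NAME, CLIENTNO
  FROM CLIENT
 WHERE SALARYCODE = 56;
```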
-
In COBOL checking = "stage 3"
NEVER (ab)use this!
All DB2 columns that
CAN be checked in SQL
SHOULD be checked in SQL
So BETTER
a stage 2 predicate than NO predicate
(Diagram: inside DB2, the RDS handles stage 2; the DM handles stage 1 and the indexes; checking IN COBOL happens outside DB2.)
Saying that stage 2 predicates are expensive doesn't mean that you should not use them.
If the only way to write the predicate on a COLUMN is as a stage 2 predicate,
you should write it as stage 2 and not pass the row on to COBOL to check it
there, which is obviously even more expensive. If such a thing as stage 3
existed, this would be it.
-
Index, Stage1, Stage2
(Diagram: inside DB2, the RDS handles stage 2; the DM handles stage 1 and the indexes.)
This time around you should understand this slide, and know that there are more and
less expensive ways of writing a query, depending on where DB2 can resolve its
WHERE predicates: how many rows are filtered out as early as possible (index) and
how many are carried on to stage 1 or even stage 2.
-
SQL processing
DM (stage1)
1) matching index predicates (when the index is accessed)
2) other indexable stage 1 predicates (index screening)
3) non indexable stage 1 predicates on index pages
4) stage 1 predicates on the data
5) rows passed to RDS
RDS (stage2)
1) stage 2 predicates
2) sort
Selected rows passed to the user
DB2 always resolves its WHERE predicates in the same manner.
First, it resolves the matching index predicates in the sequence of the index
columns.
Second, it resolves all the screening predicates in the index.
Third, it resolves all non-indexable WHERE predicates that are stage 1 and can
be resolved on the index pages.
Fourth, it resolves all stage 1 predicates on the data.
Then all stage 2 predicates are resolved, and lastly all returning rows are sorted.
-
Order of evaluating predicates
Within each of the above non index steps :
1) all equal predicates
2) all range predicates and col IS NOT NULL
3) all other predicates
Within each of the above sub-steps:
the order in which they appear
Within all the non-index steps of the previous slide, the same logic is followed.
E.g. step 4, stage 1 on the data pages:
First, DB2 resolves all equal predicates.
Second, all range predicates.
Third, all the rest (e.g. not equal to).
Within each sub-step, the order in the SQL statement is followed. That means that if
we have, for example, two equal predicates to resolve on the data pages,
DB2 takes their physical sequence in the SQL statement to determine the order in
which to resolve them.
We'll explain with a little example on the next slide.
-
Example
SELECT *
FROM MYTABLE
WHERE C1 > ?          1  index
AND :HV <> 5          6  stage2
AND C5 = ?            3  stage1
AND C4 = ?            4  stage1
AND DATE(C2) < ?      5  stage2
AND C3 = ?            2  index
ORDER BY C2           7  stage2
INDEX (C1, C3)
-
Select *
SELECT * should almost never be used
SELECT ONLY the COLUMNS that are needed!
Reasons:
Program maintenance
CPU cost per extra column
SORT file becomes bigger
Maybe not index-only
-
Select *
Even for:
where exists (select * ...)
Better: where exists (select 1 ...)
Select col5, ... where col5 = 'AB'
Better: Select 'AB', ... where col5 = 'AB'
Best: Select ... where col5 = 'AB'
Select col1, col2 ... order by col2
Better: Select col1 ... order by col2
if col2 is selected just for the order by
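Written out as complete statements, the three improvements on this slide might look like the sketch below. The tables CLIENT and ORDERS and their columns are hypothetical names chosen for illustration.

```sql
-- EXISTS only tests whether a row is found, so a constant is enough.
SELECT C.NAME
  FROM CLIENT C
 WHERE EXISTS (SELECT 1
                 FROM ORDERS O
                WHERE O.CLIENTNO = C.CLIENTNO);

-- COL5 is fixed by the predicate, so it does not need to be returned.
SELECT NAME, CLIENTNO
  FROM CLIENT
 WHERE COL5 = 'AB';

-- COL2 is needed only for sequencing, so it stays out of the select list.
SELECT COL1
  FROM CLIENT
 WHERE COL5 = 'AB'
 ORDER BY COL2;
```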
-
Other easy improvements
:hv between col1 and col2    ->  col1 <= :hv and col2 >= :hv
COL <> :hv                   ->  COL in ( , , , , , )
COL not 5
-
Other easy improvements
SELECT DISTINCT COL1, COL2, COUNT(C1)
FROM TABLE
WHERE
Always results in extra SORT
SELECT COL1, COL2, COUNT(C1)
FROM TABLE
WHERE
GROUP BY COL1, COL2
Same results, SORT can be avoided
V9
Before version 9, although logically alike, there was a clear difference between the
two queries.
Using a DISTINCT would always result in an extra sort, whereas the second query,
with adequate indexing, could avoid the sort.
For instance, an index on (COL1, COL2) would have avoided a sort in the second query.
Since version 9, a query with the DISTINCT clause can also avoid the extra sort.
Another important change is that since V9 an index on (COL2, COL1) can also be used
to avoid an extra sort. That of course means that the sequence of your result set
could change, and an ORDER BY clause should be included if you want to
guarantee the V8 sequence.
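A minimal sketch of that last point, assuming a hypothetical table MYTABLE with an index on (COL1, COL2) or on (COL2, COL1); the predicate on COL3 and the alias NUM_C1 are illustration only. The explicit ORDER BY makes the sequence of the result independent of the index DB2 picks.

```sql
SELECT COL1, COL2, COUNT(C1) AS NUM_C1
  FROM MYTABLE
 WHERE COL3 = 5
 GROUP BY COL1, COL2
 ORDER BY COL1, COL2;
```

If DB2 groups through an index on (COL2, COL1), the ORDER BY may bring a sort back; that is the price of a guaranteed sequence.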
-
More easy improvements
Col1 = 'A' or Col1 = 'B'         ->  Col1 in ('A', 'B')
Col1 >= :hv1 and Col1 <= :hv2    ->  Col1 between :hv1 and :hv2
Col1 = :hv1 or                   ->  Col1 >= :hv1 AND
Col1 > :hv1 and Col2 = :hv2          (col1 = :hv1 or
                                      col1 > :hv1 and col2 = :hv2)
Col1 = :hva (always 5)           ->  Col1 = 5
:hv = 5                          ->  IN COBOL !!!
-
Even more easy improvements
Col1 not between 10 and 50       ->  col1 < 10
                                     union all
                                     col1 > 50
Existence checking               ->  select 1
                                     from table
                                     where col1 = :hv
                                     fetch first 1 row only
Col1 not in ('A', 'B', 'C')      ->  if possible
                                     Col1 in (the rest)
                                     will be cheaper even
                                     when the list is bigger
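Spelled out as complete statements, the first two rewrites might look like this. MYTABLE, COL1 and COL2 are hypothetical names, and the literal 42 stands in for the host variable.

```sql
-- Instead of: WHERE COL1 NOT BETWEEN 10 AND 50
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 < 10
UNION ALL
SELECT COL1, COL2
  FROM MYTABLE
 WHERE COL1 > 50;

-- Existence check: stop after the first qualifying row.
SELECT 1
  FROM MYTABLE
 WHERE COL1 = 42
 FETCH FIRST 1 ROW ONLY;
```

The two branches of the UNION ALL cannot overlap, so it returns the same rows as the NOT BETWEEN form without any duplicate removal.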
-
Determine Access Path
Optimization Service Center
Newest generation of Visual Explain
Plan_table
See next slide
Might require some exercise
Not everything in it
DSN_STATEMNT_TABLE
Contains the cost columns
-
DB2 Plan_table
SELECT QBLOCKNO, PROGNAME, PLANNO, METHOD,
TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
FROM PLAN_TABLE WHERE QUERYNO = 30303
ORDER BY QBLOCKNO, PLANNO ;
QBLOCKNO  PROGNAME  PLANNO  METHOD  TNAME    ACCESSTYPE  MATCHCOLS  ACCESSNAME  INDEXONLY
1         DSNESM68  1       0       AATEHA1  I           2          AAX0EHA1    N
1         DSNESM68  2       1       AATEHB1  I           2          AAX0EHB1    N
1         DSNESM68  3       3                            0                      N
Qblockno: indicates the number of blocks necessary to resolve the query
General rule: more blocks = less performing
Progname: represents the program/package name
-
Access path: planno, method
Planno: the number of steps AND the sequence in which a query is resolved
General rule: more steps = less performing
Method: expresses what kind of access is done
0 : First access
1 : Nested Loop Join
3 : extra sort needed
Tname: table name to be accessed
Access type: how that data is accessed
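These fields can also be queried in bulk. The sketch below assumes a hypothetical package name MYPROG and a PLAN_TABLE under the current SQLID; it lists the steps that need an extra sort (METHOD = 3) or that use an index without any matching column.

```sql
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD,
       TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME
  FROM PLAN_TABLE
 WHERE PROGNAME = 'MYPROG'          -- hypothetical package name
   AND (METHOD = 3                  -- extra sort needed
        OR (ACCESSTYPE = 'I' AND MATCHCOLS = 0))
 ORDER BY QUERYNO, QBLOCKNO, PLANNO;
```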
-
DSN_STATEMNT_TABLE
Amongst others:
COST_CATEGORY:
A: Indicates that DB2 had enough information to make a cost estimate without using
default values.
B: Indicates that some condition exists for which DB2 was forced to use default
values.
PROCMS: The estimated processor cost, in milliseconds, for the SQL statement
PROCSU: The estimated processor cost, in service units, for the SQL statement
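A short sketch of reading these columns after an EXPLAIN. It assumes the statement table exists under the current SQLID (its DB2 name is DSN_STATEMNT_TABLE) and reuses QUERYNO 30303 from the plan_table example purely as an illustration.

```sql
SELECT QUERYNO, PROGNAME, STMT_TYPE,
       COST_CATEGORY, PROCMS, PROCSU, REASON
  FROM DSN_STATEMNT_TABLE
 WHERE QUERYNO = 30303;
```

When COST_CATEGORY is B, the PROCMS and PROCSU figures rest on default values, so they are probably better suited to comparing alternatives than to predicting actual CPU use.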
-
DSN_PREDICAT_TABLE
Contains all predicates and how they are
used.
Extremely useful for index design
Replaces the old spreadsheet
technique
-
Access Path Follow Up
(Flow diagram: identify every query using QUERYNO; new binds fill the specific plan_tables, which are inserted into the general plan_tables; changes in the plan_tables are transferred over the LAN to Excel and sent by email.)
It is also best to set up an automated way of following up your access path changes
and notifying your DBA and the responsible developers.
-
Multi Row Fetch
Technique to save up to 60% of DB2 CPU
Easy to use
Fetches a rowset into an array
Program can control size of rowset
!! due to compiler limits !!
elementary item: max. 16 MB
complete working storage: max. 128 MB
-
Multi Row Fetch
To be able to use this, the cursor should be
DECLAREd for rowset positioning, for example:
EXEC SQL
DECLARE cursor-name CURSOR
WITH ROWSET POSITIONING FOR
SELECT column1
, column2 FROM table-name;
END-EXEC
instead of
EXEC SQL
DECLARE cursor-name CURSOR FOR
SELECT column1
, column2 FROM table-name;
END-EXEC
Then you can FETCH multiple rows at a time from the cursor
-
Multi Row Fetch
On the FETCH statement
the amount of rows requested can be specified, for example:
EXEC SQL
FETCH NEXT ROWSET FROM cursor-name
FOR :rowset-size ROWS
INTO ...
END-EXEC
instead of
EXEC SQL
FETCH cursor-name
INTO ...
END-EXEC
The rowset size can be defined as a constant or a variable, for example:
01 rowset-size PIC S9(09) COMP-5.
-
Multi Row Fetch
Do not use single and multiple row fetch
for the same cursor in one program
Be aware of compiler limits
elementary item: max. 16 MB
complete working storage: max. 128 MB
Last FETCH on a rowset can be
incomplete
-
Multi Row Fetch
Performance results may differ:
< 5 rows : poor performance (worse than before)
10 - 100 rows : best performance
> 100 rows : no improvement anymore
Following data is based upon treatment of
1 million rows (in seconds CPU).
                             Via row   Via rowset   Gain on DB2 in CPU seconds
FETCH                        16        6            10 (-60%)
FETCH + UPDATE via row       76        66           10 (-15%)
FETCH + UPDATE via rowset    76        60           16 (-35%)
Performance results may differ depending on the amount of columns and their data
type, but mainly on the rowset size, as listed on the slide; above 100 rows the
result is the same as for 10 - 100 rows.
You gain 10 seconds of CPU per one million rows when using rowset pointers.
-
Sequences
Easy, fast and cheap way to generate
unique numbers if:
Holes are allowed
The order isn't important
Use: next value for yy.xxxxxxxx statement
BASIC SYNTAX:
CREATE SEQUENCE yy.xxxxxxxx
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
NO CYCLE
CACHE 200;
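A brief usage sketch for a sequence like this one. MYSCHEMA.ORDER_SEQ stands in for yy.xxxxxxxx, and the ORDERS table with its columns is hypothetical.

```sql
-- Draw the next number on its own.
SELECT NEXT VALUE FOR MYSCHEMA.ORDER_SEQ
  FROM SYSIBM.SYSDUMMY1;

-- Or draw it directly while inserting a row.
INSERT INTO ORDERS (ORDER_ID, CLIENTNO)
VALUES (NEXT VALUE FOR MYSCHEMA.ORDER_SEQ, 20);
```

With a cache, numbers that were preallocated but not yet handed out can be lost when DB2 stops, which is one source of the holes mentioned on the slide.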
-
Sequences
(Chart: effect of concurrency on elapsed time; duration against amount of jobs, 1 to 3, own table compared with sequence object.)
(Chart: effect on CPU usage; CPU against amount of jobs, 1 to 3, own table compared with sequence object.)
Better response times
Less CPU needed
-
Questions ?