ibm software group | db2 curriculum course 1 query performance and optimizer specific type of issues...

53
IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: [email protected]

Upload: pearl-heath

Post on 05-Jan-2016

215 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Curriculum Course

1

Query Performance and Optimizer Specific Type of Issues

Speaker name: Anthony E. Reina

Email: [email protected]

Page 2: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

2

The IBM Software Accelerated Value Program delivers a proactive, cost-reducing, and productivity enhancing advisory service. The program pairs you with an assigned team who build a foundational understanding of your overall environment. Through that understanding, the trusted partnet works to facilitate faster deployment, lifecycle leadership, risk mitigation, and more by identifying ways to improve your environment, staff skill set, and processes.

http://www-01.ibm.com/software/support/acceleratedvalue/

IBM Software Accelerated Value Program

Page 3: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

3

High-Level OverviewOptimizer purpose: Chooses access plan for data manipulation language (DML) SQL statements.

Means of fulfilling the purpose: Calculates the execution cost of many alternative access plans, and then selects the

one with the minimal estimated cost. Bases calculations on catalog statistics. Measures cost in timerons:

An abstract unit of measure. Not directly reciprocal to actual elapsed time. Rough relative estimate of the resources (cost) required by the database

manager to execute an access plan. Access plan cost is cumulative: each operator cost is added to the next one.

Page 4: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

4

High-Level OverviewFactors influencing the optimizer:

Database design Indexes Tablespace configuration

Table and index statistics Basis of plan costs Generated and collected by the RUNSTATS command

Operating environment Influences plan costs Various configuration parameters

Optimization class (OPTLEVEL) Controls the level of modeling considered during compilation Influences SQL compilation time

Page 5: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

5

DB2 OPTIMIZER CODEPATH

Page 6: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

6

DB2 OPTIMIZER CODEPATH Parse Query: analyzes the query to validate the syntax.

Check Semantics: ensures that there are no inconsistencies among parts of the statement.

Rewrite Query: uses the global semantics that are stored in the Query Graph Model (QGM) to transform the query into a form that can be optimized more easily. The result is stored in the QGM, as well.

Pushdown Analysis (Federated DB only): recommends to the optimizer whether an operation can be remotely evaluated or pushed down at a data source.

Page 7: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

7

DB2 OPTIMIZER CODEPATH Optimize Access Plan: using the QGM as input, it generates

many alternative access plans, at the end it select the most cost effective plan for satisfying the query.

Remote SQL Generation (Federated DB only): creates an efficient SQL statement for operations that are performed by each remote data source based on the SQL dialect at that data source.

Generate Executable Code: uses the access plan and the QGM to create an executable access plan or section for the query.

Page 8: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

8

DB2EXFMT TOOLExplain purpose: Captures information about the access plan, optimizer inputs,

and environment of an SQL query.

Helps to understand how the SQL query is compiled and executed.

Shows how configuration parameter changes impact query performance.

Indispensible tool for query problem determination.

Page 9: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

9

DB2EXFMT TOOL

Explain information includes:

Sequence of operations to process the query Cost information Predicates and selectivity estimates for each predicate Statistics for all objects referenced in the SQL query at the time

the explain information is captured

Page 10: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

10

DB2EXFMT TOOL

Explain tables:

Must be created before the Explain can be invoked:

SYSINTSALLOBJECTS() procedure call sysproc.sysinstallobjects (‘EXPLAIN’,’C’,NULL,<schema name>)

EXPLAIN.DDL

db2 –tvf …/sqllib/misc/EXPLAIN.DDL

Must be created per schema/user or under SYSTOOLS schema

Capture access plan data when the Explain facility is activated

Page 11: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

11

DB2EXFMT TOOL

Explain tables:ADVICE_INDEXADVICE_INSTANCEADVICE_MQTADVICE_PARTITIONADVICE_TABLEADVICE_WORKLOADEXPLAIN_ACTUALS

EXPLAIN_ARGUMENTEXPLAIN_INSTANCEEXPLAIN_OBJECTEXPLAIN_OPERATOREXPLAIN_PREDICATEEXPLAIN_STATEMENTEXPLAIN_STREAM

Page 12: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

12

DB2EXFMT TOOLExplain tools:

db2expln Provides access plan information for one or more SQL query

packages Shows the actual implementation of the chosen access plan Does not show detailed optimizer information

db2exfmt Formats the contents of the explain tables Provides more detailed explain information than db2expln Shows clues as to why the optimizer has made particular

decisions

Page 13: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

13

DB2EXFMT TOOLdb2exfmt output sections:

1.Explain instance

2.Database context

3.Package context

4.Original statement

5.Optimized statement

6.Access plan

7.Extended diagnostic information

8.Plan details

9.Objects used in the access plan

Page 14: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

14

DB2EXFMT TOOLdb2exfmt output sections:

Explain instance: general information (e.g., version, time, schema, etc).

Database context: configuration parameters used during the query compilation.

Package context: query type and optimization level. Original statement: actual query text. Optimized statement: rewritten query text according to the

applied optimization techniques.

Page 15: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

15

DB2EXFMT TOOLExplain output sections (continued):

Access plan: graphical representation of the query execution; displays the order of operations along with rows returned, cost and I/O for each operator.

Extended diagnostic information: warnings, informational or error messages returned by the optimizer.

Plan details: detailed information for each operator including arguments, predicates, estimates (filter factor), and input/output streams.

Objects used in the access plan: any objects (table/index) used by the query, and/or the access plan is displayed on this are which includes statistical information.

Page 16: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

16

DB2EXFMT TOOL>>-db2exfmt--+-----+--+-------------+--+-------------+----------> '- -1-' '- -d--dbname-' '- -e--schema-' >--+--------+--+---------------------------+--+-----+-----------> '- -f--O-' | .-----------. | '- -l-' | V | | '- -g--+---+----+-------+-+-' '-x-' +-O-----+ +-I-----+ +-C-----+ '-+-T-+-' '-F-' .- -t-. >--+-----------+--+--------------+--+-------------+--+-----+----> '- -n--name-' '- -o--outfile-' '- -s--schema-' >--+-----------------------+--+----------------+----------------> '- -u--userID--password-' '- -w--timestamp-' >--+--------------+--+--------------+--+-----+----------------->< '- -#--sectnbr-' '- -v--srcvers-' '- -h-'

Page 17: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

17

DB2EXFMT TOOLSET CURRENT EXPLAIN MODE statement

• Changes the value of the CURRENT EXPLAIN MODE special register

• Compiler returns SQL0217W for a successful compilation of SQL run under the EXPLAIN mode

• Syntax:>>-SET CURRENT EXPLAIN MODE -+---+------------------------------> >--+-NO----------------------+--------------------------------->< +-YES---------------------+ +-EXPLAIN-----------------+ +-REOPT-------------------+ +-RECOMMEND INDEXES-------+ +-EVALUATE INDEXES--------+ +-RECOMMEND PARTITIONINGS-+ +-EVALUATE PARTITIONINGS--+ '-host-variable-----------'

Page 18: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

18

DB2 Cfg’s and Env. Registry Settings Tuning is usually done by making changes to one of the

configuration categories that influence the optimizer behaviour:

Database manager configuration

Database configuration

Query compiler registry variables

CURRENT QUERY OPTIMIZATION special register

Page 19: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

19

DB2 Cfg’s and Env. Registry SettingsDatabase manager configuration:

Parallelism (INTRA_PARALLEL): indicates whether intra-partition parallelism is enabled. When YES, some parts of query execution (e.g., index sort) can run in parallel within a single database partition.

CPU Speed (CPUSPEED): the optimizer uses the CPU speed (milliseconds per instruction) to estimate the cost of performing certain operations.

Communications speed (COMM_BANDWIDTH): the optimizer uses the value (megabytes per second) in this parameter to estimate the cost of performing certain operations between the database partition servers in a partitioned database environment.

Page 20: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

20

DB2 Cfg’s and Env. Registry SettingsDatabase configuration:

Buffer pools (BUFFPAGE): Determined by the buffpage parameter, if using buffpage as default

for one bufferpool, or a calculation based on the contents of syscat. bufferpools.

The number shown is the total number of bufferpool pages allocated for the database. The table below shows an example of how we arrive at the total of 5000 pages:

Bufferpool name Size

IBMDEAFULTBP 1000

BPLONG 4000

BPTEMP 1000

Total 5000

Page 21: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

21

DB2 Cfg’s and Env. Registry SettingsDatabase configuration (continued):

Sort heap size (SORTHEAP): Defines the maximum number of private memory pages to be used for private sorts or the maximum number of shared memory pages to be used for shared sorts.

Database heap size (DBHEAP): Contains metadata for tables, indexes, table spaces and bufferpools. There is one database heap per database.

Lock list size (LOCKLIST): Indicates the amount of storage allocated to the lock list. There is one lock list per database and it contains the locks held by all applications concurrently connected to the database.

Maximum lock list (MAXLOCKS): Defines a percentage of the lock list held by any one application that must be filled before the database manager performs lock escalation.

Page 22: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

22

DB2 Cfg’s and Env. Registry SettingsDatabase configuration (continued):

Average applications (AVG_APPLS): Used by the SQL optimizer to help estimate how much bufferpool will be available at run-time for the access plan chosen (since the bufferpool is shared by all active connections).

Optimization class (DFT_QUERYOPT): Tells the optimizer which level of optimization to use when compiling SQL query (default=3). The higher the level, the more complex algorithms are considered.

Query degree (DFT_DEGREE): Represents the degree of intra-partition parallelism for an SQL statement. If set to ANY, the optimizer is sensitive to the actual number of CPUs that are online. If set to ANY and intra_parallel is enabled, then the number of CPUs on test and production should be configured the same.

Page 23: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

23

DB2 Cfg’s and Env. Registry SettingsDatabase configuration (continued):

Number of frequent values retained (NUM_FREQVALUES): Allows you to specify the number of "most frequent values" that will be collected when the WITH DISTRIBUTION option is specified with the RUNSTATS command.

Number of quantiles retained (NUM_QUANTILES): Controls the number of quantiles that will be collected when the WITH DISTRIBUTION option is specified on the RUNSTATS command.

Page 24: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

24

DB2 Cfg’s and Env. Registry SettingsQuery compiler variables (common):

DB2_EXTENDED_OPTIMIZATION: Specifies whether or not the query optimizer uses optimization extensions to improve query performance.

DB2_SORT_AFTER_TQ: Determines how the optimizer works with directed table queues in a partitioned database environment when the receiving end requires the data to be sorted, and the number of receiving nodes is equal to the number of sending nodes.

DB2_INLIST_TO_NLJN: Causes the optimizer to favour nested loop joins to join the list of values, using the table that contributes the IN list as the inner table in the join.

* See the DB2 V9.7 Info Center for additional variables(search for “Query compiler variables”)

Page 25: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

25

DB2 Cfg’s and Env. Registry SettingsCURRENT QUERY OPTIMIZATION special register: Controls the query optimization class for dynamic SQL statements. Overrides the dft_queryopt database configuration parameter. Affects the succeeding SQL statements only for the current session. Assigned by the SET CURRENT QUERY OPTIMIZATION statement.

>>-SET--CURRENT--QUERY--OPTIMIZATION--+---+----------------->

>--+-0-------------+--------------------------------------->< +-1-------------+ +-2-------------+ +-3-------------+ +-5-------------+ +-7-------------+ +-9-------------+ '-host-variable-'

Page 26: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

26

CATALOG STATISTICS The optimizer is heavily influenced by statistical information

about the size of tables, indexes and statistical views.

The optimizer also uses statistics about the distribution of data in columns of tables, indexes and statistical views, if these columns are used to select rows or to join tables.

The optimizer uses distribution statistics in its calculations to estimate the costs of alternative access plans for each query.

Statistical information is stored in system catalog tables and is accessible via the SYSCAT schema views.

Page 27: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

27

CATALOG STATISTICS Statistical information used by the optimizer:

Cluster ratio of indexes (higher is better)Number of leaf pages in indexesNumber of table rows that overflow their original pagesNumber of filled and empty pages in a table (may indicate

if table or index reorganization is needed)

Catalog statistics are collected and updated by the RUNSTATS utility.

Page 28: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

28

CATALOG STATISTICSDB2LOOK utility:

Mimic option (-m) extracts catalog statistics for all or given tables.

Helps recreate optimizer behaviour in another database. DB2LOOK for a specific table:

db2look –d <dbname> -a –e –m –z <schema> -t <table1 table2 …> -o <outfile>

DB2LOOK for all tables in the database:db2look –d <dbname> -f –m –l –a –e –o <outfile>

Alternative method: query the catalog tables directly using SYSCAT schema views.

Page 29: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

29

CATALOG STATISTICSCatalog views

Value of -1 in the columns indicates no statistics collected. SYSCAT views are read-only; SYSSTAT views are updateable.

Catalog views Description

SYSCAT.TABLESSYSSTAT.TABLES

Table statistics

SYSCAT.COLUMNSSYSSTAT.COLUMNS

Column statistics

SYSCAT.COLGROUPSSYSSTAT.COLGROUPS

Multicolumn statistics

SYSCAT.COLGROUPDISTCOUNTSSYSSTAT.COLGROUPDISTCOUNTS

Multicolumn distribution statistics

Page 30: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

30

RUNSTATSRUNSTATS utility:

Updates statistics about the characteristics of a table and/or associated indexes, or statistical views.

For a table, should be run after frequent updates / inserts / deletes are made to the table or after reorganizing the table.

For a statistical view, RUNSTATS should be run when changes to underlying tables have substantially affected the rows returned by the view.

Page 31: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

31

RUNSTATSRUNSTATS best practices:

Use RUNSTATS on all tables and indexes frequently accessed by queries.

Use RUNSTATS with FREQUENT and QUANTILE value statistics for complex queries (only collected when using WITH DISTRIBUTION clause).

Rerun RUNSTATS after a significant change in the table data.

Page 32: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

32

RUNSTATSRUNSTATS best practices (continued):

Use RUNSTATS in the following situations:

After loading data and indexes have been created. After creating new index on a table. After running a REORG utility. After significant insert, update or delete activity. Before binding application programs. After executing REDISTRIBUTE DB PARTITION GROUP

command.

Page 33: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

33

RUNSTATSRUNSTATS examples (basic statistics):

Collect statistics for table only:

RUNSTATS ON TABLE <schema>.<tablename>

Collect statistics for indexes only on a given table:

RUNSTATS ON TABLE <schema>.<tablename> FOR INDEXES ALL

Collect statistics for both table and its indexes:

RUNSTATS ON TABLE <schema>.<tablename> AND INDEXES ALL

Page 34: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

34

RUNSTATSRUNSTATS examples (enhanced statistics):

Collect statistics for table including distribution statistics:

RUNSTATS ON TABLE <schema>.<tablename> WITH DISTRIBUTION

Collect statistics for indexes only on a given table including extended index statistics:

RUNSTATS ON TABLE <schema>.<tablename> FOR DETAILED INDEXES ALL

Collect statistics for table and its indexes including extended index and distribution statistics:

RUNSTATS ON TABLE <schema>.<tablename> WITH DISTRIBUTION AND DETAILED INDEXES ALL

Page 35: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality Estimates Cardinality (CARD)

Total number of rows in a table. The number of unique values in a column.

Filter factor (FF) The percentage of rows estimated by the optimizer to be

allowed to filter through a given predicate. Expressed as probability: 0 – 1.0 (0 to 100%).

Cardinality estimate The number of rows estimated by the optimizer to be

returned for a given predicate based on its filter factor. Calculated as Filter Factor x Cardinality.

35

Page 36: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality Estimates Filtering can occur with a local or join predicate

Movie_title = ‘RANGO’ (local predicate) a.c1 = b.c2 (join predicate)

COLCARD Number of distinct values in the column SYSCAT.COLUMNS

VALCOUNT Frequency with which the data value occurs in column SYSCAT.COLDIST

36

Page 37: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality Estimates

Calculating FF for equality predicates (c1=value)

No statistics available/collected FF = default value which is typically ¼ FF = 0.04 / 0.25

Only basic statistics collected FF = 1/COLCARD COLCARD is for the column of the predicate being calculated

With basic and distribution statistics collected FF = VALCOUNT / CARD

37

Page 38: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 1 - No statistics collected:

Query: SELECT movie_id

FROM movie

WHERE movie_genre=‘A’

Statistics: SYSSTAT.TABLE CARD = -1 SYSSTAT.COLUMNS COLCARD = -1 SYSSTAT.COLDIST no data

Filter Factor: 0.04

38

Page 39: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 1 - No statistics collected:

2) TBSCAN: (Table Scan)

Predicates:----------2) Sargable Predicate, Comparison Operator:

Equal (=) Subquery Input Required: No Filter Factor:

0.04

Predicate Text: -------------- (Q1.MOVIE_GENRE = 'A')

39

Page 40: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 2 - Only basic statistics collected:

Query: SELECT movie_idFROM movieWHERE movie_genre=‘A’

Statistics: RUNSTATS ON TABLE example.movie SYSSTAT.TABLE CARD = 146 SYSSTAT.COLUMNS COLCARD = 6 (for movie_genre col) SYSSTAT.COLDIST no data

Filter Factor: 1/COLCARD = 1/6 = 0.166667

40

Page 41: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 2 - Only basic statistics collected:

2) TBSCAN: (Table Scan)

Predicates:----------2) Sargable Predicate, Comparison Operator:

Equal (=) Subquery Input Required: No Filter Factor:

0.166667

Predicate Text: -------------- (Q1.MOVIE_GENRE = 'A')

41

Page 42: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 3 - Basic and distribution statistics collected:

Query: SELECT movie_idFROM movieWHERE movie_genre=‘A’

Statistics: RUNSTATS ON TABLE example.movie WITH DISTRIBUTION SYSSTAT.TABLE CARD = 146 SYSSTAT.COLUMNS COLCARD = 6 (for movie_genre col) SYSSTAT.COLDIST VALCOUNT = 72 (movie_genre=‘A’)

Filter Factor: VALCOUNT / CARD = 72/146 = 0.493151

42

Page 43: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

Cardinality EstimatesExample 3 - Basic and distribution statistics collected:

2) TBSCAN: (Table Scan)

Predicates:----------2) Sargable Predicate, Comparison Operator:

Equal (=) Subquery Input Required: No Filter Factor:

0.493151

Predicate Text: -------------- (Q1.MOVIE_GENRE = 'A')

43

Page 44: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

44

Statistical Views

Views with statistics that can be used to improve cardinality estimates for queries in which the view definition overlaps with the query definition.

A powerful feature as it provides the optimizer with accurate statistics for determining cardinality estimates for queries with complex sets of (possibly correlated) predicates involving one or more tables.

Page 45: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

45

Statistical Views Mechanism to provide the optimizer with more sophisticated

statistics that represent more complex relationships:Comparisons involving expressions:

price > MSRP + Dealer_markupRelationships spanning multiple tables:

product.name = 'Alloy wheels' and product.key = sales.product_key

Anything other than predicates involving independent attributes and simple comparison operations.

Page 46: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

46

Statistical Views Statistical views are able to represent these types of complex

relationships, because statistics are collected on the result set returned by the view, rather than the base tables reference by the view.

When a SQL query is compiled, the optimizer matches the SQL query to the available statistical views. When the optimizer computes cardinality estimates for intermediate result sets, it uses the statistics from the view to compute a better estimate.

Page 47: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

47

Statistical ViewsUsage:

Create a view:

CREATE VIEW <view name> AS (<select clause>) Enable the view for optimization using ALTER VIEW statement:

ALTER VIEW <view name> ENABLE QUERY OPTIMIZATION Issue the RUNSTATS command to populate system catalog

tables with statistics for the view:

RUNSTATS ON <schema>.<view name> <runstats clause>

Page 48: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

48

Optimizer Specific IssuesSQL0437W rc=1 Cause : This typically happens when there is not enough statement heap (STMTHEAP) available to complete full

dynamic join enumeration plan for the query.

Resolving the problem : This message does not necessarily mean the query will run nor there is a problem, but at the sametime

it does indicate that the query is potentially using a sub-optimal access plan.

In detail, the following are way on how to resolve it.1. If the execution of the query at the sametime is bad (when this warning message is returned) then increasing

the STMTHEAP would be the ideal solution, take note this may increase compile time.

2. If the execution time is not the issue but the warning message imposes a problem at application end, application change maybe required to avoid/ignore the warning message.

3. Binding the package with the ignore warning option can also be another solution. For example:

db2 bind <package name> blocking all grant public SQLWARN NO

4. If this is a cli/odbc application, using the IgnoreWarnList keyword in the db2cli.ini file can be used. For example :

IgnoreWarnList = “sqlstate1”, “sqlstate2”,….”

The sqlstate corresponding to SQL0437 is 01602. Hence: IgnoreWarnList=“01602”

Page 49: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

49

Optimizer Specific IssuesQuery resulting to trap/abnormal interruption Cause : For query executions resulting to trap the cause can range from a defect within the DB2 code to some limitation within DB2 or OS.

Resolving the problem : To begin resolving the issue, collecting the “db2support with optimizer data”, trap file, and identifying the condition for the trap to occur (if possible) would be the first action to take.

Succeeding actions can be take to determine the root cause and resolution of the issue…1. If possible recreate the issue either on the same environment or on a test, using a mimic database

or using the db2look data.

2. Review the db2diag.log entries and trap file generated relating to issue. This will provide information at which OPTIMIZER codepath the problem occurred.

3. Review the current fixpak listing from the IBM support fixpak site. Including IBM support knowledge base websites. To check for known issues.

4. If this is a known issue, either apply the workaround and/or the fixpak.

Page 50: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

50

Optimizer Specific IssuesQuery Performance/Incorrect results Cause : Query performance issue can occur from various reasons, some most common are as follows…

1. Defect from within the DB2 code.

2. Changes have occurred within DB2 configurations, DB2 environment variables, query, statistics collection, or data load. After a major release/change/maintenance window.

3. Data load/operation sequence have changed significantly.

4. Most times its cause by unexpected change within the environment.

5. Unknown issue which would require assistance from IBM DB2 support site.

6. Problem could either be during query compilation or runtime.

Resolving the problem :1. Determine the conditions that exist when the problem occurs.

2. Review the current fixpak listing from the IBM support fixpak site. Including IBM support knowledge base websites. To check for known issues.

4. If this is a known issue, either apply the workaround and/or the fixpak.

5. Compare previous OPTIMIZER related DB2 setting including access plans, to determine if there is a unexpected/accidental change within the environment.

6. Review the access plan and check if the statistics are up to date, including if the right options are used for the RUNSTATS command.

7. Use db2batch command to determine if performance issue is within query compilation or runtime.

8. For query performance issue caused by query compilation, try lowering OPTLEVEL as a workaround.

Page 51: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

51

Optimizer Specific IssuesOther Optimizer Specific IssuesCause : For other optimizer specific issue this includes…incorrect access plan usage, reorg, runstats, db2exfmt, etc…

Resolving the problem :1. Determine the conditions that exist when the problem occurs.

2. Review the current fixpak listing from the IBM support fixpak site. Including IBM support knowledge base websites. To check for known issues.

3. Collect the “db2support with optimizer data”. Usefull for trying to recreate the problem.

4. If this is a known issue, either apply the workaround and/or the fixpak.

5. Determine the conditions that exist when the problem occurs.

6. Take note of the sequence of operations when the problem occurs.

Page 52: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

52

Optimizer Specific IssuesStandard Data Collection for Optimizer Issues

Use as a standard data collection.

Can be used to recreate OPTIMIZER issues and for deep level PD/PSI.

Usefull for keeping historical information.

Link : http://www-01.ibm.com/support/docview.wss?rs=71&uid=swg21106550

Page 53: IBM Software Group | DB2 Curriculum Course 1 Query Performance and Optimizer Specific Type of Issues Speaker name: Anthony E. Reina Email: aereina@ca.ibm.com

IBM Software Group | DB2 Information Management Software | Curriculum Course

53

Usefull Links Info Center : Predicate Filter Factor

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z10.doc.perf%2Fsrc%2Ftpc%2Fdb2z_predicatefilterfactor.htm

Info Center : Scenario : Improving cardinality estimates using Statistical Views

http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=%2Fcom.ibm.db2z10.doc.perf%2Fsrc%2Ftpc%2Fdb2z_predicatefilterfactor.htm

IBM Technical – Developer Works : Comparing real-time cardinality to the optimizer cardinality estimates

http://www.ibm.com/developerworks/data/library/techarticle/dm-0512kapoor