oracle perf tune - intro
Post on 03-Jun-2018
8/12/2019 Oracle Perf Tune - Intro
Oracle Data Architecture
Introduction to Performance Tuning
Performance tuning focuses primarily on writing efficient SQL, allocating appropriate computing resources, and analyzing wait events and contention in the system. Unlike several other features of Oracle database management, performance tuning isn't a cut-and-dried subject with clear prescriptions and rules for every type of problem you may face. This is one area where your technical knowledge must be used together with constant experimentation and observation.
A Systematic Approach to Performance Tuning
It's important to follow a systematic approach to tuning database performance. Performance
problems commonly come to the fore only after a large number of users start working on a new
production database. The system seems fine during development and rigorous testing, but it
slows down to a crawl when it goes to production. This could be because the application isn't
easily scalable for a number of reasons.
The seeds of the future performance potential of your database are planted when you design
your database. You need to know the nature of the applications the database is going to
support.
The more you understand your application, the better you can prepare for it by creating your database with the right configuration parameters. If major mistakes were made during the design stage and the database is already built, you're left with tuning application code on one hand and the database resources such as memory, CPU, and I/O on the other. Oracle suggests a specific
design approach with the following steps:
1. Design the application correctly.
2. Tune the application SQL code.
3. Tune memory.
4. Tune I/O.
5. Tune contention and other issues.
Proactive & Reactive Performance Tuning
Although the preceding performance tuning steps suggest that you can follow the sequence in
an orderly fashion, the reality is completely different. Performance tuning is an iterative process,
not a sequential one where you start at the top and end up with a fully tuned database as the
product.
As a DBA, you may be involved in a new project from the outset, when you have just the
functional requirements. In this case, you have an opportunity to be involved in the tuning effort
from the beginning stages of the application, a phase that is somewhat misleadingly dubbed
proactive tuning by some. Alternatively, you may come in after the application has already been
designed and implemented, and is in production. In this case, your performance efforts are
categorized as reactive performance tuning. What you can do to improve the performance of the
database depends on the stage at which you can have input, and on the nature of the
application itself.
Optimizing Oracle Query Processing
When a user starts a data-retrieval operation, the user's SQL statement goes through several sequential steps that together constitute query processing. One of the great benefits of using the SQL language is that it isn't a procedural language in which you have to specify the steps to be followed to achieve the statement's goal. In other words, you don't have to state how to do something; rather, you just state what you need from the database.
Query processing is the transformation of your SQL statement into an efficient execution plan
to return the requested data from the database. Query optimization is the process of choosing
the most efficient execution plan. The goal is to achieve the result with the least cost in terms of
resource usage. Resources include the I/O and CPU usage on the server where your database
is running. This also means that the goal is to reduce the total execution time of the query,
which is simply the sum of the execution times of all the component operations of the query.
This optimization of throughput may not be the same as minimizing response time. If you want
to minimize the time it takes to get the first n rows of a query instead of the entire output of the
query, the Optimizer may choose a different plan. If you choose to minimize the response time
for all the query data, you may also choose to parallelize the operation.
A user's SQL statement goes through the parsing, optimizing, and execution stages. If the
SQL statement is a query, data has to be retrieved, so there's an additional fetch stage before
the SQL statement processing is complete.
Parsing
Parsing primarily consists of checking the syntax and semantics of the SQL statement. The end product of the parse stage of query compilation is the creation of the parse tree, which represents the query's structure.
The SQL statement is decomposed into a relational algebra query that's analyzed to see whether it's syntactically correct. The query then undergoes semantic checking. The data dictionary is consulted to ensure that the tables and the individual columns that are referenced
in the query do exist, as well as all the required object privileges. In addition, the column types are checked to ensure that the data matches the column definitions. The statement is normalized so it can be processed more efficiently. The query is rejected if it is incorrectly formulated. Once the parse tree passes all the syntactic and semantic checks, it's considered a valid parse tree, and it's sent to the logical query plan generation stage. All these operations take place in the
library cache portion of the SGA.
Optimization
During the optimization phase, Oracle uses its optimizer, which is a cost-based optimizer (CBO), to choose the best access method for retrieving data for the tables and indexes referred to in the query. Using the statistics you provide and any hints specified in the SQL queries, the CBO produces an optimal execution plan for the SQL statement.
Execution Plan Generation Phase
During this phase, Oracle transforms the logical query plan into a physical query plan. The
Optimizer may be faced with a choice of several algorithms to resolve a query. It needs to
choose the most efficient algorithm to answer a query, and it needs to determine the most
efficient way to implement the operations. In addition to deciding on the best operational steps,
the Optimizer determines the order in which it will perform these steps. For example, the
Optimizer may decide that a join between table A and table B is called for. It then needs to
decide on the type of join and the order in which it will perform the table join.
The physical query or execution plan takes into account the following factors:
The various operations (for example, joins) to be performed during the query
The order in which the operations are performed
The algorithm to be used for performing each operation
The best way to retrieve data from disk or memory
The Optimizer may generate several valid physical query plans, all of which are potential
execution plans. The Optimizer then chooses among them by estimating the cost of each
possible physical plan based on the table and index statistics available to it, and selecting the
plan with the lowest estimated cost. This evaluation of the possible physical query plans is
called cost-based query optimization. The cost of executing a plan is directly proportional to the
amount of resources such as I/O, memory, and CPU necessary to execute the proposed plan.
The Optimizer passes this low-cost physical query plan to Oracle's query execution engine.
An Example of Cost Optimization
Let's say you want to run the following query, which seeks to find all the supervisors who work in Dallas. The query looks like this:
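The query itself did not survive in this transcript; here is a sketch consistent with the tables and columns named in the cost discussion that follows (employee, dept, dept_no, job, and city come from that discussion; the selected column lname is an assumption):

```sql
SELECT e.lname
FROM   employee e,
       dept d
WHERE  e.dept_no = d.dept_no   -- join the two tables on dept_no
AND    e.job     = 'Supervisor'
AND    d.city    = 'Dallas';
```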
Now, you have several ways to arrive at the list of the supervisors. Let's consider three ways to arrive at this list, and compute the cost of accessing the results in each of the three ways.
Make the following simplifying assumptions for the cost computations:
You can only read and write data one row at a time (in the real world, you do I/O at the block
level, not the row level).
The database writes each intermediate step to disk (again, this may not be the case in the
real world).
No indexes are on the tables.
The employee table has 2,000 rows.
The dept table has 40 rows. The number of supervisors is also 40 (one for each department).
Ten departments are in the city of Dallas.
In the following sections, you'll see three different queries that retrieve the same data, but that use different access methods. For each query, a crude cost is calculated, so you can compare how the three queries stack up in terms of resource cost. The first query uses a Cartesian join.
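The Query 1 section was lost in this transcript. A hedged sketch of what the Cartesian-join version would look like, with its cost computed under the same simplifying assumptions (the arithmetic follows directly from the row counts given above):

```sql
-- Query 1 (sketch): a Cartesian join -- no join condition between the tables,
-- so every employee row is paired with every dept row before filtering
SELECT *
FROM   employee e,
       dept d
WHERE  e.job  = 'Supervisor'
AND    d.city = 'Dallas';

-- Cost under the stated assumptions:
--   Reading all rows of both tables:   2,000 + 40 =  2,040 reads
--   Writing the Cartesian product:     2,000 * 40 = 80,000 writes
--   Reading the product back to filter it:          80,000 reads
--   Total I/O cost: 2,040 + 80,000 + 80,000 = 162,040
```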
Query 2: A Join of Two Tables
The second query uses a join of the employee and dept tables. First, join the employee and dept tables on the dept_no column. From this join, select all rows where e.job = 'Supervisor' and city = 'Dallas'.
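The steps just described can be sketched in SQL (column names assumed, as in the earlier query):

```sql
-- Query 2 (sketch): join the full tables first, then filter the join result
SELECT *
FROM   employee e,
       dept d
WHERE  e.dept_no = d.dept_no   -- join on dept_no
AND    e.job     = 'Supervisor'
AND    d.city    = 'Dallas';
```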
The following would be the total cost of performing the query:
Joining the employee and dept tables first requires a read of all the rows in both tables:
2,000 + 40 = 2,040
Creating the join of the employee and dept tables: 2,000 writes
Reading the join results costs: 2,000 reads
Total I/O cost: 2,040 + 2,000 + 2,000 = 6,040
Query 3: A Join of Reduced Relations
The third query also uses a join of the employee and dept tables, but not all the rows in the two tables; only selected rows from the two tables are joined. Here's how this query would proceed
to retrieve the needed data. First, read the employee table to get all supervisor rows. Next, read
the dept table to get all Dallas departments. Finally, join the rows you derived from the
employee and the dept tables.
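These steps can be sketched as a join of two pre-filtered row sources (column names assumed, as before):

```sql
-- Query 3 (sketch): reduce each table first, then join the small results
SELECT *
FROM   (SELECT * FROM employee WHERE job  = 'Supervisor') e,   -- 40 rows
       (SELECT * FROM dept     WHERE city = 'Dallas')     d    -- 10 rows
WHERE  e.dept_no = d.dept_no;
```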
The following would be the total cost of performing the query:
Reading the employee table to get the supervisor rows: 2,000 reads
Writing the supervisor rows derived in the previous step: 40 writes
Reading the dept table to get all Dallas departments: 40 reads
Writing the Dallas department rows derived from the previous step: 10 writes
Joining the supervisor rows and department rows derived in the previous steps of this query
execution: A total of 40 + 10 = 50 writes
Reading the join result from the previous step: 50 reads
Total I/O cost: 2,000 + 2(40) + 10 + 2(50) = 2,190
This example, simplified as it may be, shows you that Cartesian products are more expensive
than more restrictive joins. Even a selective join operation, the results show, is more expensive
than a selection operation. Although a join operation is in query 3, it's a join of two reduced
relations; the size of the join is much smaller than the join in query 2. Query optimization often
involves early selection (picking only some rows) and projection (picking only some columns)
operations to reduce the size of the resulting outputs or row sources.
Query Execution
During the final stage of query processing, the optimized query (the physical query plan that has
been selected) is executed. If it's a SELECT statement, the rows are returned to the user. If it's
an INSERT, UPDATE, or DELETE statement, the rows are modified. The SQL execution engine
takes the execution plan provided by the optimization phase and executes it.
Of the three steps involved in SQL statement processing, the optimization process is the
crucial one because it determines the all-important question of how fast your data will be
retrieved.
Understanding how the Optimizer works is at the heart of query optimization. It's important to
know what the common access methods, join methods, and join orders are in order to write
efficient SQL. The next section presents a detailed discussion of the all-powerful Oracle CBO.
Query Optimization and the Oracle
Cost-Based Optimizer
In most cases, you have multiple ways to execute a SQL query. You can get the same results
from doing a full table scan or using an index. You can also retrieve the same data by accessing
the tables and indexes in a different order. The job of the Optimizer is to find the optimal or best
plan to execute your DML statements such as SELECT, INSERT, UPDATE, and DELETE.
Oracle uses the CBO to help determine efficient methods to execute queries.
The CBO uses statistics on tables and indexes, the order of tables and columns in the SQL
statements, available indexes, and any user-supplied access hints to pick the most efficient way
to access them. The most efficient way, according to the CBO, is the least costly access
method, cost being defined in terms of the I/O and the CPU expended in retrieving the rows.
Accessing the necessary rows means Oracle reads the database blocks on the file system into
the buffer pool. The resulting I/O cost is the most expensive part of SQL statement execution
because it involves reading from the disk. You can examine these access paths by using tools
such as the EXPLAIN PLAN.
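For example, you can generate and display the access path for a statement along these lines (the table and predicate are illustrative; DBMS_XPLAN.DISPLAY is the standard way to show the plan):

```sql
EXPLAIN PLAN FOR
  SELECT * FROM employee WHERE job = 'Supervisor';

-- Display the plan just produced from the plan table
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```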
Choosing Your Optimization Mode
In older versions of Oracle, you had a choice between a rule-based and a cost-based optimizer.
In the rule-based approach, all the access paths were assigned a rank, and the path with the
lowest rank was chosen. The operations with a lower rank usually executed faster than those
with a higher rank. For example, a query that uses the ROWID to search for a row has a cost of
1. This is expected because identifying a row with the help of the ROWID, an Oracle pointer-like
mechanism, is the fastest way to locate a row. On the other hand, a query that uses a full table
scan has a cost of 19, the highest possible cost under rule-based optimization. The CBO
method almost always performs better than the older rule-based approach because, among
other things, it takes into account the latest statistics about the database objects.
Providing Statistics to the Optimizer
In Oracle Database 10g, the database itself automatically collects the necessary Optimizer
statistics. Every night, the database schedules a statistics collection job during the maintenance
window of the Oracle Scheduler. The maintenance window, by default, extends from 10 PM to 6
AM on weekdays and all weekend as well. The job is named GATHER_STATS_JOB, and runs
by default in every Oracle Database 10g database. You have the ability to disable the
GATHER_STATS_JOB if you wish.
You can get details about this default GATHER_STATS_JOB by querying the
DBA_SCHEDULER_JOBS view.
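For example (a sketch; which columns you select is up to you):

```sql
SELECT job_name, enabled, last_start_date
FROM   dba_scheduler_jobs
WHERE  job_name = 'GATHER_STATS_JOB';
```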
The GATHER_STATS_JOB collects statistics for all tables that either don't have Optimizer statistics or have stale (outdated) statistics. Oracle considers an object's statistics stale if more than 10 percent of its data has changed since the last time it collected statistics for that object.
By default, Oracle monitors all DML changes such as inserts, updates, and deletes made to all
database objects. You can also view the information about these changes in the
DBA_TAB_MODIFICATIONS view. Based on this default object monitoring, Oracle decides
whether to collect new statistics for an object.
To check that the GATHER_STATS_JOB is indeed collecting statistics on a regular basis, use
the following:
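The query and its output were not captured in this transcript; a hedged substitute that shows when each table was last analyzed and with what sample size:

```sql
SELECT table_name, last_analyzed, sample_size, num_rows
FROM   dba_tables
ORDER  BY last_analyzed;
```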
Note the following points about the preceding output:
The job collects statistics during the maintenance window of the database, which is, by
default, scheduled between 10 PM and 6 AM during weekdays and all weekend.
The statistics are collected by the nightly GATHER_STATS_JOB run by the Scheduler.
If a table is created that day, the job uses all the rows of the table the first time it collects
statistics for the table.
The sampling percentage varies from less than 1 percent to 100 percent.
The size of the table and the percentage of the sample arent correlated.
The job doesn't collect statistics for all the tables each day.
If a table's data doesn't change after it's created, the job never collects statistics for it a second time.
Oracle determines the sample size for each object based on its internal algorithms; there is no standard sample size for all objects.
What Exactly Do the Statistics Include?
The Optimizer statistics include the following:
The number of rows in a table
The number of rows per database block
The average row length
The total number of database blocks in a table
The number of levels in each index
The number of leaf blocks in each index
The number of distinct values in each column of a table
Data distribution histograms
The number of distinct index keys
Cardinality (the number of rows with similar values for each column)
The minimum and maximum values for each column
System statistics, which include I/O characteristics of your system; and CPU statistics, which
include CPU speed and other related statistics
The key to the CBO's capability to pick the best possible query plan is its capability to correctly
estimate the cost of the individual operations of the query plan. These cost estimates are
derived from the knowledge about the I/O, CPU, and memory resources needed to execute
each operation based on the table and index statistics.
The database stores the Optimizer statistics that it collects in its data dictionary. The DBA_TAB_STATISTICS view shows Optimizer statistics for all the tables in your database. You can also see column statistics by querying the DBA_TAB_COL_STATISTICS view.
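For example (PERSONNEL is the table discussed below):

```sql
SELECT column_name, num_distinct
FROM   dba_tab_col_statistics
WHERE  table_name = 'PERSONNEL';
```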
As you can see, more than 22 million PERSON_ID numbers are in the PERSONNEL table.
However, there are only 7,281 distinct last names and 1,729 distinct first names. Of course, the
GENDER column has only two distinct values. The Optimizer takes into account these types of
information regarding your table data, before deciding on the best plan of execution for a SQL
statement that involves the table's columns.
Tip: Optimizer statistics include both object (table and index) statistics and system statistics. Without accurate system statistics, the Optimizer can't come up with valid cost estimates to evaluate alternate execution plans.
What Does the Optimizer Do?
The CBO performs several intricate steps to arrive at the optimal execution plan for a user's
query. The original SQL statement is most likely transformed, and the CBO evaluates
alternative access paths (for example, full-table or index-based scans). If table joins are
necessary, the Optimizer evaluates all possible join methods and join orders. The Optimizer
evaluates all the possibilities and arrives at the execution plan it deems the cheapest in terms of
total cost, which includes both I/O and CPU resource usage cost.
SQL Transformation
Oracle hardly ever executes your query in its original form. If the CBO determines that a
different SQL formulation will achieve the same results more efficiently, it transforms the
statement before executing it. A good example is where you submit a query with an OR
condition, and the CBO transforms it into a statement using UNION or UNION ALL. Or your
statement may include an index, but the CBO might transform the statement so it can do a full
table scan, which can be more efficient under some circumstances. In any case, it's good to remember that the query a user wishes to be executed may not be executed in the same form by Oracle, but the query's results are still the same. Here are some common transformations
performed by the Oracle CBO:
Transform IN into OR statements.
Transform OR into UNION or UNION ALL statements.
Transform noncorrelated nested selects into more efficient joins.
Transform outer joins into more efficient inner joins.
Transform complex subqueries into joins, semi joins, and anti joins.
Star transformation for data warehouse tables based on the star schema.
Transform BETWEEN to greater than or equal to and less than or equal to statements.
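As a simple illustration of the first transformation above (the table and values are illustrative):

```sql
-- What you submit:
SELECT * FROM employee WHERE dept_no IN (10, 20);

-- What the CBO may effectively execute instead:
SELECT * FROM employee WHERE dept_no = 10 OR dept_no = 20;
```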
Choosing the Access Path
Oracle can often access the same data through different paths. For each query, the Optimizer
evaluates all the available paths and picks the least expensive one in terms of resource usage.
If joins are involved, then the join order and the join method are evaluated to finally arrive at the
best execution plan. You'll take a brief look at the steps the Optimizer goes through before
deciding on its choice of execution path.
Full Table Scans
Oracle scans the entire table during a full table scan. Oracle reads each block in the table
sequentially, so the full table scan can be efficient if the DB_FILE_MULTIBLOCK_READ_COUNT
initialization parameter is set high enough. However, for large tables, full table scans are
inefficient in general.
Table Access by ROWID
Accessing a table by ROWID retrieves rows using unique ROWIDs. ROWIDs in Oracle specify
the exact location in the data file and the data block where the row resides, so ROWID access is
the fastest way to retrieve a row in Oracle. Often, Oracle obtains the ROWID through an index
scan of the table's indexes. Using these ROWIDs, Oracle swiftly fetches the rows.
Index Scans
An index stores two things: the column value of the column on which the index is based and the
ROWID of the rows in the table that contain that column value. An index scan retrieves data
from an index using the values of the index columns. If the query requests only the indexed
column values, Oracle will return those values. If the query requests other columns outside the
indexed column, Oracle will use the ROWIDs to get the rows of the table.
Choosing the Join Method
When you need to access data that's in two or more tables, Oracle joins the tables based on a
common column. However, there are several ways to join the row sets returned from the
execution plan steps. For each statement, Oracle evaluates the best join method based on the
statistics and the type of unique or primary keys on the tables. After Oracle has evaluated the
join methods, the CBO picks the join method with the least cost.
The following are the common join methods used by the CBO:
Nested-loop join: A nested-loop join involves the designation of one table as the driving table
(also called the outer table) in the join loop. The other table in the join is called the inner
table. Oracle fetches all the rows of the inner table for every row in the driving table.
Hash join: When you join two tables, Oracle uses the smaller table to build a hash table on
the join key. Oracle then searches the larger table and returns the joined rows from the hash
table.
Sort-merge join: If both the tables in a join are sorted on the join key, the sorted lists are
merged together.
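Should you need to, you can nudge the CBO toward a particular join method with documented Oracle hints (a sketch; the tables and aliases are illustrative):

```sql
-- Request a nested-loop join between employee and dept
SELECT /*+ USE_NL(e d) */ *
FROM   employee e, dept d
WHERE  e.dept_no = d.dept_no;

-- Similarly: /*+ USE_HASH(e d) */  requests a hash join,
--            /*+ USE_MERGE(e d) */ requests a sort-merge join.
```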
Choosing the Join Order
Once the Optimizer chooses the join method, it determines the order in which the tables are
joined. The goal of the Optimizer is always to join tables in such a way that the driving table
eliminates the largest number of rows. A query with four tables has a maximum of 4 factorial, or
24, possible ways in which the tables can be joined. Each such join order would lead to a
number of different execution plans, based on the available indexes and the access methods.
The search for an optimal join strategy could take a long time in a query with a large number of
tables, so Oracle depends on an adaptive search strategy to limit the time it takes to find the
best execution plan. An adaptive search strategy means that the time taken for optimization is
always a small percentage of the total time that is taken for execution of the query itself.
Drawbacks of the CBO
The CBO is systematic, but it isn't guaranteed to follow the same plan in similar cases. The CBO isn't always perfect, and you need to watch out for the following:
Application developers may know more than the CBO when it comes to choosing the best
access path. Application developers know the needs of the users, of which the CBO is
completely unaware. This could lead to a situation where the CBO may be optimizing
throughput, when the users would rather have a quick set of results on their screen. By
using hints such as FIRST_ROWS_n, you can overcome this drawback in the CBO.
The CBO depends enormously on correct statistics gathering. If the statistics are absent or
outdated, the Optimizer can make poor decisions.
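The FIRST_ROWS_n hint mentioned above is used like this (a sketch; the query and columns are illustrative):

```sql
-- Ask the Optimizer to minimize the time to return the first 10 rows
SELECT /*+ FIRST_ROWS(10) */ lname
FROM   employee
ORDER  BY hire_date DESC;
```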
Providing Statistics to the CBO
The CBO can follow optimal execution paths only if it has detailed knowledge of the database
objects. In Oracle Database 10g, the recommended way to provide these statistics is by letting
the database automatically collect statistics for you. This is known as the Automatic Optimizer
Statistics Collection feature. You can also manually provide statistics to the
Optimizer with the DBMS_STATS package.
Using DBMS_STATS to Collect Statistics
Although letting the database automatically collect Optimizer statistics is the recommended
approach under Oracle Database 10g, you can still manually collect Optimizer statistics using
the DBMS_STATS package.
Collecting the Statistics
The DBMS_STATS package has several procedures that let you collect data at different levels.
The main data collection procedures for database table and index data are as follows:
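The procedure list itself did not survive in this transcript; the standard DBMS_STATS collection procedures for table and index data are sketched below (the schema and object names are illustrative):

```sql
-- Collect statistics for a single table (cascade collects its indexes too)
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'HR',
    tabname => 'EMPLOYEE',
    cascade => TRUE);
END;
/

-- Collect statistics for a single index
BEGIN
  DBMS_STATS.GATHER_INDEX_STATS(ownname => 'HR', indname => 'EMPLOYEE_PK');
END;
/

-- Broader scopes are covered by GATHER_SCHEMA_STATS and GATHER_DATABASE_STATS.
```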
Frequency of Statistics Collection
Theoretically, if your data is static, you may only need to collect statistics once. If your database performs only a small amount of DML activity, you may collect statistics at relatively long intervals, say weekly or monthly. However, if your database objects go through constant change on a daily basis, you need to schedule the statistics collection jobs much more frequently, say
daily or even more often. One of the best Oracle Database 10g new features is that you can let
the database decide the frequency of statistics collection. Remember that the database bases
its statistics collection on whether the statistics are fresh or stale. Thus, you can relax and let
the database be the arbiter of how often to collect statistics.
What Happens When You Don't Have Statistics
You've seen how the Oracle database can automatically collect Optimizer statistics for you. You've also learned how to use the DBMS_STATS package to collect the statistics manually yourself. But what happens if you disable the automatic statistics collection process, or if you don't collect statistics in a timely fashion? Even with automatic statistics collection, under which
necessary statistics are collected on a nightly basis, you may have a situation where table data
is altered after the statistics collection process is over. In situations such as this, Oracle uses data, such as the number of blocks taken up by the table data and other ancillary information, to
figure out the Optimizer execution plan. You can also use the initialization parameter
OPTIMIZER_DYNAMIC_SAMPLING to let Oracle estimate Optimizer statistics on the fly, when
no statistics exist for a table, or when the statistics exist but are too old or otherwise unreliable.
Of course, sampling statistics dynamically would mean that the compile time for the SQL
statement involved would be longer. Oracle smartly figures out if the increased compile time is
worth it when it encounters objects without statistics. If it's worth it, Oracle will sample a portion of the object's data blocks to estimate statistics. You need to set the value of the
OPTIMIZER_DYNAMIC_SAMPLING initialization parameter to 2 or higher to enable dynamic
sampling of all unanalyzed tables. Because the default for this parameter is 2, dynamic
sampling is turned on by default in your database. Thus, you need not spend sleepless nights
worrying about objects with missing or outdated statistics. In any case, if you adhere to Oracle's
recommendation and use the Automatic Optimizer Statistics Collection feature, the
GATHER_STATS_JOB will automatically collect your databases statistics. The
GATHER_STATS_JOB is created at database creation time and is managed by the Oracle
Scheduler, which runs the job when the maintenance window is opened. By default, the
maintenance window opens every night from 10 PM to 6 AM, and all day on weekends. Oracle
will collect statistics for all objects that need them if you adopt the Automatic Optimizer Statistics
Collection feature. The feature is turned on by default in a new Oracle 10g database
or when you upgrade to the 10g release from an older version.
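The dynamic sampling parameter discussed above can be set at the system or session level; for example:

```sql
-- Enable dynamic sampling of all unanalyzed tables (2 is also the default)
ALTER SYSTEM SET OPTIMIZER_DYNAMIC_SAMPLING = 2;

-- Or for the current session only
ALTER SESSION SET OPTIMIZER_DYNAMIC_SAMPLING = 2;
```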