Extreme Performance with Oracle Database 11g and In-Memory Parallel Execution
Jean-Pierre Dijcks, Data Warehouse Product Management
Agenda
• Introduction to Parallel Execution
• Automatic Degree Of Parallelism
• Parallel Statement Queuing
• In Memory Parallel Execution
• Summary
How Parallel Execution works
User connects to the database
User
Background process is spawned
When the user issues a parallel SQL statement, the background process becomes the Query Coordinator (QC)
The QC gets parallel servers from the global pool and distributes the work to them
Parallel servers are individual sessions that perform work in parallel. They are allocated from a pool of globally available parallel server processes and assigned to a given operation
Parallel servers communicate among themselves and with the QC using messages that are passed via memory buffers in the shared pool
Parallel Servers do the majority of the work
Query Coordinator
Parallel Execution Plan
ID  Operation                   Name       TQ     IN-OUT  PQ Distribution
0   SELECT STATEMENT
1    PX COORDINATOR
2     PX SEND QC (RANDOM)                  Q1,01  P->S
3      HASH JOIN                           Q1,01  PCWP
4       PX RECEIVE                         Q1,01  PCWP
5        PX SEND BROADCAST                 Q1,01  P->P    BROADCAST
6         PX BLOCK ITERATOR                Q1,01  PCWP
7          TABLE ACCESS FULL    CUSTOMERS  Q1,01  PCWP
8       PX BLOCK ITERATOR                  Q1,01  PCWP
9        TABLE ACCESS FULL      SALES      Q1,01  PCWP
SELECT c.cust_name, s.purchase_date, s.amount
FROM sales s, customers c
WHERE s.cust_id = c.cust_id;
Parallel Execution of a Query
SELECT c.cust_name, s.purchase_date, s.amount
FROM sales s, customers c
WHERE s.cust_id = c.cust_id;
[Diagram: Producers send rows to Consumers, which send results to the Query Coordinator]
Producers and Consumers in the execution plan
ID  Operation                   Name       TQ     IN-OUT  PQ Distribution
0   SELECT STATEMENT
1    PX COORDINATOR
2     PX SEND QC (RANDOM)                  Q1,02  P->S
3      HASH JOIN                           Q1,02  PCWP
4       PX RECEIVE                         Q1,02  PCWP
5        PX SEND HASH                      Q1,00  P->P
6         PX BLOCK ITERATOR                Q1,00  PCWP
7          TABLE ACCESS FULL    CUSTOMERS  Q1,00  PCWP
8       PX RECEIVE                         Q1,02  PCWP
9        PX SEND HASH                      Q1,01  P->P
10        PX BLOCK ITERATOR                Q1,01  PCWP
11         TABLE ACCESS FULL    SALES      Q1,01  PCWP
Parallel Execution of a Scan
• Data is divided into Granules – block range or partition
• Each Parallel Server is assigned one or more Granules
• No two Parallel Servers ever contend for the same Granule
• Granules are assigned so that the load is balanced across all Parallel Servers
• Dynamic Granules chosen by the optimizer
• Granule decision is visible in execution plan
[Diagram: full scan of the sales table split among parallel servers PQ 1, PQ 2, and PQ 3]
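The granule idea on this slide can be pictured as a shared work list from which each parallel server takes the next unclaimed piece. The sketch below is only an illustration under stated assumptions, not Oracle's implementation: the function names, the granule size, and the block counts are hypothetical, and taking turns on a queue stands in for servers requesting more work as they finish.

```python
from collections import deque


def make_granules(total_blocks, granule_size):
    """Split blocks [0, total_blocks) into contiguous (start, end) ranges."""
    return [(start, min(start + granule_size, total_blocks))
            for start in range(0, total_blocks, granule_size)]


def scan_with_granules(total_blocks, granule_size, server_names):
    """Hand each granule to exactly one server (hypothetical model).

    Servers take turns pulling the next granule from a shared queue,
    so no granule is ever given to two servers.
    """
    pending = deque(make_granules(total_blocks, granule_size))
    assigned = {name: [] for name in server_names}
    while pending:
        for name in server_names:
            if not pending:
                break
            assigned[name].append(pending.popleft())
    return assigned


if __name__ == "__main__":
    work = scan_with_granules(1000, 100, ["PQ 1", "PQ 2", "PQ 3"])
    for name, granules in work.items():
        print(name, granules)
```

Because granules are much smaller than the table, a server that finishes early simply picks up more of them, which is how the load evens out in this model.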
Enabling Parallel Execution
There are three ways to enable parallel execution:
1. Enable the table(s) for parallel execution:
alter table sales parallel;
alter table customers parallel;
2. Use a parallel hint:
select /*+ parallel(c) parallel(s) */
       c.state_province, sum(s.amount) revenue
from customers c, sales s
where s.customer_id = c.id
and s.purchase_date = to_date('01-JAN-2007','DD-MON-YYYY')
and c.country = 'United States'
group by c.state_province;
3. Force parallel query for the session:
alter session force parallel query;
Controlling Parallel Execution on RAC
Use RAC services. Create two services:
srvctl add service -d database_name -s ETL -r sid1,sid2
srvctl add service -d database_name -s AHOC -r sid3,sid4
[Diagram: cluster nodes divided between the ETL service and the Ad-Hoc query service]
Note: A new parameter, PARALLEL_FORCE_LOCAL (default FALSE), forces a parallel statement to run only on the node where the query was issued.
How could we enhance Parallel Execution?
Current Issues
• Difficult to determine ideal DOP for each table without manual tuning
• One DOP does not fit all queries touching an object
• Not enough PX server processes can result in a statement running serially
• Too many PX server processes can thrash the system
• Only uses IO resources
Solution
• Oracle automatically decides if a statement
– Executes in parallel or not and what DOP it will use
– Can execute immediately or will be queued
– Will take advantage of aggregated cluster memory or not
Automatic Degree of Parallelism
Business Requirement
• Parallelism is completely manual
– Tuning typically required to determine ideal DOP
– Generally reserved for well-defined workloads (large SQL)
– One DOP does not fit all queries touching an object
Solution
• Oracle automatically decides
– Whether a statement executes in parallel or not
– What DOP the statement will use
Automatic Degree of Parallelism (Auto DOP)
• Statements with estimated elapsed times less than the threshold run serially
• Statements with estimated elapsed times greater than the threshold are candidates for parallel execution
• The optimizer derives the DOP for the statement based on the resource requirements of all scan operations
• Applies to all types of statements: query, DML, or DDL
• Explain plan has been enhanced to show the DOP selected
How Auto Degree of Parallelism works
SQL statement
Statement is hard parsed and the optimizer determines the execution plan
If estimated time is less than the threshold: statement executes serially
If estimated time is greater than the threshold: optimizer determines the ideal DOP
Actual DOP = MIN(PARALLEL_DEGREE_LIMIT, ideal DOP)
Statement executes in parallel
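The decision flow on this slide reduces to a threshold check followed by a cap. The sketch below is a minimal model, not Oracle's optimizer code: the function name `choose_dop` and its arguments are hypothetical, the ideal DOP is taken as an input rather than derived, and treating an estimate exactly equal to the threshold as serial is an assumption the slide does not settle.

```python
def choose_dop(estimated_seconds, ideal_dop, degree_limit, min_time_threshold=10):
    """Return the DOP a statement would run with; 1 means serial.

    estimated_seconds  -- optimizer's estimated elapsed time (input to this model)
    ideal_dop          -- DOP the optimizer would like to use (input to this model)
    degree_limit       -- numeric stand-in for PARALLEL_DEGREE_LIMIT
    min_time_threshold -- stand-in for PARALLEL_MIN_TIME_THRESHOLD, in seconds
    """
    # Assumption: "equal to the threshold" is treated like "less than".
    if estimated_seconds <= min_time_threshold:
        return 1
    # Actual DOP = MIN(PARALLEL_DEGREE_LIMIT, ideal DOP)
    return min(degree_limit, ideal_dop)


if __name__ == "__main__":
    print(choose_dop(3, ideal_dop=16, degree_limit=32))    # short statement
    print(choose_dop(60, ideal_dop=16, degree_limit=32))   # ideal DOP below limit
    print(choose_dop(60, ideal_dop=64, degree_limit=32))   # ideal DOP capped
```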
Controlling Auto DOP
• Controlled by three init.ora parameters:
– PARALLEL_DEGREE_POLICY
  • Controls whether or not Auto DOP will be used
  • Default is MANUAL, which means no Auto DOP
  • Set to AUTO to enable Auto DOP
– PARALLEL_MIN_TIME_THRESHOLD
  • Controls which statements are candidates for parallelism
  • Default is 10 seconds
– PARALLEL_DEGREE_LIMIT
  • Controls the maximum DOP that can be used
  • Default is CPU, meaning the default DOP
Explain plan enhancement for Auto DOP
PLAN_TABLE_OUTPUT
Plan hash value: 2489314924
ID  Operation              Name    Rows   Bytes  Cost  Time      Pstart  Pstop
0   Select Statement               96000  9889   5     00:00:01
1    PX COORDINATOR
2     PX SEND QC (RANDOM)  :TQ100  96000  9889   5     00:00:01
3      PX BLOCK ITERATOR           96000  9889   5     00:00:01
4       Table Access Full  Sales   96000  9889   5     00:00:01  1       16
Note
- Computed Degree of Parallelism is 16 because of parallel threshold
Parallel Statement Queuing
Business Requirement
• With the introduction of Auto DOP
– More statements will run in parallel
– Possible to exhaust all parallel execution server processes
– Potential system thrashing due to too many processes
Solution
• Parallel Statement Queuing
– Oracle automatically decides if a statement can execute immediately or not
– Prevents serializing parallel queries when parallel servers are not available
– Prevents system thrashing
Parallel Statement Queuing
• When a parallel statement starts, Oracle checks whether PX servers are available
– Let it run if there are enough PX servers available
– Queue the statement if there are not enough PX servers available
• Monitors RAC-wide availability of PX servers
– Adaptive to dynamic environments
• Services
– The service your session belongs to determines the limits on queuing
• Cluster reconfiguration
– Queue is aware of nodes leaving and joining the cluster and adjusts the limits accordingly
How Parallel Statement Queuing Works
SQL statements
Statement is parsed and Oracle automatically determines the DOP
If enough parallel servers are available, execute immediately
If not enough parallel servers are available, queue the statement
FIFO Queue
When the required number of parallel servers becomes available, the first statement on the queue is dequeued and executed
[Diagram: statements labeled 8, 16, 32, 64, and 128 entering and leaving the FIFO queue]
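The queuing behavior on this slide can be tried out with a toy model. The sketch below is an illustration under stated assumptions, not Oracle's code: the class `StatementQueue` and its method names are hypothetical, each statement is described by the number of PX servers it needs, and a newly submitted statement never overtakes statements that are already waiting.

```python
from collections import deque


class StatementQueue:
    """Toy model of a strict FIFO statement queue (hypothetical, for illustration)."""

    def __init__(self, servers_target):
        self.servers_target = servers_target  # PX servers usable before queuing
        self.in_use = 0                       # PX servers currently held
        self.running = {}                     # statement name -> PX servers held
        self.waiting = deque()                # (name, servers_needed), arrival order

    def _start(self, name, servers_needed):
        self.running[name] = servers_needed
        self.in_use += servers_needed

    def submit(self, name, servers_needed):
        """Run the statement now if it fits and nothing is queued ahead of it."""
        fits = self.in_use + servers_needed <= self.servers_target
        if fits and not self.waiting:
            self._start(name, servers_needed)
            return "running"
        self.waiting.append((name, servers_needed))
        return "queued"

    def finish(self, name):
        """Release a statement's servers, then dequeue from the head only."""
        self.in_use -= self.running.pop(name)
        while self.waiting:
            head_name, head_needed = self.waiting[0]
            if self.in_use + head_needed > self.servers_target:
                break
            self.waiting.popleft()
            self._start(head_name, head_needed)


if __name__ == "__main__":
    q = StatementQueue(servers_target=64)
    for name, needed in [("A", 32), ("B", 16), ("C", 32), ("D", 8)]:
        print(name, q.submit(name, needed))
    q.finish("A")
    print(sorted(q.running), list(q.waiting))
```

In this run statement D would fit alongside A and B, but it still waits behind C because the queue is strictly first in, first out.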
Controlling Parallel Statement Queuing
• Enabled when PARALLEL_DEGREE_POLICY is set to AUTO
• The statement queue is enforced with a strict FIFO policy
• PARALLEL_SERVERS_TARGET indicates how many PX servers are available to run queries before queuing kicks in
– Default value is 4 X PARALLEL_THREADS_PER_CPU X CPU_COUNT
– This is a soft limit and does not replace PARALLEL_MAX_SERVERS
On an 8 CPU system:
CPU Count: 8
Parallel Server Target: 64 (PX servers 1-64 are available to run queries before queuing kicks in)
Parallel Max Server: 160 (total PX servers available)
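The default for the queuing threshold is plain arithmetic, so the 8 CPU figures can be checked directly. The sketch below assumes PARALLEL_THREADS_PER_CPU = 2, a value inferred from the slide's numbers (64 / (4 X 8) = 2) rather than stated on it; the function name is hypothetical, and the optional instance multiplier follows the formula given in the parameter table later in the deck.

```python
def default_parallel_servers_target(cpu_count, threads_per_cpu, active_instances=1):
    """Default queuing threshold: 4 x CPU_COUNT x PARALLEL_THREADS_PER_CPU x instances."""
    return 4 * cpu_count * threads_per_cpu * active_instances


if __name__ == "__main__":
    # Assumption: 2 threads per CPU, inferred from the 8 CPU example.
    print(default_parallel_servers_target(cpu_count=8, threads_per_cpu=2))
    # Same hardware on a hypothetical two-instance cluster.
    print(default_parallel_servers_target(cpu_count=8, threads_per_cpu=2,
                                          active_instances=2))
```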
Controlling Parallel Statement Queuing
• Two new hints
– To bypass parallel statement queuing:
SELECT /*+ NO_STATEMENT_QUEUING */ col1 FROM foo;
– To delay statement execution until resources are available, without having PARALLEL_DEGREE_POLICY set to AUTO:
SELECT /*+ STATEMENT_QUEUING */ col1 FROM foo;
• V$SQL_MONITOR has a new status value for SQL that is queued:
SELECT s.sql_id, s.sql_text
FROM v$sql_monitor m, v$sql s
WHERE m.status = 'QUEUED'
AND m.sql_id = s.sql_id;
• Two new wait events
– PX Queuing: statement queue
  • Indicates the first query in the queue
– enq: JX - SQL statement queue
  • All other queries in the queue wait on this enqueue
Monitoring Statement Queuing in EM
[Screenshot placeholder: EM SQL monitoring page]
Clock symbol indicates a queued statement
Click on the SQL ID for more info
In-Memory Parallel Execution
Business Requirement
• Traditionally Parallel Execution takes advantage of the IO capacity of a system
• Disk speeds are not keeping up with Moore's law while CPU and Memory are
Solution
• In-Memory Parallel Execution harnesses the memory capacity of the entire system
• Scans data nearly 10 X faster than scanning from disk
How In-Memory Parallel Execution Works
• Detect if the object fits in the aggregated buffer cache of the cluster
– If so, distribute and affinitize the blocks among the nodes and make PQ aware of the affinity
– If not, continue to bypass the buffer cache and read directly from disk
• Subsequent access to the object will be conducted only by PX servers on the node to which the data was affinitized
How In-Memory Parallel Execution Works in detail
• Decision to use the buffer cache is based on a set of heuristics including
– Ratio between buffer cache size and object size
– Frequency at which the object is accessed
– How much the object changes between accesses
• In RAC, fragments of the object are affinitized in the buffer cache on each of the active instances
– Affinity is based on FileNumber and ExtentNumber unless hash partitioned
– Automatically prevents multiple instances reading the same data from disk
– Only PX processes on the same RAC node can access each of the fragments of the object
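The affinity rule can be pictured as a deterministic mapping from an extent to exactly one instance. The slides say only that affinity is based on FileNumber and ExtentNumber; the mapping formula, function names, and instance names below are invented purely for illustration and do not describe Oracle's actual algorithm.

```python
def owning_instance(file_number, extent_number, instances):
    """Pick one instance for an extent.

    The arithmetic here is a made-up stand-in: any deterministic function
    of (file_number, extent_number) would serve the same purpose.
    """
    index = (file_number * 31 + extent_number) % len(instances)
    return instances[index]


def may_read_from_cache(server_instance, file_number, extent_number, instances):
    """A PX server reads only the fragments affinitized to its own node."""
    return server_instance == owning_instance(file_number, extent_number, instances)


if __name__ == "__main__":
    instances = ["inst1", "inst2", "inst3"]  # hypothetical instance names
    for extent in range(6):
        print("file 4, extent", extent, "->", owning_instance(4, extent, instances))
```

Because every extent maps to a single instance, no two instances in this model would read the same fragment from disk.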
How In-Memory Parallel Execution Works
SQL statement
Determine the size of the table being looked at
Table is extremely small: read into the buffer cache on any node
Table is a good candidate for In-Memory Parallel Execution: fragments of the table are read into each node's buffer cache, and only parallel servers on the same RAC node will access each fragment
Table is extremely large: always use direct read from disk
Controlling In-Memory Parallel Execution
• Controlled by PARALLEL_DEGREE_POLICY
– Active only when set to AUTO
– No way to turn it off
When to Introduce Auto Parallel Execution
PARALLEL_DEGREE_POLICY has three possible modes
• Manual
– As before, DBA must manually specify all aspects of parallelism
– No Auto DOP, Statement Queuing, or In-Memory Parallel Execution
– Useful for well-understood existing applications
• Limited
– Restricted Auto DOP for queries with tables decorated with default PARALLEL
– No Statement Queuing or In-Memory Parallel Execution
– Useful in a mixed-world environment when a limited number of statements would benefit from parallel execution
• Auto
– All qualified statements subject to executing in parallel
– Statements can be queued
– In-Memory PQ available
– Useful when deploying new applications in 11g that would benefit from parallel execution
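The three modes can be summarized as a small lookup of which features each one switches on. The sketch below only restates the slide as data; the dictionary layout, key names, and the helper `features_for` are hypothetical and are not an Oracle API.

```python
# Feature summary per PARALLEL_DEGREE_POLICY value, as described on the slide.
POLICY_FEATURES = {
    "MANUAL":  {"auto_dop": "none",       "statement_queuing": False, "in_memory_px": False},
    "LIMITED": {"auto_dop": "restricted", "statement_queuing": False, "in_memory_px": False},
    "AUTO":    {"auto_dop": "full",       "statement_queuing": True,  "in_memory_px": True},
}


def features_for(policy):
    """Look up the features for a policy name, ignoring case."""
    return POLICY_FEATURES[policy.upper()]


if __name__ == "__main__":
    for policy in ("manual", "limited", "auto"):
        print(policy, features_for(policy))
```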
New Parallel Execution Init.ora Parameters
Parameter                        Default Value                                                Description
PARALLEL_DEGREE_LIMIT            CPU                                                          Max DOP that can be selected with Auto DOP
PARALLEL_DEGREE_POLICY           MANUAL                                                       Specifies if Auto DOP, Queuing, and In-Memory PE will be enabled
PARALLEL_FORCE_LOCAL             FALSE                                                        Restricts parallel server processes to the node where the query is issued
PARALLEL_MIN_TIME_THRESHOLD      AUTO                                                         Specifies the min execution time a statement should have before Auto DOP will kick in
PARALLEL_SERVERS_TARGET          4 * CPU_COUNT * PARALLEL_THREADS_PER_CPU * ACTIVE_INSTANCES  Specifies the number of parallel processes allowed to run parallel statements before queuing will be used
PARALLEL_EXECUTION_MESSAGE_SIZE  16KB                                                         Specifies the size of the message buffers used for communication
For More Information
Search for "Parallel Execution" on search.oracle.com
Or
http://www.oracle.com/technology/products/bi/db/11g/pdf/twp_parallel_execution_fundamentals_11gr2.pdf