Oracle 12c New Features for Better Performance

Zohar Elkayam

www.realdbamagic.com

Twitter: @realmgic


Who am I?

• Zohar Elkayam, CTO at Brillix

• Programmer, DBA, team leader, database trainer, public speaker, and a senior consultant for over 19 years

• Oracle ACE Associate

• Member of ilOUG – Israel Oracle User Group

• Blogger – www.realdbamagic.com and www.ilDBA.co.il

http://brillix.co.il

About Brillix

• We offer complete, integrated end-to-end solutions based on best-of-breed innovations in database, security and big data technologies

• We provide complete end-to-end 24x7 expert remote database services

• We offer professional customized on-site trainings, delivered by our top-notch world recognized instructors


Some of Our Customers


Agenda

• Database In Memory (column store) – 12.1.0.2

• Oracle Database Sharding – 12.2.0.1

• Optimizer and Statistics changes – 12c


Our Goal for Today

• Getting to know some of the Oracle 12cR1 and 12cR2 new features around performance

• Not a lot of syntax today – mainly concepts

• Way too many slides, let’s try to catch ‘em all…


Oracle Database In-Memory (Column Store) 12.1.0.2


What is an In Memory Database?

• In-memory databases are management systems that keep data in non-persistent storage (RAM) for faster access

Examples:

• Aerospike

• SQLite

• MemcacheDB

• Oracle TimesTen and Oracle Coherence


What is a Column Store Database?

• Column store databases are management systems that keep data in a columnar format for better analysis of single-column data (e.g. aggregation). Data is saved and handled as columns instead of rows.

Examples:

• Apache Cassandra

• Apache HBase

• Apache Parquet

• Sybase IQ

• HP Vertica


How Are Records Organized?

• This is a logical table in RDBMS

• Its physical organization is just like the logical one: column by column, row by row


[Figure: a table laid out as Row 1 to Row 4 by Col 1 to Col 4]

Query Data

• When we query data, records are read in the order in which they are organized in the physical structure

• Even when we query a single column, we still need to read the entire table and extract the column

[Figure: the same table of Row 1 to Row 4 by Col 1 to Col 4, annotated with the queries SELECT Col2 FROM MyTable and SELECT * FROM MyTable]

How Do Column Stores Keep Data?

[Figure: organization in a row store vs. organization in a column store, for the query SELECT Col2 FROM MyTable]

Row Format vs. Column Format


In Memory Option Breakthrough

• In memory option introduces a dual format database

• Tables can be accessed in row format and in column format at the same time – the Optimizer is aware of the new format, so:

• OLTP continues to use the old row format

• Analytic queries start using the column format


Oracle In Memory Option

• Column data is in a pure in-memory format: it is non-persistent and requires no logging, archiving or backup

• Data changes are applied to both formats simultaneously, so data is consistent and current

• Application code requires no changes – just turn it on and start using it
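As a rough sketch of what "just turn it on" can look like in practice: the statements below size the column store, mark one table for in-memory population, and check the population status. The table name SALES and the 2G size are illustrative assumptions, not values from this deck.

```sql
-- Reserve memory for the In-Memory column store.
-- INMEMORY_SIZE is a static parameter in 12.1, so this takes effect after a restart.
-- 2G is an arbitrary example size.
ALTER SYSTEM SET inmemory_size = 2G SCOPE = SPFILE;

-- After the restart, mark a (hypothetical) table for population into the column store.
ALTER TABLE sales INMEMORY PRIORITY HIGH;

-- Check which segments have been populated so far.
SELECT segment_name, populate_status, inmemory_size, bytes_not_populated
FROM   v$im_segments;
```

Queries against SALES need no changes; the Optimizer decides per statement whether to read the row format or the column format.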


In Memory Option – Good To Know

• It is not an “In Memory Database” – it’s an accelerator for the regular database

• It is not a “Column Store Database” – column-organized data is non-persistent*

• In Memory Option requires more memory than the data you plan to load into memory: there is no LRU mechanism

• Not related to Oracle TimesTen or Oracle Coherence


Oracle Buffer Cache and Memory Management

• The Oracle buffer cache can keep data blocks in memory for optimization

• Blocks are removed from memory based on their usage (LRU)

• If the data is smaller than the available memory, we can use a new Oracle 12c feature: Full Database Caching


Full Database Caching

• Full Database Caching: implicit default and automatic mode in which an internal calculation determines if the database can be fully cached

• Force Full Database Caching: this mode requires the DBA to execute the ALTER DATABASE FORCE FULL DATABASE CACHING command

• Neither Full Database Caching nor Force Full Database Caching forces prefetch of data into memory
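A minimal sketch of switching to Force Full Database Caching from SQL*Plus, assuming a SYSDBA connection and a maintenance window (the command is issued while the database is mounted but not open):

```sql
-- SQL*Plus session connected AS SYSDBA.
SHUTDOWN IMMEDIATE
STARTUP MOUNT

-- Enable the forced mode while the database is mounted.
ALTER DATABASE FORCE FULL DATABASE CACHING;
ALTER DATABASE OPEN;

-- Verify the current mode (YES or NO).
SELECT force_full_db_caching FROM v$database;
```

The mode can be reverted the same way with ALTER DATABASE NO FORCE FULL DATABASE CACHING.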


What’s new In 12cR2?

• In memory support for Active Data Guard configuration

• In memory virtual columns and expressions

• In memory FastStart

• Automatic Data Optimization support for the In-Memory Column Store


Oracle Sharding 12.2.0.1


Scaling Databases

• Why would we want to scale our database?

• Performance

• Elasticity

• Global data distribution

• Possible solutions:

• Scaling up – adding more hardware

• Scaling out – the Oracle way, using RAC

• Scaling out using sharding


What Is Sharding?

• Sharding is a way of horizontal scaling (horizontal partitioning)

• Instead of scaling the database infrastructure, we scale out the data itself

• Not a new concept: MongoDB, Cassandra, MySQL…

• Starting with Oracle 12.2 we can use Sharded Database Architecture (SDA) as part of the Oracle Global Data Services (GDS) architecture

Global Data Services (GDS)


Sharded Database Architecture (SDA)

• Part of the Global Data Services (GDS) architecture

• Databases in the logical database don’t share any physical resources or clusterware software

• Databases can reside in different geo-locations

• Application must be compatible with sharded behavior


Benefits of Sharding

• Linear Scalability - eliminates performance bottlenecks and makes it possible to linearly scale performance by adding shards

• Fault Containment - sharding is a shared-nothing hardware infrastructure that eliminates single points of failure

• Geographical Distribution of Data - store data close to its users

• Rolling Upgrades - changes to one shard at a time do not affect other shards

• Simplicity of Cloud Deployment - supports on-premises, cloud, and hybrid deployment models


Why RDBMS Sharding?

• Unlike NoSQL sharding, Oracle shards still support:

• Relational schemas

• ACID transaction properties and read consistency

• SQL and other programmatic interfaces

• Complex data types

• Database partitioning

• Advanced security

• High Availability features

• And more…


The Big Picture


Sharding Methods

• We can use two methods of sharding data:

• Sharded tables: data exists in one shard

• Duplicated tables: data exists in all shards

[Figure: Server A holds a non-sharded database; the SDB, the sharded (logical) database, spans Server B, Server C and Server D as Shard 1, Shard 2 and Shard 3]

Example – Sharded Table Creation


CREATE SHARDED TABLE customers (
  cust_id  NUMBER NOT NULL,
  name     VARCHAR2(50),
  address  VARCHAR2(250),
  region   VARCHAR2(20),
  class    VARCHAR2(3),
  signup   DATE,
  CONSTRAINT cust_pk PRIMARY KEY (cust_id)
)
PARTITION BY CONSISTENT HASH (cust_id)
TABLESPACE SET ts1
PARTITIONS AUTO;

Example – Duplicated Table Creation


CREATE DUPLICATED TABLE Products (
  StockNo      NUMBER PRIMARY KEY,
  Description  VARCHAR2(20),
  Price        NUMBER(6,2)
);

Sharded Table Families

• We can shard multiple tables to the same database shard using table families

• All tables in a table family must have the same equi-partition sharding key:

• Using Reference partitions

• Using the PARENT clause


Example – Sharded Table Family Creation (REF)


CREATE SHARDED TABLE Customers (
  CustNo   NUMBER NOT NULL,
  Name     VARCHAR2(50),
  Address  VARCHAR2(250),
  CONSTRAINT RootPK PRIMARY KEY (CustNo)
)
PARTITION BY CONSISTENT HASH (CustNo)
PARTITIONS AUTO
TABLESPACE SET ts1;

CREATE SHARDED TABLE Orders (
  OrderNo    NUMBER NOT NULL,
  CustNo     NUMBER NOT NULL,
  OrderDate  DATE,
  CONSTRAINT OrderPK PRIMARY KEY (CustNo, OrderNo),
  CONSTRAINT CustFK FOREIGN KEY (CustNo) REFERENCES Customers(CustNo)
)
PARTITION BY REFERENCE (CustFK);

Example – Sharded Table Family Creation (PARENT)


CREATE SHARDED TABLE Customers (
  CustNo   NUMBER NOT NULL,
  Name     VARCHAR2(50),
  Address  VARCHAR2(250),
  region   VARCHAR2(20),
  class    VARCHAR2(3),
  signup   DATE
)
PARTITION BY CONSISTENT HASH (CustNo)
TABLESPACE SET ts1
PARTITIONS AUTO;

CREATE SHARDED TABLE Orders (
  OrderNo    NUMBER,
  CustNo     NUMBER,
  OrderDate  DATE
)
PARENT Customers
PARTITION BY CONSISTENT HASH (CustNo)
TABLESPACE SET ts1
PARTITIONS AUTO;

Non-Table Objects

• We can create non-table objects in the logical databases

• Schema objects: users, roles, views, indexes, synonyms, functions, procedures, and packages

• Non-schema objects: tablespaces, tablespace sets, directories, and contexts

• Objects will be created on all shards


DDL Execution

• The application schema name and all object names must be identical on all shards

• DDL on a sharded table must be run from the shard catalog database or through the GDS command line tool (GDSCTL)

• Changes are automatically propagated to all shards


SQL> CONNECT SYS@SH_CATALOG
SQL> ALTER SESSION ENABLE SHARD DDL;
SQL> CREATE USER <app_name>...
SQL> GRANT CREATE TABLE TO <app_name>...
SQL> CREATE DUPLICATED TABLE <name>...
SQL> CREATE SHARDED TABLE <name>...

GDSCTL> sql "CREATE USER ..."
GDSCTL> sql "CREATE TABLESPACE SET ..."

Sharding Physical Structure

• Physical data distribution is based on chunks – each chunk is one table partition

• Each chunk is located in a different tablespace

• Tablespaces are defined using tablespace sets (tablespace templates)
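A minimal sketch of defining a tablespace set, assuming it is run on the shard catalog with shard DDL enabled. The name ts1 matches the table examples in this deck; the file sizes are arbitrary assumptions.

```sql
-- Run on the shard catalog so the DDL is propagated to all shards.
ALTER SESSION ENABLE SHARD DDL;

-- The template describes each tablespace in the set; one tablespace is created per chunk.
CREATE TABLESPACE SET ts1
  USING TEMPLATE (
    DATAFILE SIZE 100M
    AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
    EXTENT MANAGEMENT LOCAL
    SEGMENT SPACE MANAGEMENT AUTO
  );
```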


Resharding and Hotspots Handling

• Adding/removing shards or hotspot elimination requires chunk movement (automatically or manually)

• This will generate an RMAN backup, restore and recovery of the chunk (tablespace) on the new node. The old chunk is automatically removed once done.

• We can also split hotspots using the GDSCTL split command


GDSCTL> MOVE CHUNK -CHUNK 12 -SOURCE sh01 -TARGET sh12

GDSCTL> SPLIT CHUNK -CHUNK 12

Sharding High Availability

• Data replication with Data Guard is a crucial component in an SDB environment

• High availability, disaster recovery, read offloading

• Replication deployment performed fully automatically

• The logical unit of data replication is a shardgroup


High Availability Setup Example


GDSCTL> create shardcatalog -database shdard01:1521:repo -chunks 12 -user mygdsadmin/<pwd> -sdb sharddb -region london,amsterdam -repl DG -sharding system -protectmode maxavailability
...
GDSCTL> add shardgroup -shardgroup shardgrp1 -deploy_as primary -region london
GDSCTL> add shardgroup -shardgroup shardgrp2 -deploy_as active_standby -region london
GDSCTL> add shardgroup -shardgroup shardgrp3 -deploy_as active_standby -region amsterdam

Session Routing (single shard)

• Application must be compatible with sharding architecture

• When connecting to the database, the application must provide the sharding key (and super key) to the connection

• All SQL operations in this session are related to the specified sharding key (shard)

• To work on another sharding key value, the application needs to create a new session


Statement Routing/Cross-Shard Query

• A client connection to the Coordinator (Catalog) database is required

• No sharding key is necessary in the connect descriptor

• Cross-shard SQL statements are executed via DB links to the shards

• Partition and shard pruning


Optimizer Changes and Adaptive Query Optimization 12.1.0.2 + 12.2.0.1


Adaptive Query Optimization


[Figure: Adaptive Query Optimization branches into Adaptive Plans (join methods, parallel distribution methods) and Adaptive Statistics (at compile time, at run time)]

Adaptive Execution Plans (12.1)

• Allows the Optimizer to make runtime adjustments to execution plans and to discover additional information that can lead to better statistics

• Good SQL execution without intervention

• Final plan decision is based on rows seen during execution

• Bad effects of skew eliminated


Adaptive Execution Plans: Join Methods

• Join method decision deferred until runtime

• Default plan is computed using available statistics

• Alternate sub-plans are pre-computed and stored in the cursor

• Statistic collectors are inserted at key points in the plan

• Final decision is based on statistics collected during execution

• The alternative sub-plans are a nested loops join and a hash join: either one can be the default, with the other as the alternate


Displaying the Default Plan

• The EXPLAIN PLAN command always shows the default plan

• The example shows a nested loops join as the default plan

• No statistics collector is shown in the plan
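A minimal sketch of displaying the default plan. The Customers and Orders tables are borrowed from the sharding examples purely for illustration; any join would do.

```sql
-- Generate the plan without executing the statement.
EXPLAIN PLAN FOR
SELECT c.name, o.orderdate
FROM   customers c
JOIN   orders o ON o.custno = c.custno
WHERE  c.name = 'SMITH';

-- Show the plan just explained; for an adaptive plan this is the default plan only.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```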


Displaying the Final Plan

• After the statement has completed, use DBMS_XPLAN.DISPLAY_CURSOR to see the final plan selected

• The example shows that a hash join was picked at execution time

• Again the statistics collector is not visible in the plan


Displaying Plan With +adaptive & +report Formats

• Additional information on why operations are inactive can be displayed with the format parameter ‘+report’
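A minimal sketch of looking at the final plan together with its inactive branches, assuming the statement of interest was the last one executed in the same session (in SQL*Plus, keep SERVEROUTPUT off so that it really is the last statement):

```sql
-- With no sql_id given, DISPLAY_CURSOR reports on the last statement run in this session.
-- '+ADAPTIVE' adds the inactive plan lines and the statistics collectors to the output.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(format => '+ADAPTIVE'));
```

In this format the operations that were not chosen are marked with a hyphen in front of the operation Id.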


Adaptive Execution Plans In V$SQL
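A minimal sketch of checking adaptive plan status in V$SQL. The comment tag adaptive_demo is a hypothetical marker placed in the statement text so it can be found easily.

```sql
-- IS_RESOLVED_ADAPTIVE_PLAN: NULL = plan is not adaptive,
-- N = adaptive plan not yet resolved, Y = final plan has been chosen.
SELECT sql_id, child_number, is_resolved_adaptive_plan
FROM   v$sql
WHERE  sql_text LIKE '%adaptive_demo%'
AND    sql_text NOT LIKE '%v$sql%';
```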


Dynamic Statistics (12.1, 11.2.0.4)

• During compilation, the optimizer decides whether statistics are sufficient to generate a good plan

• Dynamic statistics are used to compensate for missing, stale, or incomplete statistics

• They can be used for table scans, index access, joins and group by

• One type of dynamic statistics is dynamic sampling

• One type of dynamic statistics is dynamic sampling


Dynamic Statistics

• Dynamic sampling has a new level: 11 (AUTO)

• The decision to use dynamic sampling depends on the complexity of the predicates, the existing statistics and the total execution time

• Dynamic statistics are shared among queries
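A minimal sketch of trying the new level in a single session, leaving the system-wide default untouched:

```sql
-- Let the optimizer decide automatically when to use dynamic statistics (level 11 = AUTO).
ALTER SESSION SET optimizer_dynamic_sampling = 11;

-- Confirm the value in effect for this session.
SELECT value
FROM   v$parameter
WHERE  name = 'optimizer_dynamic_sampling';
```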


Adaptive Statistics/Statistics Feedback

Re-optimization Pre 12c:

• During execution optimizer estimates are compared to execution statistics

• If statistics vary significantly then a new plan will be chosen for subsequent executions based on execution statistics

• Re-optimization uses statistics gathered from previous executions

Re-optimization in 12c

• Join statistics are also monitored

• Works with adaptive cursor sharing for statements with binds

• New column in V$SQL: IS_REOPTIMIZABLE

• Information found at execution time is persisted as SQL Plan Directives


Statistics Feedback


Re-optimization – indicator in V$SQL

• New column in V$SQL: IS_REOPTIMIZABLE

• Indicates that the statement will be re-parsed on the next execution
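A minimal sketch of checking the indicator and the directives persisted from execution-time findings. The comment tag reopt_demo is a hypothetical marker in the statement text, and the second query assumes access to the DBA views.

```sql
-- Y means the next execution will trigger a re-optimization of this cursor.
SELECT sql_id, child_number, is_reoptimizable
FROM   v$sql
WHERE  sql_text LIKE '%reopt_demo%'
AND    sql_text NOT LIKE '%v$sql%';

-- SQL plan directives created from what the optimizer learned at execution time.
SELECT directive_id, type, state, reason
FROM   dba_sql_plan_directives
ORDER  BY created DESC;
```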


More Optimizer Changes…

• Adaptive Statistics/Statistics Feedback (12.1)

• Concurrent Execution of UNION and UNION ALL Branches (12.1)

• Cost-Based OR Expansion Transformation (12.2)

• Enhanced Join Elimination (12.2)

• Approximate Query Processing (12.1 + 12.2)
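As one small illustration of approximate query processing, APPROX_COUNT_DISTINCT (available from 12.1.0.2) trades a little accuracy for speed on large tables. The Orders table and its CustNo column are borrowed from the sharding examples for illustration only.

```sql
-- Exact distinct count: has to process every distinct value.
SELECT COUNT(DISTINCT custno) AS exact_ndv
FROM   orders;

-- Approximate distinct count: usually much cheaper on large data sets.
SELECT APPROX_COUNT_DISTINCT(custno) AS approx_ndv
FROM   orders;
```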


Statistics 12.1.0.2 + 12.2.0.1


Histograms

• Histograms tell the Optimizer about the data distribution in a Column for better cardinality estimations

• By default, a histogram is created on any column that has been used in the WHERE clause or GROUP BY of a statement AND has data skew

• Oracle 12c changes the histogram types:

• Top-Frequency (new)

• Height balanced (obsolete)

• Hybrid (new)


Histograms: Top Frequency

• Traditionally a frequency histogram is only created if NDV ≤ 254

• But if a small number of values occupies most of the rows (>99% of rows), creating a frequency histogram on that small set of values is very useful even though NDV is greater than 254

• Ignores the unpopular values to create a better quality histogram for popular values

• Built using the same technique used for frequency histograms

• Only created with AUTO_SAMPLE_SIZE


Top Frequency Histogram Example

• Table PRODUCT_SALES contains information on Christmas ornament sales

• It has 1.78 million rows

• There are 620 distinct TIME_IDs

• But 99.9% of the rows fall within fewer than 254 distinct TIME_IDs

The TIME_ID column is a perfect candidate for a top-frequency histogram
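A minimal sketch of gathering statistics on the example table and checking which histogram type was chosen. It assumes PRODUCT_SALES is in the current schema and leaves the sample size at its default (AUTO_SAMPLE_SIZE), which the new histogram types require.

```sql
BEGIN
  -- Ask for up to 254 buckets on TIME_ID; estimate_percent is left at AUTO_SAMPLE_SIZE.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'PRODUCT_SALES',
    method_opt => 'FOR COLUMNS TIME_ID SIZE 254');
END;
/

-- HISTOGRAM reports the type created, e.g. FREQUENCY, TOP-FREQUENCY or HYBRID.
SELECT column_name, num_distinct, num_buckets, histogram
FROM   user_tab_col_statistics
WHERE  table_name  = 'PRODUCT_SALES'
AND    column_name = 'TIME_ID';
```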


Height Balanced Histograms (obsolete)

• A height balanced histogram is created if the number of distinct values in a column (NDV) is greater than 254. This is now obsolete.

Height balanced histogram


Hybrid Histograms

• A hybrid histogram is created if the number of distinct values in a column (NDV) is greater than 254, but it stores the actual frequencies of bucket endpoints

Hybrid histogram


Hybrid Histograms

• Like a height balanced histogram, it is created when NDV > 254

• Store the actual frequencies of bucket endpoints in histograms

• No values are allowed to spill over multiple buckets

• More endpoint values can be squeezed in a histogram

• Achieves the same effect as increasing the # of buckets

• Only created with AUTO_SAMPLE_SIZE


Height-balanced versus Hybrid Histogram

[Figure: height-balanced histogram in Oracle Database 11g vs. hybrid histogram in Oracle Database 12c]


Session Private Statistics for GTTs

• Before 12c, GTTs had only one set of statistics, shared among all sessions even though the table could contain different data in different sessions

• Starting with Oracle 12c, GTTs have session private statistics: a different set of statistics for each session

• Queries against a GTT use statistics from their own session

• Improves the performance and manageability of GTTs

• Reduces the possibility of errors in the cardinality estimates for GTTs and ensures that the optimizer has the data to generate optimal execution plans
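A minimal sketch of inspecting and changing this behavior through the GLOBAL_TEMP_TABLE_STATS preference. MY_GTT is a hypothetical global temporary table in the current schema.

```sql
-- Current global setting: SESSION (private statistics, the 12c default) or SHARED.
SELECT DBMS_STATS.GET_PREFS('GLOBAL_TEMP_TABLE_STATS') AS gtt_stats_mode
FROM   dual;

BEGIN
  -- Revert one table to the pre-12c behavior of statistics shared by all sessions.
  DBMS_STATS.SET_TABLE_PREFS(USER, 'MY_GTT', 'GLOBAL_TEMP_TABLE_STATS', 'SHARED');
END;
/
```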


Online Statistics Gathering for Bulk Loads

• Table statistics are gathered automatically during bulk loads:

• CREATE TABLE AS SELECT

• INSERT INTO … SELECT

• Improved performance: avoids an additional table scan to gather table statistics

• Improved manageability: no user intervention is required to gather statistics after a bulk load

• To disable use hint: NO_GATHER_OPTIMIZER_STATISTICS
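A minimal sketch of seeing online statistics gathering at work and then opting out. ORDERS_COPY is a hypothetical table name; for INSERT … SELECT the feature applies to direct-path loads into an empty table.

```sql
-- Statistics are gathered as part of the CTAS itself.
CREATE TABLE orders_copy AS SELECT * FROM orders;

-- NUM_ROWS and LAST_ANALYZED are populated without a separate DBMS_STATS call.
SELECT table_name, num_rows, last_analyzed
FROM   user_tables
WHERE  table_name = 'ORDERS_COPY';

-- Opting out for one bulk load with the hint.
TRUNCATE TABLE orders_copy;

INSERT /*+ APPEND NO_GATHER_OPTIMIZER_STATISTICS */ INTO orders_copy
SELECT * FROM orders;

COMMIT;
```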


Optimizer Statistics Advisor (12.2)

• Optimizer Statistics Advisor is built-in diagnostic software that analyzes the quality of statistics and statistics-related tasks


Optimizer Statistics Advisor (12.2)

• The advisor automatically diagnoses problems in the existing practices for gathering statistics

• The advisor does not gather a new or alternative set of optimizer statistics

• The output of the advisor is a report of findings and recommendations
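A minimal sketch of running the advisor by hand with the DBMS_STATS advisor functions. The task name MY_TASK is an arbitrary choice, and DBMS_OUTPUT is used only for brevity: a long report may need to be fetched as a CLOB instead.

```sql
SET SERVEROUTPUT ON

DECLARE
  v_tname  VARCHAR2(128) := 'MY_TASK';  -- arbitrary task name
  v_ename  VARCHAR2(128);
  v_report CLOB;
BEGIN
  -- Create the task, run it, then fetch the findings as a text report.
  v_tname  := DBMS_STATS.CREATE_ADVISOR_TASK(v_tname);
  v_ename  := DBMS_STATS.EXECUTE_ADVISOR_TASK(v_tname);
  v_report := DBMS_STATS.REPORT_ADVISOR_TASK(v_tname);
  DBMS_OUTPUT.PUT_LINE(v_report);
END;
/
```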


What Can Go Wrong With Statistics Gathering?

• Legacy scripts may not keep pace with new best practices, which can change from release to release

• Resources are wasted on unnecessary statistics gathering

• Statistics can sometimes be missing, stale, or incorrect

• Automatic statistics gathering jobs do not guarantee accurate and up-to-date statistics


Optimizer Statistics Advisor: Output Example


----------------------------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Task Name      : MY_TASK
Execution Name : EXEC_52
Created        : 12-07-16 11:31:40
Last Modified  : 12-07-16 11:32:37
-------------------------------------------------------------------------------
SUMMARY
-------------------------------------------------------------------------------
For execution EXEC_52 of task MY_TASK, the Statistics Advisor has 6
finding(s). The findings are related to the following rules: USECONCURRENT,
AVOIDSETPROCEDURES, USEDEFAULTPARAMS, USEGATHERSCHEMASTATS, AVOIDSTALESTATS,
UNLOCKNONVOLATILETABLE. Please refer to the finding section for detailed
information.
-------------------------------------------------------------------------------
FINDINGS
-------------------------------------------------------------------------------
...

Optimizer Statistics Advisor: Output Example (2)


-------------------------------------------------------------------------------
FINDINGS
-------------------------------------------------------------------------------
Rule Name: UseConcurrent
Rule Description: Use Concurrent preference for Statistics Collection
Finding: The CONCURRENT preference is not used.

Recommendation: Set the CONCURRENT preference.
Example:
dbms_stats.set_global_prefs('CONCURRENT', 'ALL');
Rationale: The system's condition satisfies the use of concurrent statistics
gathering. Using CONCURRENT increases the efficiency of statistics
gathering.

----------------------------------------------------...

More Statistics Features

• Concurrent statistics gathering (12.1)

• Automatic Column Group Detection for extended statistics (12.2)

• Enhancements to Incremental Statistics

• Enhancements to System Statistics

• More…


Q&A


Summary

• We talked about DBIM and the column store solution

• We overviewed the new Sharding solution

• We looked into new Optimizer and Statistics changes

• 12c has a lot to offer us, try it – use it!

• 12cR2 release date for on-prem usage: March 15, 2017 (March 1st for Exadata)


What Did We NOT Talk About

• SQL Plan Management framework

• Automatic Plan Evolution

• Enhanced Auto Capture

• Capture from AWR Repository

• Indexing, Partitioning, and many other performance-related new features…


Thank You

Zohar Elkayam

Twitter: @realmgic

www.realdbamagic.com

