RMOUG 2012 - Mining the AWR

Post on 17-Jun-2015


Mining the AWR repository for Capacity Planning, Visualization, & other real world stuff

Presented by: Karl Arao

Who am I?

• Karl Arao, Oracle ACE, OCP-DBA, RHCE
• Currently @ Enkitec - Senior Technical Consultant
• Formerly @ SQL*Wizard - Solutions Architect
• Blog: http://karlarao.wordpress.com
• Wiki: http://karlarao.tiddlyspot.com

What will I talk about?

Overwhelming

AWR HELL

DBA_HIST_* views

My first close encounter

gc block lost – sudden slow down

http://karlarao.wordpress.com/2009/06/07/diagnosing-and-resolving-gc-block-lost


After gc block lost – normal workload

http://karlarao.wordpress.com/2009/06/07/diagnosing-and-resolving-gc-block-lost

Utilization = Requirement / Capacity

Double Y Axis Graph

t0 -------------------------------------> t1
335 – 336 – 337 – 338 – 339

delta issue
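A minimal sketch of the delta calculation, assuming "delta issue" refers to AWR statistics being stored as cumulative counters, so that a per-interval value is the difference between two consecutive snapshots. The function name and the sample values are hypothetical; only the snap IDs come from the slide.

```python
def snapshot_deltas(snapshots):
    """Turn cumulative counter readings into per-interval deltas.

    `snapshots` is a list of (snap_id, cumulative_value) pairs sorted by
    snap_id. Returns a list of (begin_snap_id, end_snap_id, delta).
    A negative delta is reported as None, because it usually means the
    counter was reset between the two snapshots (for example by a restart).
    """
    deltas = []
    for (id0, v0), (id1, v1) in zip(snapshots, snapshots[1:]):
        delta = v1 - v0
        deltas.append((id0, id1, delta if delta >= 0 else None))
    return deltas


# Hypothetical cumulative values for the snap IDs on the slide.
readings = [(335, 1000), (336, 1600), (337, 2500), (338, 2600), (339, 4000)]
deltas = snapshot_deltas(readings)
for begin_id, end_id, delta in deltas:
    print(f"{begin_id} -> {end_id}: {delta}")
```

Reporting a reset as None rather than as a negative number keeps a restart from showing up as a bogus spike in a graph.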

AWR Scripts

Visualization

Can’t go back in time?

AAS – Average Active Sessions
Kyle Hailey: http://www.perfvision.com/ftp/class/02_AAS.ppt

Max CPU


AAS – the Golden Metric
AAS & CPU count as a yardstick for a possible performance problem:
  if AAS < 1
    -- Database is not blocked
  AAS ~= 0
    -- Database basically idle
    -- Problems are in the APP not DB
  AAS < # of CPUs
    -- CPU available
    -- Database is probably not blocked
    -- Are any single sessions 100% active?
  AAS > # of CPUs
    -- Could have performance problems
  AAS >> # of CPUs
    -- There is a bottleneck
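The yardstick above can be sketched as a small function. This is an illustration only: the slide does not say how small "~= 0" is or how large ">>" is, so `idle_threshold` and `bottleneck_factor` are assumed values, and AAS exactly equal to the CPU count is treated as still within capacity.

```python
def interpret_aas(aas, cpu_count, idle_threshold=0.05, bottleneck_factor=2.0):
    """Classify AAS against the CPU count, following the slide's yardstick.

    idle_threshold and bottleneck_factor are assumptions, not slide values.
    """
    if aas <= idle_threshold:
        return "database basically idle; problems are in the app, not the DB"
    if aas < 1:
        return "database is not blocked"
    if aas <= cpu_count:
        return "CPU available; database is probably not blocked"
    if aas <= cpu_count * bottleneck_factor:
        return "could have performance problems"
    return "there is a bottleneck"


# AAS and CPU count taken from the slides that follow (AAS ~32 on 4 CPUs).
verdict = interpret_aas(32.35, 4)
print(verdict)
```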

AAS from V$ACTIVE_SESSION_HISTORY
AAS = Sample Count / Elapsed Time = 19410 / 600 = 32.35

CPU count = 4

AAS from DBA_HIST_ACTIVE_SESS_HISTORY
AAS = (Sample Count * 10) / Elapsed Time = (1950 * 10) / 600 = 32.5

CPU count = 4
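A short sketch of the two ASH calculations, using the numbers from the slides. The function name is hypothetical. The factor of 10 is modeled as a sampling interval, on the assumption that the in-memory view is sampled every second while DBA_HIST_ACTIVE_SESS_HISTORY keeps one sample in ten.

```python
def aas_from_ash(sample_count, elapsed_seconds, sample_interval_seconds=1):
    """AAS from ASH: each sample stands for `sample_interval_seconds`
    of active session time."""
    return sample_count * sample_interval_seconds / elapsed_seconds


# V$ACTIVE_SESSION_HISTORY: 19410 samples over a 600 second window.
aas_memory = aas_from_ash(19410, 600)

# DBA_HIST_ACTIVE_SESS_HISTORY: 1950 samples, each worth 10 seconds.
aas_disk = aas_from_ash(1950, 600, sample_interval_seconds=10)

print(round(aas_memory, 2), round(aas_disk, 2))
```

The two results differ slightly because the persisted view holds a one-in-ten subset of the in-memory samples.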

AAS from AWR Report
AAS = DB Time / Elapsed Time = 291.81 / 9.10 = 32.07

CPU count = 4

AAS from AWR Top Events
AAS = DB Time / Elapsed Time = 291.81 / 9.10 = 32.07
AAS = Event Time / Elapsed Time = 17410 / 546 = 31.9

CPU count = 4
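The AWR report figures can be checked the same way. A minimal sketch with a hypothetical function name; the only requirement is that both arguments use the same unit. Here the report values are assumed to be in minutes and the Top Events values in seconds (9.10 minutes is 546 seconds).

```python
def aas_from_time(busy_time, elapsed_time):
    """AAS as active time divided by wall-clock time, both in the same unit."""
    return busy_time / elapsed_time


aas_report = aas_from_time(291.81, 9.10)  # DB Time and elapsed time, minutes
aas_events = aas_from_time(17410, 546)    # summed event time and elapsed time, seconds

print(round(aas_report, 2), round(aas_events, 1))
```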

awr_topevents.sql

Textual trends

AAS on CPU bound workload
CPU count = 4

load average = 100+

[Figure: DB Time from snap0 to snap1, session timelines made up of Direct Path Read, runqueue, and CPU]

[Figure: DB Time from snap0 to snap1, session timelines made up of Direct Path Read and CPU]

dba_hist_system_event
• Will always account for the waits

dba_hist_sys_time_model
• ASH will see the session as ON CPU when it is on the run queue, but the time model only counts real CPU cycles
• When a database foreground process goes from on-CPU to on-run queue to on-CPU, the time spent on the run queue is added to DB Time but is not visible in DB CPU or in any wait event

CPU Wait = DB Time – ( sum(events) + DB CPU )
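A minimal sketch of the formula above. All input numbers are hypothetical; in practice DB Time and DB CPU would be snapshot deltas from dba_hist_sys_time_model and the event times would be deltas from dba_hist_system_event, all converted to the same unit.

```python
def cpu_wait(db_time, db_cpu, event_times):
    """CPU Wait = DB Time - (sum(events) + DB CPU), all in the same unit."""
    return db_time - (sum(event_times) + db_cpu)


# Hypothetical per-interval values in seconds.
db_time = 17500.0
db_cpu = 2000.0
events = {"direct path read": 9000.0, "other waits": 500.0}

unaccounted = cpu_wait(db_time, db_cpu, events.values())
print(unaccounted)
```

A value near zero suggests that waits plus CPU explain DB Time. A large positive value is the run queue time described above.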

[Figure: DB Time from snap0 to snap1, session timelines made up of Direct Path Read, CPU Wait, and CPU]

CPU wait is the “unaccounted-for DB Time”, similar to the “unaccounted-for time” in a 10046 trace, and is caused by CPU starvation

http://karlarao.tiddlyspot.com/#%5B%5BAAS%20investigation%5D%5D

AAS throughout the AWR retention period!

http://karlarao.wordpress.com/2010/07/25/graphing-the-aas-with-perfsheet-a-la-enterprise-manager

Capacity Planning

Utilization is the ultimate metric!

awr_genwl.sql

http://karlarao.wordpress.com/2010/01/31/workload-characterization-using-dba_hist-tables-and-ksar

U = R / C
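A small sketch of U = R / C for CPU. It assumes that the requirement is the CPU seconds consumed during an interval and that the capacity is the CPU count multiplied by the interval length. The function names and the 1200 CPU-second figure are hypothetical.

```python
def cpu_capacity_seconds(cpu_count, elapsed_seconds):
    """CPU seconds available in the interval (assumed definition of C)."""
    return cpu_count * elapsed_seconds


def utilization(requirement, capacity):
    """U = R / C, with R and C in the same unit."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return requirement / capacity


capacity = cpu_capacity_seconds(cpu_count=4, elapsed_seconds=600)
u = utilization(requirement=1200, capacity=capacity)
print(f"{u:.0%}")
```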

where aas > 1

Filter the data points
• AAS range
    aas > 1
• Per SNAP_ID or range of SNAP_IDs
    id in (336)
    where id >= 336 and id <= 340
• Oracle CPU Utilization
    oracpupct > 50
• OS CPU Utilization
    oscpupct > 50
• Workload periods
    AND TO_CHAR(s0.END_INTERVAL_TIME,'D') >= 1          -- Day of week: 1=Sunday 7=Saturday
    AND TO_CHAR(s0.END_INTERVAL_TIME,'D') <= 7
    AND TO_CHAR(s0.END_INTERVAL_TIME,'HH24MI') >= 0900  -- Hour
    AND TO_CHAR(s0.END_INTERVAL_TIME,'HH24MI') <= 1800
    AND s0.END_INTERVAL_TIME >= TO_DATE('2010-jan-17 00:00:00','yyyy-mon-dd hh24:mi:ss')  -- Data range
    AND s0.END_INTERVAL_TIME <= TO_DATE('2010-aug-22 23:59:59','yyyy-mon-dd hh24:mi:ss')
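The same filters can be sketched outside the database, for example on rows exported from the script's output. This is an illustration, not the script itself: the keys `id` and `aas` follow the slide, while `end_time`, the function names, and the sample rows are hypothetical.

```python
from datetime import datetime


def oracle_day_of_week(ts):
    """Map a datetime to the slide's numbering: 1=Sunday ... 7=Saturday."""
    return ts.isoweekday() % 7 + 1


def keep_point(row, first_snap, last_snap, min_aas=1.0,
               first_day=1, last_day=7, start_hhmm=900, end_hhmm=1800):
    """Return True if a data point passes the snap, AAS and workload filters."""
    ts = row["end_time"]
    hhmm = ts.hour * 100 + ts.minute  # same shape as TO_CHAR(..., 'HH24MI')
    return (
        first_snap <= row["id"] <= last_snap
        and row["aas"] > min_aas
        and first_day <= oracle_day_of_week(ts) <= last_day
        and start_hhmm <= hhmm <= end_hhmm
    )


# Hypothetical data points.
rows = [
    {"id": 336, "aas": 3.2, "end_time": datetime(2010, 1, 18, 10, 0)},
    {"id": 337, "aas": 0.4, "end_time": datetime(2010, 1, 18, 11, 0)},  # low AAS
    {"id": 338, "aas": 5.0, "end_time": datetime(2010, 1, 18, 22, 0)},  # after hours
    {"id": 341, "aas": 2.0, "end_time": datetime(2010, 1, 18, 12, 0)},  # outside snap range
]
kept_ids = [r["id"] for r in rows if keep_point(r, 336, 340)]
print(kept_ids)
```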

core need = # of cores * utilization * 1.25
Database Consolidation Best Practices

http://husnusensoy.files.wordpress.com/2010/05/database-consolidation-best-practices.pdf
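A worked sketch of the core need formula. The 1.25 factor comes from the slide; the server list, the function name, and the idea of summing the per-server needs for a consolidation target are assumptions made for illustration.

```python
import math


def core_need(cores, utilization, headroom=1.25):
    """core need = # of cores * utilization * headroom (1.25 on the slide)."""
    return cores * utilization * headroom


# Hypothetical source servers: (number of cores, peak CPU utilization).
servers = [(8, 0.5), (4, 0.25), (16, 0.75)]

needs = [core_need(cores, util) for cores, util in servers]
total_cores = math.ceil(sum(needs))  # round up to whole cores
print(needs, total_cores)
```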

Total disk IOPS = (IOPS * Read Ratio) + (IOPS * Write Ratio * RAID penalty)

Number of disks = Total disk IOPS / IOPS per disk
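The two formulas can be worked through as follows. All input values are hypothetical. The RAID penalty of 2 is the figure commonly quoted for mirrored layouts and is an assumption here, as is the 150 IOPS per disk.

```python
import math


def total_disk_iops(iops, read_ratio, write_ratio, raid_penalty):
    """Total disk IOPS = (IOPS * read ratio) + (IOPS * write ratio * RAID penalty)."""
    return iops * read_ratio + iops * write_ratio * raid_penalty


def disks_needed(total_iops, iops_per_disk):
    """Number of disks = total disk IOPS / IOPS per disk, rounded up."""
    return math.ceil(total_iops / iops_per_disk)


# Hypothetical workload: 1000 IOPS, half reads and half writes.
backend_iops = total_disk_iops(1000, read_ratio=0.5, write_ratio=0.5, raid_penalty=2)
disks = disks_needed(backend_iops, iops_per_disk=150)
print(backend_iops, disks)
```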

awr_iowl.sql

Average latency issue

60-minute interval vs. 10-minute interval

latency (ms) = (readtim / phy reads) * 10
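A minimal sketch of the latency formula. It assumes that `readtim` is reported in hundredths of a second, which is why the result is multiplied by 10 to get milliseconds, and that both inputs are deltas between two snapshots. The function name and sample values are hypothetical.

```python
def avg_read_latency_ms(readtim_cs, phy_reads):
    """latency (ms) = (readtim / phy reads) * 10, with readtim in centiseconds.

    Returns None when there were no reads in the interval.
    """
    if phy_reads == 0:
        return None
    return readtim_cs / phy_reads * 10


# Hypothetical deltas for one datafile over one snapshot interval.
latency = avg_read_latency_ms(readtim_cs=500, phy_reads=1000)
print(latency)
```

Because this is an average over the whole interval, a shorter snapshot interval leaves less room for a brief latency spike to be averaged away.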

IO waits latency datafiles latency

SAN filesystem latency

Linear Regression

x data (CPU) = the "independent value", used to predict the value of y

y data (AAS) = the "dependent value", the variable whose value is to be predicted

r2toolkit

Uses the following inbuilt Oracle functions:

• regr_count
• regr_r2
• regr_intercept
• regr_slope
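For readers without a database at hand, here is a plain least-squares sketch that returns the same four quantities. It is not the r2toolkit and does not call Oracle; the function name and the data points are hypothetical. As in the slides, x is CPU and y is AAS.

```python
def linear_regression(pairs):
    """Least-squares fit of y = intercept + slope * x.

    `pairs` is a list of (x, y) tuples; pairs containing None are skipped.
    Returns a dict with count, slope, intercept and r2. slope, intercept
    and r2 are None when x has no variance (no line can be fitted).
    """
    pts = [(x, y) for x, y in pairs if x is not None and y is not None]
    n = len(pts)
    if n == 0:
        return {"count": 0, "slope": None, "intercept": None, "r2": None}
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    if sxx == 0:
        return {"count": n, "slope": None, "intercept": None, "r2": None}
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return {"count": n, "slope": slope, "intercept": intercept, "r2": r2}


# Hypothetical (CPU %, AAS) points, with one incomplete pair.
fit = linear_regression([(10, 1.0), (20, 2.0), (30, 3.0), (40, 4.0), (50, None)])
predicted_aas = fit["intercept"] + fit["slope"] * 60  # projected AAS at 60% CPU
print(fit, predicted_aas)
```

An r2 close to 1 means CPU explains most of the variation in AAS, so the fitted line is a reasonable basis for projection. A low r2 means the projection should not be trusted.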


Linear Regression – what’s the value?

Linear Regression on 2 node RAC
http://karlarao.tiddlyspot.com/#r2project

racnode1 racnode2

Drilling down on the peak workload... with AAS of 10


Now on the low workload period... with AAS of 2.2


Recap

• Mine the beautiful data set

• Visualization tells a story immediately

• Statistics to make sense of data

Let the AWR data set change your mind set!

Thank you!

References and Tools
• http://karlarao.wordpress.com
  – http://karlarao.tiddlyspot.com/#%5B%5BStorage%20IOPS%2Ccapacity%2Cperformance%2Ccost%5D%5D
  – http://karlarao.tiddlyspot.com/#Statistics
  – http://karlarao.tiddlyspot.com/#OraclePerformance
• Tanel Poder @ http://blog.tanelpoder.com
  – http://www.tanelpoder.com/files/TPT_public.zip
  – http://www.tanelpoder.com/files/PerfSheet.zip
  – Neil Gunther & Tanel Poder - Multidimensional Visualization of Oracle Performance using Barry007 http://arxiv.org/pdf/0809.2532
• Kyle Hailey @ http://ashmasters.com , http://www.perfvision.com
• Craig Shallahamer @ orapub.com
  – Introduction To Oracle Server Consolidation http://resources.orapub.com/product_p/server_consolidation_ppt.htm
• Husnu Sensoy @ husnusensoy.wordpress.com
  – Database Consolidation Best Practices http://husnusensoy.files.wordpress.com/2010/05/database-consolidation-best-practices.pdf
• Andy Rivenes @ http://www.appsdba.com/pubs.htm
• Neeraj Bhatia @ www.nioug.org/files/Linear_Regression.pdf

Contact me through:

karl.arao@enkitec.com


Questions?

Fastest Growing Companies in Dallas
