
Infrastructure at your Service.

Diagnostics with TFA, best practices


30/05/2017 Diagnostics with TFA, best practices Page 2

About me

Daniel Westermann

Senior Consultant

Open Infrastructure Technology Leader

+41 79 927 24 46

[email protected]

Experts At Your Service

> Over 50 specialists in IT infrastructure

> Certified, experienced, passionate

Based In Switzerland

> 100% self-financed Swiss company

> Over CHF 8.4 mio. turnover

Leading In Infrastructure Services

> More than 150 customers in CH, D, & F

> Over 50 SLAs dbi FlexService contracted

dbi services Who we are


Best Workplace in Switzerland 2017 Small Companies 20-49 employees, Rank 7

dbi services is hiring ([email protected])


What is this about


Agenda

> Introduction

> Installation and/or upgrades

> Who has access?

> Best practices

> Small demo

> Conclusion

TFA - Introduction


How many log files do you have for an Oracle 12.2 database?

What is the Oracle Trace File Analyzer (TFA) TFA – Introduction


oracle@oelrac1:/u01/app/oracle/diag/rdbms/db1/DB1_1/ [DB1_1] ls -la

drwxr-x---. 2 oracle asmadmin 20 Mar 21 14:55 alert

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 cdump

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 hm

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 incident

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 incpkg

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 ir

drwxr-x---. 2 oracle asmadmin 4096 Mar 21 15:00 lck

drwxr-x---. 7 oracle asmadmin 60 Mar 21 14:55 log

drwxr-x---. 2 oracle asmadmin 4096 Mar 21 15:00 metadata

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 metadata_dgif

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 metadata_pv

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 stage

drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 sweep

drwxr-x---. 2 oracle asmadmin 36864 May 23 08:26 trace


oracle@oelrac1:/u01/app/oracle/diag/rdbms/db1/DB1_1/ [DB1_1] find . -ls | wc -l

2561

How many log files do you have for an Oracle 12.2 Grid Infrastructure?


oracle@oelrac1:/u01/app/oracle/diag/crs/oelrac1/crs/ [DB1_1] ls -la

drwxrwxr-x. 2 oracle oinstall 20 Mar 21 12:59 alert

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 cdump

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 incident

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 incpkg

drwxrwxr-x. 2 oracle oinstall 4096 Mar 21 12:59 lck

drwxrwxr-x. 4 oracle oinstall 29 Mar 21 12:59 log

drwxrwxr-x. 2 oracle oinstall 4096 Mar 21 12:59 metadata

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 metadata_dgif

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 metadata_pv

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 stage

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 sweep

drwxrwxr-x. 2 oracle oinstall 20480 May 23 08:28 trace


oracle@oelrac1:/u01/app/oracle/diag/crs/oelrac1/crs/ [DB1_1] find . -ls | wc -l

737

How many log files do you have in addition?


oracle@oelrac1:/u01/app/oracle/diag/ [DB1_1] ls -la

drwxrwxr-x. 3 oracle oinstall 22 Mar 21 13:02 afdboot

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 apx

drwxrwxr-x. 5 oracle oinstall 51 Mar 21 13:04 asm

drwxrwxr-x. 4 oracle oinstall 40 Mar 21 13:04 asmtool

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 bdsql

drwxrwxr-x. 4 oracle oinstall 40 Mar 21 13:05 clients

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 diagtool

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 dps

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 em

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 gsm

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 ios

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 lsnrctl

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 netcman


drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 ofm

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 plsql

drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 plsqlapp

drwxrwxr-x. 3 oracle oinstall 20 Mar 21 13:08 tnslsnr

oracle@oelrac1:/u01/app/oracle/diag/ [+ASM1] find . -ls | wc -l

5966
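The totals above can be broken down per subdirectory to see where most of the files live. A minimal sketch, assuming a POSIX shell with find; the function name count_entries is illustrative and not part of TFA.

```shell
#!/bin/sh
# Hypothetical helper (not part of TFA): print "<count> <dir>" for every
# first-level subdirectory of a diagnostic root, largest first.
count_entries() {
    root=$1
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Count the directory itself plus everything below it,
        # like 'find . -ls | wc -l' in the slides.
        n=$(find "$dir" | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "${dir%/}"
    done | sort -rn
}

# Example call (path taken from the slides):
#   count_entries /u01/app/oracle/diag
```

Sorting by count puts the busiest directories (typically the trace directories) at the top.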

What about the Operating System?


oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /var/log/messages

-rw-------. 1 root root 552760 May 23 08:40 /var/log/messages

oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /var/log/sa/

-rw-r--r--. 1 root root 3192 Nov 2 2016 sa02

...

oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /proc/meminfo

-r--r--r--. 1 root root 0 May 24 08:25 /proc/meminfo

oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /proc/*/

Display all 318 possibilities? (y or n)

1// 14783// 14851// 16093//

...

How do you analyze all of them when you have an issue?

Where do you start when you have issues with the cluster or the database?


Components – Overview TFA – Introduction

[Diagram: three-node RAC cluster. Labels per node: +ASM1 to +ASM3, DB11 to DB13, NIC1/NIC2 (PUB), NIC3/NIC4 (HAIP), hba1/hba2, VIP1 to VIP3. Shared labels: cache fusion over udp/infiniband/rds, SCAN Address with SCAN lsnr 1 to 3 and SCANVIP1 to SCANVIP3, client access via jdbc/sqlnet/oci over tcp, a bunch of disks (data, ocr/voting).]

The Oracle Trace File Analyzer is

> a collection of tools to support you in collecting logs and statistics

> can run on single nodes

> can run on all nodes that are part of a cluster

> centralizes collections into a single place

> a daemon that runs in the background all the time (hopefully)

> for this you need root

> a single command line interface to talk to the daemons on all nodes (tfactl)

> a wrapper around all the support tools required for diagnostics


https://support.oracle.com

> TFA Collector - TFA with Database Support Tools Bundle (Doc ID 1513912.1)

Where to get started TFA – Introduction


Components TFA – Introduction


TFA consists of

> Collector

> Analyzer

> tfactl

> Tools: ORAchk, EXAchk, oswatcher, procwatcher, oratop, sqlt, alertsummary, ls, pstack, grep, summary, vi, tail, param, dbglevel, history, changes, RDA / DA

Some of the tools have their own MOS note

> ORAchk - Health Checks for the Oracle Stack (Doc ID 1268927.2)

> Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)

> OSWatcher (Includes: [Video]) (Doc ID 301137.1)

> Procwatcher: Script to Monitor and Examine Oracle DB and Clusterware Processes (Doc ID 459694.1)

> oratop - Utility for Near Real-time Monitoring of Databases, RAC and Single Instance (Doc ID 1500864.1)

> All About the SQLT Diagnostic Tool (Doc ID 215187.1)

> Remote Diagnostic Agent (RDA) - Getting Started (Doc ID 314422.1)

Components – Overview TFA – Introduction

[Diagram: three-node RAC cluster overview (nodes with +ASM1 to +ASM3 and DB11 to DB13, SCAN listeners, VIPs, shared disks), with a TFA daemon running on each node and tfactl as the initiator.]

Keep TFA up to date

There are quite a few bugs TFA – Introduction

TFA – Installation and/or upgrades


Linux x64

> RedHat

> SuSE

> Oracle

Linux Itanium

zLinux

Solaris

> SPARC

> x64

AIX

HPUX

> Itanium

> PA-RISC

Supported platforms TFA – Installation and/or upgrades


TFA is supported for

> Oracle database 10.2+

> Grid Infrastructure 10.2+

JRE version 1.5 or higher is required

> Comes with Database and Grid Infrastructure installation anyway

> openJDK is not supported


Directory layout TFA – Installation and/or upgrades

Directory                                     Description
tfa/bin                                       tfactl
tfa/repository                                Stores collections
tfa/[node]/tfa_home/database                  Berkeley database
tfa/[node]/tfa_home/diag                      Tools to troubleshoot TFA
tfa/[node]/tfa_home/diagnostics_to_collect    Files for next collection
tfa/[node]/tfa_home/log                       TFA logs
tfa/[node]/tfa_home/resources                 Resource files
tfa/[node]/tfa_home/output                    Extra metadata about the env.

TFA usually is already there

Default installation TFA – Installation and/or upgrades


oracle@oelrac1:/var/tmp/ [+ASM1] which tfactl

/u01/app/12.2.0.1/grid/bin/tfactl

oracle@oelrac1:/var/tmp/ [+ASM1] tfactl print repository

.-------------------------------------------------------.

| oelrac1 |

+----------------------+--------------------------------+

| Repository Parameter | Value |

+----------------------+--------------------------------+

| Location | /u01/app/oracle/tfa/repository |

| Maximum Size (MB) | 10240 |

| Current Size (MB) | 0 |

| Free Size (MB) | 10240 |

| Status | OPEN |

'----------------------+--------------------------------'


oracle@oelrac1:/u01/app/oracle/ [DB1_1] ls -la $ORACLE_BASE

drwxr-x---. 7 oracle oinstall 63 Mar 29 10:49 admin

drwxr-x---. 5 oracle oinstall 38 May 13 16:40 audit

drwxrwxr-x. 7 oracle oinstall 70 Mar 21 15:27 cfgtoollogs

drwxr-xr-x. 2 oracle oinstall 6 Mar 21 14:20 checkpoints

drwxrwxr-x. 6 oracle oinstall 60 Mar 21 12:59 crsdata

drwxrwxr-x. 21 oracle oinstall 4096 Mar 21 12:57 diag

drwxr-xr-x. 3 oracle oinstall 20 Mar 21 13:05 diagsnap

drwxr-xr-x. 9 oracle oinstall 4096 Mar 21 13:26 local

drwxr-xr-x. 3 root root 17 Mar 21 13:02 log

drwxr-xr-x. 3 oracle oinstall 24 Mar 21 12:59 oelrac1

drwxr-xr-x. 3 oracle oinstall 19 Mar 21 13:54 product

drwxr-xr-x. 3 oracle oinstall 21 Mar 21 13:43 software

drwxr-x--x. 4 root root 59 Mar 21 12:59 tfa

When you need to update, you should install TFA as root

oracle@oelrac1:/var/tmp/ [DB1_1] unzip p21757377_121020_Generic.zip

Archive: p21757377_121020_Generic.zip

inflating: TFA_User_Guide_12.1.2.8.4.pdf

inflating: installTFALite

inflating: README.txt

oracle@oelrac1:/var/tmp/ [DB1_1] sudo ./installTFALite

TFA Installation Log will be written to File :

/tmp/tfa_install_30711_2017_05_23-08_57_35.log

Starting TFA installation

TFA HOME : /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home

TFA Build Version: 121284 Build Date: 201702061110

Installed Build Version: 122100 Build Date: 201611221703

TFA is already running latest version. No need to patch.

Where to install (it does not go to the GI Home by default)?

oracle@oelora:/var/tmp/ [+ASM] sudo ./installTFALite

TFA Installation Log will be written to File :

/tmp/tfa_install_4525_2017_05_23-09_10_10.log

Starting TFA installation

Enter a location for installing TFA (/tfa will be appended if not supplied) [/var/tmp/tfa]: ???? Really ???


oracle@oelora:/var/tmp/ [+ASM] sudo ./installTFALite -help

-local - Only install on the local node

-deferdiscovery - Discover Oracle trace directories after installation completes

-tfabase - Install into the directory supplied

-javahome - Use this directory for the JRE

-silent - Do not ask any install questions

-extractto - Extract TFA into the directory supplied (non daemon mode)

-tmploc - Temporary location directory for TFA to extract the install archive to (must exist)

-debug - Print debug tracing and do not remove TFA_HOME on install failure
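The options above could be combined to avoid the interactive prompts and the /var/tmp/tfa default. The following is only a sketch: it assumes that -tfabase and -javahome each take the directory as the next argument, as the help text suggests, and the wrapper name tfa_install_cmd is made up for illustration. It prints the command rather than running it, so check the README of your TFA version before using it.

```shell
#!/bin/sh
# Hypothetical wrapper: builds (and only prints) an installTFALite call that
# uses options from the help output above. All paths are example values.
tfa_install_cmd() {
    installer=$1   # path to the unzipped installTFALite script
    tfa_base=$2    # target directory, e.g. the Grid Infrastructure home
    java_home=$3   # directory containing a JRE 1.5 or later
    printf 'sudo %s -silent -tfabase %s -javahome %s\n' \
        "$installer" "$tfa_base" "$java_home"
}

tfa_install_cmd /var/tmp/installTFALite /u01/app/12.2.0/grid /u01/app/12.2.0/grid/jdk
```

Printing first makes it easy to review the target directory before anything is installed as root.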


oracle@oelora:[+ASM] cd $ORACLE_HOME

oracle@oelora:[+ASM] sudo /var/tmp/installTFALite

TFA Installation Log will be written to File :

/tmp/tfa_install_4834_2017_05_23-09_16_30.log

Starting TFA installation

Enter a location for installing TFA (/tfa will be appended if not supplied) [/u01/app/12.2.0/grid/tfa]:

Enter a Java Home that contains Java 1.5 or later :

/u01/app/12.2.0/grid/jdk/

Running Auto Setup for TFA as user root...


Would you like to do a [L]ocal only or [C]lusterwide installation ?

[L|l|C|c] [C] : C

The following installation requires temporary use of SSH.

If SSH is not configured already then we will remove SSH

when complete.

Do you wish to Continue ? [Y|y|N|n] [Y] Y

Installing TFA now...

Discovering Nodes and Oracle resources

Checking whether CRS is up and running

List of nodes in cluster

1. oelora


Installing TFA on oelora:

HOST: oelora TFA_HOME: /u01/app/12.2.0/grid/tfa/oelora/tfa_home

.--------------------------------------------------------------------------.

| Host | Status of TFA | PID | Port | Version | Build ID |

+--------+---------------+------+------+------------+----------------------+

| oelora | RUNNING | 6979 | 5000 | 12.1.2.8.4 | 12128420170206111019 |

'--------+---------------+------+------+------------+----------------------'

Sucessfully added 'oracle' to TFA Access list.

.---------------------------------.

| TFA Users in oelora |

+-----------+-----------+---------+

| User Name | User Type | Status |

+-----------+-----------+---------+

| oracle | USER | Allowed |

'-----------+-----------+---------'


Summary of TFA Installation:

.----------------------------------------------------------------.

| oelora |

+---------------------+------------------------------------------+

| Parameter | Value |

+---------------------+------------------------------------------+

| Install location | /u01/app/12.2.0/grid/tfa/oelora/tfa_home |

| Repository location | /u01/app/oracle/tfa/repository |

| Repository usage | 0 MB out of 5936 MB |

'---------------------+------------------------------------------'

Cleanup


oracle@oelora:/u01/app/12.2.0/grid/ [+ASM] ls /usr/tmp/

installTFALite p21757377_121020_Generic.zip README.txt

TFA_User_Guide_12.1.2.8.4.pdf yum-oracle-7lGWtq

oracle@oelora:/u01/app/12.2.0/grid/ [+ASM] ls /var/tmp/

installTFALite p21757377_121020_Generic.zip README.txt

TFA_User_Guide_12.1.2.8.4.pdf yum-oracle-7lGWtq

Changing the default ports

> Change the ports here, deploy to all cluster nodes and restart tfa


oracle@oelrac1$ sudo cat $GI_HOME/tfa/oelrac1/tfa_home/internal/usableports.txt

5000

5001

5002

5003

5004

5005
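Before deploying an edited port list to all nodes, it may be worth checking that the file still contains one valid port number per line. A minimal sketch in plain sh; the function check_ports_file is hypothetical and not part of TFA, and the one-port-per-line format is inferred from the listing above.

```shell
#!/bin/sh
# Hypothetical check (not part of TFA): verify that a port list such as
# usableports.txt contains one TCP port number (1-65535) per line.
check_ports_file() {
    file=$1
    status=0
    while IFS= read -r port || [ -n "$port" ]; do
        case $port in
            '') continue ;;                      # ignore empty lines
            *[!0-9]*) echo "invalid entry: $port"; status=1; continue ;;
        esac
        if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
            echo "port out of range: $port"
            status=1
        fi
    done < "$file"
    return $status
}

# Example call (file location taken from the slide above):
#   sudo sh -c '. ./check_ports.sh; check_ports_file "$1"' sh "$GI_HOME/tfa/oelrac1/tfa_home/internal/usableports.txt"
```

The function returns a non-zero status when any line is rejected, so it can gate a deployment script.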

TFA can delay server reboots, be careful


TFA – Who has access?


Who has access to the TFA commands?

> The root user can do anything – no surprise here

> Access to a subset of the TFA commands is given to

> The Oracle RDBMS software owner

> The Oracle Grid Infrastructure software owner

Trace files may contain sensitive data TFA – Who has access


oracle@oelrac1:/home/oracle/ [+ASM1] tfactl access lsusers

Access Denied: Only TFA Admin can run this command

oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl access lsusers

.---------------------------------.

| TFA Users in oelrac1 |

+-----------+-----------+---------+

| User Name | User Type | Status |

+-----------+-----------+---------+

| oracle | USER | Allowed |

'-----------+-----------+---------'

Granting access to other users

You'll have to use "syncnodes" to generate and then synchronize the certificates across all nodes in the cluster


oracle@oelrac1:$ sudo useradd tfauser

oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl access add -user tfauser

TFA-00103 TFA is not yet secured to run all commands

oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl

tfactl> syncnodes

TFA has not yet generated any certificates on this Node.

Do you want to generate new certificates to synchronize across the nodes? [Y|N] [Y]: Y

Generating new TFA Certificates...

Once the certificates are in place


oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl access add -user tfauser

Sucessfully added 'tfauser' to TFA Access list.

.---------------------------------.

| TFA Users in oelrac1 |

+-----------+-----------+---------+

| User Name | User Type | Status |

+-----------+-----------+---------+

| oracle | USER | Allowed |

| tfauser | USER | Allowed |

'-----------+-----------+---------'

But this is not sufficient


[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl

Can't locate Data/Dumper.pm in @INC (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 . /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/modules /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/exceptions) at /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/tfactlshare.pm line 770.

BEGIN failed--compilation aborted at /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/tfactlshare.pm line 770.

You'll need in addition

oracle@oelrac1:$ sudo usermod -g oinstall tfauser

Then you can

root@:/home/oracle/ [] su - tfauser

Last login: Tue May 23 10:05:13 CEST 2017 on pts/0

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl

tfactl>

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl analyze

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl diagcollect

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl toolstatus

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl directory

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl print

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl ips

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl run

You cannot

Starting and stopping TFA requires root privileges


[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl stop

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl start

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl set

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl host

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl uninstall

[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl diagnosetfa

[tfauser@oelrac1 ~]$ ps -ef | grep tfa

root /bin/sh /etc/init.d/init.tfa run >/dev/null 2>&1 </dev/null

root /u01/app/12.2.0.1/grid/jdk/jre/bin/java -Xms128m -Xmx512m

oracle.rat.tfa.TFAMa

To remove users


$ sudo $ORACLE_HOME/bin/tfactl access remove -user tfauser

Sucessfully removed 'tfauser' from TFA Access list.

.---------------------------------.

| TFA Users in oelrac1 |

+-----------+-----------+---------+

| User Name | User Type | Status |

+-----------+-----------+---------+

| oracle | USER | Allowed |

'-----------+-----------+---------'

To reset access permissions to the default


oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl access reset

Sucessfully restored to default TFA Access list.

oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl access lsusers

.---------------------------------.

| TFA Users in oelrac1 |

+-----------+-----------+---------+

| User Name | User Type | Status |

+-----------+-----------+---------+

| oracle | USER | Allowed |

'-----------+-----------+---------'

TFA – Best practices


Add all hosts immediately TFA – Best practices

In my 12.2 default installation only one host was available per node

oracle@oelrac1:/home/oracle/ [+ASM1] tfactl print hosts

Host Name : oelrac1

oracle@oelrac2:/home/oracle/ [+ASM2] tfactl print hosts

Host Name : oelrac2

When trying to add the other host

root@:$ /u01/app/12.2.0.1/grid/bin/tfactl host add oelrac2

Unable to determine port on which TFA is listening in oelrac2

The only solution that worked in my environment

Re-create the ssh keys for root and ssh-copy-id

Then, fresh install


[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall

[root@oelrac2 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall

[root@oelrac1 tmp]# ./installTFALite

Enter a location for installing TFA (/tfa will be appended if not supplied) [/var/tmp/tfa]:

/u01/app/12.2.0.1/grid/tfa

Enter a Java Home that contains Java 1.5 or later :

/u01/app/12.2.0.1/grid/jdk/

After that

… and all tools available


[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl print hosts

Host Name : oelrac1

Host Name : oelrac2

[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl toolstatus

| oelrac1 | prw | NOT RUNNING |

| oelrac1 | dbperf | DEPLOYED |

| oelrac1 | oswbb | RUNNING |

| oelrac1 | darda | DEPLOYED |

| oelrac1 | sqlt | DEPLOYED |

Implement sudo

> Either by doing it the easy (but most dangerous) way like me

> Or better be more restrictive on what you want to allow and create a dedicated file for the oracle and grid users

sudo TFA – Best practices


root@:/home/oracle/ [] cat /etc/sudoers | grep oracle

oracle ALL=(ALL) NOPASSWD: ALL

root@:/etc/sudoers.d/ [] grep includedir /etc/sudoers

#includedir /etc/sudoers.d

root@:/etc/sudoers.d/ [] touch /etc/sudoers.d/oracle

root@:/etc/sudoers.d/ [] echo "oracle ALL= /u01/app/12.2.0.1/grid/bin/tfactl" > /etc/sudoers.d/oracle
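A broken file in /etc/sudoers.d can lock everybody out of sudo, so it may be safer to stage the rule and let visudo check it before installing it. A minimal sketch, assuming visudo and install are available; the helper sudoers_rule and the staging path /tmp/oracle.sudo are illustrative names, not something the slides prescribe.

```shell
#!/bin/sh
# Hypothetical helper: print a sudoers rule that lets one user run a single
# command via sudo, in the same shape as the rule written above.
sudoers_rule() {
    printf '%s ALL= %s\n' "$1" "$2"
}

# Suggested usage as root (shown as comments, not executed here): stage the
# rule, let visudo check the syntax, then install it with mode 0440.
#   sudoers_rule oracle /u01/app/12.2.0.1/grid/bin/tfactl > /tmp/oracle.sudo
#   visudo -cf /tmp/oracle.sudo && install -m 0440 /tmp/oracle.sudo /etc/sudoers.d/oracle
```

The same helper can be reused for the grid user when Grid Infrastructure runs under a separate account.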

TFA can be told to automatically monitor for issues

> This is very handy when there is an issue

> All relevant logs are already collected

> All relevant logs are already trimmed around the time of the issue

> All relevant logs are already packaged from all nodes in the cluster

> The default is "ON/true", but better be sure

> When it is off/false

Enable automated collections TFA – Best practices


oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl print config | grep Automatic

| Automatic Diagnostic Collection | true |

| Automatic Purging | true |

oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl set autodiagcollect=ON

Events that trigger an automated collection (as of today)

> ORA-297(01|02|03|08|09|10|40)

> ORA-00600

> ORA-07445

> ora-4(69|([7-8][0-9]|9([0-3]|[5-8])))

> ORA-32701

> ORA-494

> System State dumped

> CRS-16(07|10|11|12)

Logfiles monitored

> alert.log (DB,CRS,ASM,ASM Proxy,ASM IO Server)
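To see which error codes the patterns above cover, they can be joined into one extended regular expression and tried against individual codes. This is illustrative only: the patterns are copied verbatim from the list, the matching is done case-insensitively against a bare code rather than a full alert.log line, and TFA's real matching logic may differ. The names TRIGGER_RE and is_trigger_event are made up for this sketch.

```shell
#!/bin/sh
# Patterns copied from the list above, joined with '|'.
TRIGGER_RE='ORA-297(01|02|03|08|09|10|40)|ORA-00600|ORA-07445|ora-4(69|([7-8][0-9]|9([0-3]|[5-8])))|ORA-32701|ORA-494|System State dumped|CRS-16(07|10|11|12)'

# Hypothetical helper: succeed when the whole argument matches one of the
# patterns (-x anchors the match, -i ignores case, -q suppresses output).
is_trigger_event() {
    printf '%s\n' "$1" | grep -Eixq "$TRIGGER_RE"
}

for code in ORA-00600 ORA-470 ORA-494 ORA-499 CRS-1610 ORA-01555; do
    if is_trigger_event "$code"; then
        echo "$code: matches a trigger pattern"
    else
        echo "$code: not in the list"
    fi
done
```

The nested ora-4 pattern expands to ORA-469, ORA-470 to ORA-489, ORA-490 to ORA-493 and ORA-495 to ORA-498, which is why ORA-494 has its own entry.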


Take care of the TFA repository

The TFA repository TFA – Best practices


oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl print repository

.-------------------------------------------------------.

| oelrac1 |

+----------------------+--------------------------------+

| Repository Parameter | Value |

+----------------------+--------------------------------+

| Location | /u01/app/oracle/tfa/repository |

| Maximum Size (MB) | 10240 |

| Current Size (MB) | 5 |

| Free Size (MB) | 10235 |

| Status | OPEN |

'----------------------+--------------------------------'
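The table printed by tfactl can be turned into a single usage figure for monitoring. A minimal sketch, assuming awk is available and that the output keeps the row labels shown above; the function name repo_used_pct is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read the table printed by 'tfactl print repository'
# on stdin and print the used share of the repository in whole percent.
repo_used_pct() {
    awk -F'|' '
        /Maximum Size \(MB\)/ { max = $3 + 0 }   # third field holds the value
        /Current Size \(MB\)/ { cur = $3 + 0 }
        END {
            if (max > 0) printf "%d\n", cur * 100 / max
            else exit 1                           # no size rows found
        }'
}

# Example call (command taken from the slide above):
#   sudo $ORACLE_HOME/bin/tfactl print repository | repo_used_pct
```

A cron job could compare the result against a threshold and send a warning before the repository fills up.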

Depending on your cluster size and the number of issues, you might want to increase or decrease this

oracle@oelrac1:[+ASM1] sudo /u01/app/12.2.0.1/grid/bin/tfactl set reposizeMB=5000

The minimum recommended repository size is 10 GB.

Directory does not have the space to allocate 10 GB.

Do you wish to continue with current repository size ? [Y/y/N/n] [N] y

Successfully changed repository size

.--------------------------------------------------------.

| Repository Parameter | Value |

+-----------------------+--------------------------------+

| Location | /u01/app/oracle/tfa/repository |

| Old Maximum Size (MB) | 10240 |

| New Maximum Size (MB) | 5000 |

| Current Size (MB) | 5 |

| Status | OPEN |

'-----------------------+--------------------------------'

By default, logs are kept for 30 days

> If possible, do not lower this

> How often did you need log files from the past that were already gone?

oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl print config | egrep "Purge|purge|Purging"

| Managelogs Auto Purge | false |

| Time interval between consecutive Managelogs Auto Purge(minutes) | 60 |

| Logs older than the time period will be auto purged(days[d]|hours[h]) | 30d |

| Automatic Purging | true |

| Age of Purging Collections (Hours) | 12 |

collect is your friend

> To collect the last three hours on demand

> To collect the last three hours on demand for a specific database

On demand collections TFA – Best practices


oracle@oelrac1:[+ASM1] tfactl diagcollect -all -since 3h

oracle@oelrac1:[+ASM1] tfactl diagcollect -database DB1 -since 3h

analyze is your friend

> To analyze the last three hours

> To search for all ORA-00600 errors within the last three hours

On demand analysis TFA – Best practices


oracle@oelrac1:[+ASM1] tfactl analyze -since 3h

oracle@oelrac1:[+ASM1] tfactl analyze -search "ORA-00600" -since 3h

oratop is your friend

> Starting oratop with tfactl

oratop TFA – Best practices


oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> oratop -database DB1

orachk is your friend

> Starting orachk with tfactl

> RAT = RAC Configuration Audit Tool (RACcheck)

orachk TFA – Best practices


oracle@oelrac1:[+ASM1] sudo yum install expect.x86_64

oracle@oelrac1:[+ASM1] export RAT_CRS_HOME=/u01/app/12.2.0.1/grid

oracle@oelrac1:[+ASM1] export CRS_HOME=/u01/app/12.2.0.1/grid

oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> orachk

CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/12.2.0.1/grid?[y/n][y]y


oswbb is your friend

> Collect OS statistics and generate graphs

> Starting oswbb with tfactl

oswbb - OSWatcher TFA – Best practices


oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> run oswbb

Enter 1 to Display CPU Process Queue Graphs

Enter 2 to Display CPU Utilization Graphs

Enter 3 to Display CPU Other Graphs

Enter 4 to Display Memory Graphs

Enter 5 to Display Disk IO Graphs

Please Select an Option:4


prw is your friend

> Deploy it to all nodes in the cluster

> Monitor database and clusterware processes

> For debugging clusterware processes you need

> Linux: gdb

> Solaris: pstack

> AIX: procstack or dbx

> HP-UX: gdb64

prw - Procwatcher TFA – Best practices


oracle@oelrac1:/home/oracle/ [+ASM1] sudo yum install -y gdb

oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl

tfactl> prw deploy

Registering clusterware resource

SETTING UP NODE oelrac1

SETTING UP NODE oelrac2

This will create new cluster resources

oracle@oelrac2:/home/oracle/ [grid12201] crsctl stat res -t

procwatcher

1 ONLINE ONLINE oelrac2 STABLE

2 ONLINE ONLINE oelrac1 STABLE

prw is your friend

> You can provide the SID list (The default is derived)

> Restart and log

> Pack prw files for uploading to support


[oracle@oelrac1 tmp]# grep SID /u01/app/oracle/tfa/repository/suptools/prw/root/prwinit.ora | egrep -v "^#"

SIDLIST=DB1_1

oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> prw stop

tfactl> prw start

tfactl> prw log

oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> prw pack

prw is best for

> Session level hangs

> Severe contention in the database

> Instance evictions

> Clusterware or DB process stuck

> ORA-4031 and SGA memory issues

> ORA-4030 and DB process memory issues

> RMAN slow performance issues during backup


Avoid: "Please upload logfile xx", "Please upload logfile yy"

This will result in

Collection for Service Requests TFA – Best practices


oracle@oelrac1:/home/oracle/ [+ASM1] kill -l | grep SIGSEGV

11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM

oracle@oelrac1:/home/oracle/ [+ASM1] ps -ef | grep dbw | grep DB1

oracle 14821 1 0 08:20 ? 00:00:00 ora_dbw0_DB1_1

oracle@oelrac1:/home/oracle/ [+ASM1] kill -11 14821

ORA-07445: exception encountered: core dump [semtimedop()+10] [SIGSEGV] [ADDR:0xD43100006015] [PC:0x7F746C15DFCA] [unknown code] []

All files you need for the SR are in the referenced zip file


oracle@oelrac1:/home/oracle/ [+ASM1] tfactl diagcollect -srdc ora7445

Enter the time of the ORA-07445 [YYYY-MM-DD HH24:MI:SS,<RETURN>=ALL] :

Enter the Database Name [<RETURN>=ALL] : DB1

1. May/24/2017 09:39:52 : [db1] ORA-07445: exception encountered: core dump [semtimedop()+10] [SIGSEGV] [ADDR:0xD43100006015] [PC:0x7F746C15DFCA] [unknown code] []

Please choose the event : 1-1 [1] 1

Logs are being collected to:

/u01/app/oracle/tfa/repository/srdc_ora7445_collection_Wed_May_24_09_43_07_CEST_2017_node_local

/u01/app/oracle/tfa/repository/srdc_ora7445_collection_Wed_May_24_09_43_07_CEST_2017_node_local/oelrac1.tfa_srdc_ora7445_Wed_May_24_09_43_07_CEST_2017.zip
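When several collections pile up in the repository, the newest zip file is usually the one to attach to the Service Request. A minimal sketch using find and ls; the function name latest_collection is hypothetical, and with a very large number of zip files find may split the work across several ls calls, in which case the result is only the newest of the first batch.

```shell
#!/bin/sh
# Hypothetical helper: print the most recently modified .zip below a TFA
# repository directory.
latest_collection() {
    repo=$1
    # 'ls -t' sorts by modification time, newest first; head keeps the top entry.
    find "$repo" -type f -name '*.zip' -exec ls -t {} + 2>/dev/null | head -n 1
}

# Example call (repository path taken from the output above):
#   latest_collection /u01/app/oracle/tfa/repository
```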

A menu driven interface

DA/RDA TFA – Best practices


oracle@oelrac1:/home/oracle/ [+ASM1] tfactl

tfactl> run darda


Small demo


Conclusion


Do not rely on the TFA that comes with the database or clusterware installation

> You will miss some tools

Install the complete TFA support bundle from MOS note 1513912.1

When you need to troubleshoot issues in your Oracle stack, TFA is there to help you

When you need to create Service Requests, use TFA to bundle all the required files

TFA – Conclusion


We look forward to working with you!


Any questions? Please do ask
