Session: Problem Determination and Database monitoring using db2pd Jorge Daniel Vaquero Jairo Balart DAMA - UPC


Post on 18-Jan-2016


Page 1: LUW 4 DAMA-UPC IBM Db2pd Monitoring

Session:

Problem Determination and Database monitoring using db2pd

Jorge Daniel Vaquero

Jairo Balart

DAMA - UPC


"What's happening in the engine”

Monitoring scenarios

– Presented to better understand requirements

Tools available in v8.2 and v9.1

– System monitor

– db2pd

db2pd: tool to monitor and troubleshoot DB2

– Standalone utility shipped with the DB2 engine starting with DB2 v8.2

– Used by customers to monitor and troubleshoot

– Gives the user a closer view into the DB2 engine

Advantages of using db2pd

– Tool collects information without acquiring any latches or using any engine resources, which has two major benefits

– Faster retrieval

– No competition for engine resources


Determine whether instance or database is up

"db2pd -" tells whether instance is up and for how long

db2pd -

Database Partition 0 -- Active -- Up 4 days 05:34:53

"db2pd -db <database> -" tells how long the database has been active

db2pd -db sample -

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 19:11:07

db2pd -alldbs -

Database Partition 0 -- Database XMLDB -- Active -- Up 0 days 19:11:25

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 19:11:43
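The status lines above are easy to post-process in a script. The following is a minimal sketch, assuming the status line keeps the " -- " separated layout shown on this slide; the function name db_status is hypothetical and not part of db2pd.

```shell
#!/bin/sh
# Summarise db2pd status lines as "<database or INSTANCE> <state> <uptime>".
# Reads db2pd output on stdin; only "Database Partition ..." lines are used.
# Assumption: fields are separated by " -- " exactly as on the slide.
db_status() {
  awk -F' -- ' '/^Database Partition/ {
    if (NF == 4) {               # database-level line
      name = $2; sub(/^Database /, "", name); state = $3; up = $4
    } else {                     # instance-level line (no database field)
      name = "INSTANCE"; state = $2; up = $3
    }
    sub(/^Up /, "", up)
    print name, state, up
  }'
}

# Usage (needs a running DB2 instance): db2pd -alldbs - | db_status
```

Fed the SAMPLE line above, the function prints the database name, its state and the uptime on one line, which is convenient for a cron-driven availability check.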


Determine whether database is up (cont'd)

Attempting to take an offline backup of an activated database

db2 backup db sample to /home/db2inst1

SQL1035N The database is currently in use. SQLSTATE=57019

db2diag.log will contain

2006-01-04-02.04.17.162649-300 I22772A342 LEVEL: Error

PID : 6275140 TID : 1 PROC : db2bp

INSTANCE: db2inst1 NODE : 000

FUNCTION: DB2 UDB, database utilities, sqlubConnectDatabase, probe:1259

DATA #1 : Hexdump, 4 bytes

0x0FFFFFFFFFFFF490 : FFFF FBF5

The db2 list applications command will not help here

– it only tells you whether any applications are connected to the database, not whether the database itself is activated


Monitoring progress and behavior of DB2 agents

db2pd -agents [db=<database>] [ [agent=<agentid>] | [application=<appid>] ]

Agents:

Current agents: 8

Idle agents: 5

Active agents: 2

Coordinator agents: 2

Address            AppHandl [nod-index] AgentPid Priority Type  State       ClientPid Userid   ClientNm Rowsread Rowswrtn LkTmOt DBName
0x0780000001A35540 0        [000-00000] 2809968  0              Idle        n/a       n/a      n/a      0        0        NotSet n/a
0x0780000001A16A00 1406     [000-01406] 1134686  0        Coord Inst-Active 598024    dabrashk db2bp    2        0        NotSet SAMPLE
0x0780000001A17460 581      [000-00581] 884778   0        Coord Inst-Active 2969776   dabrashk db2bp    14       0        NotSet SAMPLE
0x0780000001A34AE0 0        [000-00000] 2076776  0        Coord Pooled      n/a       n/a      n/a      0        0        NotSet SAMPLE


SQL1226N due to too many db2agent processes

If many db2agent processes remain attached to an instance

– They may use up all of the agents (MAXAGENTS)

– New connections will trigger SQL1226N

– “The maximum number of client connections are already started”

– ADM7009E error message will be logged in the notify log

Use "db2pd -agents" to find AppHandl, ClientPid, Userid and ClientNm

db2pd -agents | awk '/Address|^0x/ { print $2, $8, $9, $10}'

AppHandl ClientPid Userid ClientNm

643 3612794 dabrashk db2bp

642 3227856 dabrashk db2bp

Force the identified AppHandl off with command

db2 "force application (642)"


Repeat Option: db2pd -rep [num sec] [count]

db2pd -age -rep 10 3

Database Partition 0 -- Active -- Up 0 days 00:10:42

AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    18943    23998    SAMPLE
13       1085632  1675376   jmcmahon db2bp    3489     9898     SAMPLE

AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    98941    100091   SAMPLE
13       1085632  1675376   jmcmahon db2bp    4100     9898     SAMPLE

AppHandl AgentPid ClientPid Userid   ClientNm Rowsread Rowswrtn DBName
120      1605826  1130588   jmcmahon db2bp    100999   230448   SAMPLE
13       1085632  1675376   jmcmahon db2bp    6777     9898     SAMPLE

The repeat option is handy for watching activity. The example above watches the agents' reads and writes

Combine the -repeat option with file redirection

– db2pd -age file=agents.out -rep 10 3

Combine multiple options and use mixed scope options

– db2pd -db sample -loc -tra -age -fil lock.txt

– Use -file for multiple options


Monitoring progress and behavior of applications

db2pd -applications -db sample

Use to map application to a coordinator agent

Use to determine the status of an application

Use to map application to the dynamic SQL statement

db2pd -activestatements

– Any statement that is part of the active statement list is reported

– Gives the user the ability to identify all active dynamic statements for all applications

Use to map application ID to IP address and port

– Simplified in v9.1: no need to convert


Monitoring currently executing dynamic SQL statements

db2pd -db sample -activestatements

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:43:12

Active Statement List:

Address AppHandl [nod-index] UOW-ID StmtID AnchID StmtUID EffISO EffLockTOut EffDegree StartTime LastRefTime

0x0780000020A405A0 51 [000-00051] 7 1 73 1 1 -2 0 Sat Jan 7 00:17:43 2006 Sat Jan 7 00:17:43 2006

0x07800000209FB8A0 44 [000-00044] 1 2 44 1 1 -2 0 Sat Jan 7 00:10:12 2006 Sat Jan 7 00:10:12 2006

db2pd -db sample -dyn

Dynamic SQL Statements:

Address AnchID StmtUID NumEnv NumVar NumRef NumExe Text

0x0780000020A08660 43 1 2 2 4 3 create table staff3 like staff

0x0780000020A076C0 44 1 1 1 1 1 select name from staff

Monitoring SQL statements executed on the instance

– sqltext script

– Uses insert time as reported by the 'db2pd -dynamic' SQL Variations section.


Monitoring transactions: db2pd -transactions

Transactions:
Address            AppHandl [nod-index] TranHdl Locks State Tflag      Tflag2     Firstlsn       Lastlsn        LogSpace SpaceReserved TID            AxRegCnt GXID
0x078000002024DA80 1000     [000-01000] 2       1020  READ  0x00000000 0x00000000 0x000177000C00 0x0000017FF000 1230     10000         0x000000006614 1        n/a
0x078000002024E780 1004     [000-01004] 3       35    WRITE 0x00000000 0x00000000 0x0001801CF000 0x000001999600 115      10000         0x000000006627 1        n/a

Useful for determining the amount of resources a transaction is using

– db2pd -transactions provides

– number of locks

– first lsn, last lsn

– Log Sequence Number represents relative byte address, within the database log, for the first byte of the log record

– logspace used (in pages)

– space reserved (in pages)

Monitor the progress and behavior of any transaction


Monitoring application progress

Identifying slow or hanging applications

Monitor rows read and written for agents

– db2pd -agents | awk '/Address|^0x/ { print $2, $11, $12, $14;}'

AppHandl Rowsread Rowswrtn DBName

51 109 58 SAMPLE

44 46 0 SAMPLE

9 0 0 n/a

8 126 107 SAMPLE

0 0 0 SAMPLE

– Use AppHandl to determine the application to take action on

– Use db2pd -dynamic to find SQL statement

– Use db2pd -static to find package

– Use AppId to find out application's IP address and port number (if applicable)


Monitoring application progress (cont'd)

Monitor start time of statements in db2pd -activestatements

– db2pd -activestat -db sample | awk '/Address/ { print $2,$6,$7,$11,$12 } /^0x/ {print $2,$6,$7,substr($0,115);}'

AppHandl AnchID StmtUID StartTime LastRefTime

76 44 1 Sat Jan 7 17:57:35 2006 Sat Jan 7 17:57:35 2006

– Use AnchID and StmtUID to identify SQL statement

– Use AppHandl to identify application

Find “non-committing” transactions

– Use the first and last LSN (log sequence number) of the transaction

• db2flsn executable can be used to identify the log file for a specific lsn.

– db2pd -trans -db sample | awk '/Address|^0x/ { print $2,$9,$10;}'

AppHandl Firstlsn Lastlsn

76 0x0001801CF000 0x000202999600

44 0x000177000C00 0x000177000C00

Find biggest lockers

– Use AppHandl and Locks fields of db2pd -transactions

– db2pd -trans -db sample | awk '/^0x/ { print $5, $2}' | sort -rn

7 76

4 74

2 44


Monitoring OS: db2pd -osinfo

Operating System Information:

OSName: AIX

NodeName: mymachine

Version: 5

Release: 2

Machine: 000ABCD123

CPU Information:

TotalCPU OnlineCPU ConfigCPU Speed(MHz) HMTDegree

4 4 4 1453 1

Physical Memory and Swap (Megabytes):

TotalMem FreeMem AvailMem TotalSwap FreeSwap

16384 10201 n/a 16384 16366

Virtual Memory (Megabytes):

Total Reserved Available Free

40960 n/a n/a 38907

Shared Memory Information:

ShmMax ShmMin ShmIds ShmSeg

68719476736 1 131072 0


Using db2pd to diagnose hangs


Useful DB2 tools for hangs

The “db2pd” tool

– Purpose: To gather information quickly and non-intrusively from the DB2 engine

The “db2cos” tool

– Purpose: To be called “inline” from DB2 code to collect information about problems

Latch tracking

– Purpose: To track latch ownership

Snapshot monitoring

– Purpose: To understand point in time status of queries or entities in the DB2 engine

DB2 trace

– Purpose: To investigate possible movement in DB2 agents


Troubleshooting deadlocks and lock wait timeouts

For transient/infrequent deadlocks or lock wait timeouts

– These can last a short period of time and need to be caught quickly

– db2pdcfg -catch uses inline code to trigger an action immediately

– Default (primary) action is to run the DB2 call out script (db2cos)

db2pdcfg -catch locktimeout count=1

Error Catch #1

Sqlcode: 0

ReasonCode: 0

ZRC: -2146435004

ECF: 0

Component ID: 0

LockName: Not Set

LockType: Not Set

Current Count: 0

Max Count: 1

Bitmap: 0x4A1

Action: Error code catch flag enabled

Action: Execute sqllib/db2cos callout script

Action: Produce stack trace in db2diag.log


Collecting call stacks in v9.1

Used to determine what DB2 agent is doing

Call stack collection has been sped up significantly in v9.1

– Collection of stack traces (trap files) is decoupled from binary dump files

Produce stack trace for all PIDs or chosen PID

– db2pd -stack [all|<pid>]

– Produces trap file(s) in the DIAGPATH directory

Produce dump file and stack trace for all PIDs or chosen PID

– db2pd -dump [all|<pid>]


db2pd -latches option (v9.1)

Latch tracking is always-on in v9.1

db2pd -latches

Latches:

Address            Holder  Waiter Filename       LOC  LatchType
0x07800000203C40C8 1695804 0      sqldpool.C     529  SQLO_LT_SQLB_CLNR_PAUSE_CB__preventSuspendLatch
0x07800000203C4190 1695804 0      sqlbpool.C     2254 SQLO_LT_SQLB_PTBL__pool_table_latch
0x07800000203C5678 1695804 0      sqlbistorage.h 5169 SQLO_LT_SQLB_PTBL__ptfLatches

– Holder is agent process ID

– LatchType is a latch identifier

To group latches by holders and waiters

– db2pd -latches group

Latch Holders:

Address Holder Filename LOC LatchType

0x07800000204A8A48 1695804 sqlbilatch.C 1150 SQLO_LT_SQLB_POOL_CB__writeLatch

Latch Waiters:

Address Waiter Filename LOC LatchType


Detecting hangs for specific applications

Take a few application snapshots a minute apart to determine

– what the status of the application is (db2pd -app: Status) and

– whether any work is being done (db2pd -agent: Rowsread/Rowswrtn)

– It is useful to have turned on all of the monitor switches prior to a re-creatable or recurring problem scenario

Use db2pd -app to determine status of an application

– db2pd -app -db sample | awk '/Address|^0x/ { print $6 }'

If status is UOW Waiting, the hang is not occurring at the DB2 server

– The client application should be investigated to find out what it is waiting for.

– In DPF environment, this may indicate a problem with another partition

If status is Executing and counters like rows-read/written are increasing, it is likely a performance issue

If status is Lock-wait then it is a locking/concurrency issue

– Exception is the case when the application being waited on is in UOW Executing and making no progress

If status is Executing yet no counters are increasing, then the agent or agents servicing the application may be in an abnormal state

– More diagnostics are needed
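The triage rules on this slide can be written down as a small decision function. This is only a sketch: the function name classify_app is hypothetical, and the exact status strings (UOW-Waiting, UOW-Executing, Lock-wait) are assumptions about how db2pd -app spells them in a given release, so adjust them to the real output.

```shell
#!/bin/sh
# classify_app STATUS ROWS_BEFORE ROWS_AFTER
# STATUS      : Status column from "db2pd -app" (spelling assumed, see above)
# ROWS_BEFORE : Rowsread+Rowswrtn from "db2pd -agents" at the first sample
# ROWS_AFTER  : the same counter about a minute later
classify_app() {
  status=$1; before=$2; after=$3
  case "$status" in
    UOW-Waiting)
      echo "client-side: investigate what the application is waiting for" ;;
    Lock-wait)
      echo "locking/concurrency issue" ;;
    UOW-Executing)
      if [ "$after" -gt "$before" ]; then
        echo "likely a performance issue"
      else
        echo "possible abnormal agent state: collect more diagnostics"
      fi ;;
    *)
      echo "unclassified status: $status" ;;
  esac
}
```

The Lock-wait exception from the slide (the holder itself is executing but making no progress) is not covered; it would need a second call for the lock holder's application.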


Monitoring locks: lock contention

Monitor for slowdowns

– Important to figure out “who is waiting for whom”

Identify the lock owner to consider action of releasing the lock

db2pd -db sample -loc

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:00:26

Locks:
... TranHdl Lockname                   Type Mode Sts Owner ...
... 2       00020003000000040000000052 Row  ..X  G   2
... 3       00020003000000040000000052 Row  ..X  W   2

Look for the “W” status for waiters

– Transaction 2 is holding the lock that transaction 3 is waiting on

db2pd -db sample -locks wait

– Show locks with a wait status and their waiter (v8.2 FP9)

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 01:17:17

Locks:
... TranHdl Lockname                   Type Mode Sts Owner Dur HldCnt Att Rlse
... 2       000200040000000D0000000052 Row  .NS  W   3     1   0      0   0x0
... 4       00020003000000270000000052 Row  .NS  W   0     1   0      0   0x0
... 2       00020003000000270000000052 Row  ..X  G   2     1   0      8   0x40

Look for the “G” status for holders
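Matching waiters to holders by hand gets tedious with many locks. The following is a minimal sketch of the "who is waiting for whom" pairing; it assumes the input has already been trimmed to the columns shown on this slide (TranHdl, Lockname, Type, Mode, Sts, ...), since the slide elides the leading columns of the real output. The function name lock_waiters is hypothetical.

```shell
#!/bin/sh
# Pair each waiting transaction (Sts = W) with the transaction that holds
# the same lock name (Sts = G). Reads trimmed lock rows on stdin:
#   TranHdl Lockname Type Mode Sts ...
lock_waiters() {
  awk '
    $5 == "G" { holder[$2 ""] = $1 }                 # lock name -> holder
    $5 == "W" { n++; waiter[n] = $1; lock[n] = $2 "" }
    END {
      for (i = 1; i <= n; i++)
        if (lock[i] in holder)
          print "transaction " waiter[i] " waits on transaction " \
                holder[lock[i]] " (lock " lock[i] ")"
    }'
}
```

Waiters whose holder row is not in the input are skipped silently, which is what happens to the first row of the -locks wait example above.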


Monitoring locks (cont'd)

Lockname <==> hex representation of the physical object that is being waited on

db2pd -db sample -loc showlocks

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:10:42

Locks:
... Lockname                   Type ...
... 000200030000001A0000000052 Row        TbspaceID 2 TableID 3 RecordID 0x1A
... 000200030000001F0000000052 Row        TbspaceID 2 TableID 3 RecordID 0x1F
... 53514C4332453036C8324ABC41 Internal P Pkg UniqueID 53514c43 32453036 Name c8324abc
... 53514C4332453036C8324ABC41 Internal P Pkg UniqueID 53514c43 32453036 Name c8324abc
... 0002000D000000050000000052 Row        TbspaceID 2 TableID 13 RecordID 0x5
... 00020003000000000000000054 Table      TbspaceID 2 TableID 3
... 0002000D000000000000000054 Table      TbspaceID 2 TableID 13

'showlocks' suboption will expand the lockname into meaningful explanations

To determine who is holding a lock in your database

– db2pd -database sample -locks -transactions -agents -file lock.txt

– -agents will contain UserID for transaction handle that is holding a lock (status is G (granted))

Map lock info to a table name

– Use TableID from Lockname or showlocks output

db2pd -tcbstats -db sample | awk '/Address|^0x/ { print $2,$3,$4}'

TbspaceID TableID TableName

0 18 SYSROUTINES

0 81 SYSROUTINEPROPERTI

2 4 DEPARTMENT


Monitoring buffer pool

Determine whether we are spending time flushing buffers

– due to space constraint or poor allocation of pools

– It's needed to identify areas for tuning

db2pd -buffer -db sample

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:00:41

Bufferpools:

First Active Pool ID 1h

Max Bufferpool ID 1

Max Bufferpool ID on Disk 1

Num Bufferpools 5

Address            Id Name         PageSz PA-NumPgs BA-NumPgs BlkSize NumTbsp PgsToRemov CurrentSz PostAlter SuspndTSCt
0x078000002034A8C0 1  IBMDEFAULTBP 4096   1000      0         0       3       0          1000      1000      0


Buffer pool statistics (v9.1)

db2pd -db sample -bufferpools

– DatLRds (DatPRds): number of logical (physical) data page reads for this bufferpool

– Hit ratio for data pages given the above logical and physical reads

– Same for index pages: IdxLRds, IdxPRds, HitRatio

Bufferpool Statistics for all bufferpools (when BUFFERPOOL monitor switch is ON):

BPID DatLRds DatPRds HitRatio TmpDatLRds TmpDatPRds HitRatio IdxLRds IdxPRds HitRatio TmpIdxLRds TmpIdxPRds HitRatio
1    78      22      71.79%   0          0          00.00%   100     58      42.00%   0          0          00.00%

BPID DataWrts IdxWrts DirRds DirRdReqs DirRdTime DirWrts DirWrtReqs DirWrtTime
1    0        0       42     5         0         0       0          0

BPID AsDatRds AsDatRdReq AsIdxRds AsIdxRdReq AsRdTime AsDatWrts AsIdxWrts AsWrtTime
1    0        0          0        0          0        0         0         0

BPID TotRdTime TotWrtTime VectIORds VectIOReq BlockIORds BlockIOReq PhyPgMaps FilesClose NoVictAvl UnRdPFetch
1    104       0          0         0         0          0          0         8          0         0
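The HitRatio columns are consistent with the usual formula (1 - physical reads / logical reads) x 100, which reproduces both 71.79% (78 logical, 22 physical) and 42.00% (100 logical, 58 physical) from the sample output. A minimal sketch for recomputing it from saved counters follows; the function name hit_ratio is hypothetical, and printing n/a for zero logical reads is a choice of this sketch (db2pd itself shows 00.00%).

```shell
#!/bin/sh
# hit_ratio LOGICAL_READS PHYSICAL_READS
# Prints the buffer pool hit ratio as a percentage with two decimals.
hit_ratio() {
  awk -v l="$1" -v p="$2" 'BEGIN {
    if (l + 0 == 0) { print "n/a"; exit }   # avoid division by zero
    printf "%.2f\n", (1 - p / l) * 100
  }'
}

# Usage: hit_ratio 78 22
```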


Buffer pool statistics (v9.1) (cont)

Filtering bufferpools output by bufferpool ID

– -bufferpools <bpID>

db2pd -db sample -bufferpools 4099

Address            Id   Name           PageSz PA-NumPgs BA-NumPgs BlkSize NumTbsp PgsToRemov CurrentSz PostAlter SuspndTSCt
0x07800000203C99A0 4099 IBMSYSTEMBP32K 32768  16        0         0       0       0          16        16        0

Bufferpool Statistics for bufferpool 4099 (when BUFFERPOOL monitor switch is ON):


db2pd -pages (v9.1)

db2pd -db sample -pages

– Pages for all bufferpools

db2pd -db sample -pages [<bpID>]

Monitor bufferpool behavior

– Tells which pages are in the bufferpool

– Use to determine which bufferpool contents are causing the hit ratio to be lower than you expect

Allows user to check how many pages each object (table, index, etc) has within any particular bufferpool

Similar to IDS's onstat -b option

Might help to detect a problem, such as an insufficient number of buffers in the buffer pool or high “read aheads”


db2pd -pages: example (v9.1)

db2pd -db sample -page

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:01:01

Bufferpool Pages:
First Active Pool ID 1
Max Bufferpool ID 1
Max Bufferpool ID on Disk 1
Num Bufferpools 5

Pages for all bufferpools:
Address            BPID TbspaceID TbspacePgNum ObjID ObjPgNum ObjClass ObjType Dirty Prefetched
0x07800000204DD040 1    0         6            19    6        Perm     Index   N     N
0x07800000204DD0F0 1    0         0            19    0        Perm     LOBA    N     N
0x07800000204DD1A0 1    0         0            19    0        Perm     Index   N     N
0x07800000204DD250 1    0         1            19    1        Perm     Index   N     N
0x07800000204DD300 1    0         2            19    2        Perm     Index   N     N
0x07800000204DD3B0 1    0         0            1     0        Perm     Data    N     N
0x07800000204DD460 1    0         4            19    4        Perm     Index   N     N
0x07800000204DD510 1    0         0            19    0        Perm     Data    N     N
0x07800000204DDB40 1    0         1            87    1        Perm     Index   N     N
0x07800000204DDBF0 1    0         7            19    7        Perm     Index   N     N
0x07800000204DDCA0 1    0         8            19    8        Perm     Index   N     N
0x07800000204DDD50 1    0         9            19    9        Perm     Index   N     N
0x07800000204DDE00 1    0         10           19    10       Perm     Index   N     N
<snip>
Total number of pages: 80

Summary info for all bufferpools:…


Monitoring for SQL errors: db2pdcfg -catch

Functionality moved from db2pd to db2pdcfg in v9.1

Purpose

– allow the user to catch any sqlcode (and reason code), ZRC or ECF codes (internal error codes)

– capture the information needed to diagnose the error

Primary action: execute the db2cos (callout script)

– template db2cos file is located in sqllib/bin (v9.1)

– db2cos may be altered to run any command (db2pd, OS or other) needed to solve the problem (default is 'db2pd -db $database' in the "SQLCODE" section)

Defaults

– Up to 10 catch points simultaneously

– “Error catch array is full. Use 'clear' suboption to clear an element"

– Up to 255 invocations

– Max Count: 255


Index Statistics: db2pd -tcbstats all

Useful for performance tuning

Numerous statistics are reported to provide characteristics about each index's use

– Only database activation/deactivation will reset these statistics

"db2pd -tcbstats all" or "db2pd -tcbstats index"

TCB Index Stats:

TbspaceID TableID TableName SchemaNm ID RootSplits Scans KeyUpdates Merg

0 2 SYSTABLES SYSIBM 6 0 0 0 0

0 2 SYSTABLES SYSIBM 5 0 0 0 0

Scans

– The number of scans against the index


Detect full-table scan vs index scans

Detect full-table scans for every table

– Use 'db2pd -tcb -db <dbname>'

db2 "select * from employee"

db2pd -tcb -db sample | awk '/TCB Table Stats/ { found =1} found==1 { print}' | grep -i employee | awk '{print "Scans: ", $3}'

Detect number of index scans

– Use 'TCB Index Stats' portion of the 'db2pd -tcb index -db <dbname>' output

db2 "CREATE INDEX LNAME ON EMPLOYEE (LASTNAME ASC)"

db2 "select * from employee"

db2pd -tcb index -db sample | awk '/TCB Index Stats/ { found =1} found==1 { print}' | grep -i employee | awk '$8 > 0 {print "Index Scans: ", $8}'


DB2 table access ratio

db2pd can help determine the access frequency of each table

– Operations such as select, update, insert and delete

– Reported by db2pd -db <dbname> -tcbstats

Identify tables with most inserts done to them

db2pd -db sample -tcbstats | awk '/TCB Table Stats/ { found =1} found==1 { print}' |

awk '/^0x/ { print $9, $2}' | sort -rn | head -5

36 STAFF3

32 EMPLOYEE

20 PROJECT

7 SYSCOLUMNS

1 SYSUSERAUTH


Monitoring progress of transaction logging

By watching the Pages Written output, you can determine whether the log usage is progressing

db2pd -logs -db sample

Logs:
Current Log Number 4
Pages Written 464

Address            StartLSN       State      Size Pages Filename
0x000000022022FEB8 0x000000FA0000 0x00000000 1000 597   S0000000.LOG
0x000000022022FF78 0x000001388000 0x00000000 1000 5     S0000001.LOG
0x0000000220008E78 0x000001770000 0x00000000 1000 3     S0000002.LOG
0x0000000220A57F58 0x000001B58000 0x00000000 1000 1000  S0000003.LOG
0x0000000220A32598 0x000001F40000 0x00000000 1000 1000  S0000004.LOG

Monitor amount of log space consumed over the course of 10 minutes

– db2pd -db SAMPLE -logs -repeat 60 10


Monitoring log usage (FP9 enhancements)

db2pd -logs has some new information since v8.2.2:

Logs:
Current Log Number 5
Pages Written 846
Method 1 Archive Status Success
Method 1 Next Log to Archive 5
Method 1 First Failure n/a
Method 2 Archive Status Success
Method 2 Next Log to Archive 5
Method 2 First Failure n/a

Address            StartLSN       State      Size Pages Filename
0x000000023001BF58 0x000001B58000 0x00000000 1000 1000  S0000002.LOG
0x000000023001BE98 0x000001F40000 0x00000000 1000 1000  S0000003.LOG
0x0000000230008F58 0x000002328000 0x00000000 1000 1000  S0000004.LOG

Two problems can be identified with this output

– Problem with archiving

– If Archive Status is set to Failure, the most recent log archive failed

– If First Failure is set, ongoing archive failure is preventing logs from archiving

– Log archiving is proceeding very slowly

– Next Log to Archive will be behind Current Log Number (this can cause the log path to fill up completely)

– Monitor 'Next Log to Archive' compared to 'Current Log Number'

– If next log is 3 and current is 5, then logs 3 and 4 haven't been archived yet

– Log 5 is the current log being written into
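The archive-lag check above is a simple range computation. The sketch below lists the log numbers still waiting to be archived, given the two values read from the db2pd -logs header; the function name pending_archive is hypothetical and the caller is assumed to have extracted the two numbers already.

```shell
#!/bin/sh
# pending_archive NEXT_LOG_TO_ARCHIVE CURRENT_LOG_NUMBER
# Prints one log number per line for every log that is full but not yet
# archived, i.e. NEXT_LOG_TO_ARCHIVE up to CURRENT_LOG_NUMBER - 1.
pending_archive() {
  next=$1; current=$2
  while [ "$next" -lt "$current" ]; do
    printf '%s\n' "$next"
    next=$((next + 1))
  done
}

# Usage: pending_archive 3 5    (lists logs 3 and 4, as in the slide's example)
```

An empty result means archiving has caught up with the current log; a list that keeps growing between samples points at the slow-archiving case.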


Monitoring Tablespaces and Containers

Single tablespace can be monitored

– db2pd -tab[lespaces] <tablespaceID> -rep[eat] <numSecs>

db2 "create tablespace dms1 managed by database using (file 'tbspace1' 1M)"

db2pd -db sample -tab | grep DMS1 | awk '{print $2, $15}'

6 DMS1

tablespace ID is 6

db2pd -db sample -tab 6 | perl -ane 'if (/TotalPgs * UsablePgs/ .. /^$/) { print "$F[2] $F[3]\n" } '

UsablePgs UsedPgs

224 96

db2 "insert into staff3 select * from staff"

… repeat …

db2pd -db sample -tab 6 | perl -ane 'if (/TotalPgs * UsablePgs/ .. /^$/) { print "$F[2] $F[3]\n" } '

UsablePgs UsedPgs

224 160


Verifying isolation level

Current isolation level of dynamic SQL statement

– “how can I tell what isolation level is being used ?”

– db2pd -db sample -dynamic

– ISO column in Dynamic SQL Environments section

Dynamic SQL Environments:

Address AnchID StmtUID EnvID Iso QOpt Blk

0x0780000020CEBC40 41 1 1 CS 5 B

0x0780000020BB5D80 42 2 1 CS 5 B

0x0780000020CE2FC0 235 1 1 RR 5 B

– db2pd -activestatements

– EffISO column

– 0=RR,1=CS,2=UR and 3=RS

“Will setting DB2_EVALUNCOMMITTED to ON help?”

– DB2_EVALUNCOMMITTED enables deferred locking

– available if application is using cursor stability or read stability
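The numeric EffISO values listed on this slide (0=RR, 1=CS, 2=UR, 3=RS) are easier to read once translated. A minimal lookup sketch, with the hypothetical function name effiso_name, that can be applied to the EffISO column of db2pd -activestatements:

```shell
#!/bin/sh
# effiso_name CODE
# Maps the EffISO code to its isolation level abbreviation:
#   RR = Repeatable Read, CS = Cursor Stability,
#   UR = Uncommitted Read, RS = Read Stability
effiso_name() {
  case "$1" in
    0) echo RR ;;
    1) echo CS ;;
    2) echo UR ;;
    3) echo RS ;;
    *) echo "unknown($1)" ;;
  esac
}

# Usage: effiso_name 1
```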


Monitoring memory usage

db2pd -memsets -mempools

reports statistics about DB2 Memory Sets and Memory Pools, which helps in understanding memory usage

Memory Sets:

Name Address Id Size Key DBP Type Ov OvSize

DBMS 0x0780000000000000 19398699 56639488 0x59FE1161 0 0 Y 8241152

FMP 0x0780000010000000 72220781 245284864 0x0 0 2 N 0

Trace 0x0770000000000000 28704896 134906824 0x59FE1174 0 -1 N 0

Memory Pools:

Address            MemSet PoolName Id Overhead LogSz   LogUpBnd  LogHWM  PhySz   PhyUpBnd  PhyHWM  Bnd BlkCnt CfgParm
0x0780000000001230 DBMS   fcmrqb   79 193888   1290312 1953864   1290312 1507328 1966080   1507328 Ovf 0      n/a
0x07800000000003C0 DBMS   eduah    72 2496     112032  112064    112032  114688  114688    114688  Ovf 0      n/a
0x07800000100003C0 FMP    undefh   59 48000    737400  245163840 737400  786432  245170176 786432  Phy 0      n/a


Monitoring memory usage (cont)

db2pd -memblocks

– Reports all memory blocks in DBMS set (list)

– Followed by the sorted 'per-pool' output

Memory blocks sorted by size for ostrack pool:

PoolID PoolName TotalSize(Bytes) TotalCount LOC File

57 ostrack 5160048 1 3047 698130716

57 ostrack 240048 1 3034 698130716

57 ostrack 240 1 2983 698130716

57 ostrack 80 1 2999 698130716

57 ostrack 80 1 2970 698130716

57 ostrack 80 1 3009 698130716

Total size for ostrack pool: 5400576 bytes

– Final section sorts the consumers of memory for the entire set

All memory consumers in DBMS memory set:

PoolID PoolName TotalSize(Bytes) %Bytes TotalCount %Count LOC File

57 ostrack 5160048 71.90 1 0.07 3047 698130716

50 sqlch 778496 10.85 1 0.07 202 2576467555

50 sqlch 271784 3.79 1 0.07 260 2576467555

57 ostrack 240048 3.34 1 0.07 3034 698130716

50 sqlch 144464 2.01 1 0.07 217 2576467555

69 krcbh 73640 1.03 5 0.36 547 4210081592

Report memory blocks for private memory on UNIX and Linux

– db2pd -memb pid=159770


Monitoring utilities: db2pd -utilities

Utilities:

ID Type DBName StartTime NumPhase CurPhase Desc

2 BACKUP SAMPLE Wed 12:35:00 Apr 28 2004 1 1 offlinedb

Progress:

ID PhaseNum StartTime CompletedWork TotalWork

2 1 Wed Apr 28 12:35:42 2004 22782661 bytes 24303325 bytes

Utility types: BACKUP, RUNSTATS, REORG, RESTORE, CRASH_RECOVERY, ROLLFORWARD_RECOVERY, LOAD, RESTART_RECREATE_INDEX

– db2 backup db sample

– db2 restore db sample

Use to monitor utilities' progress

– Determine whether to throttle or unthrottle BACKUP or RUNSTATS utility (using SET UTIL_IMPACT_PRIORITY)
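The CompletedWork and TotalWork columns of the Progress section give a percent-complete figure directly. A minimal sketch, assuming the caller has already extracted the two byte counts (the function name util_progress is hypothetical); a non-numeric TotalWork is reported as unknown rather than dividing by zero.

```shell
#!/bin/sh
# util_progress COMPLETED_WORK TOTAL_WORK
# Prints completed work as a percentage of total work, two decimals.
util_progress() {
  awk -v c="$1" -v t="$2" 'BEGIN {
    if (t + 0 <= 0) { print "unknown"; exit }   # zero or non-numeric total
    printf "%.2f%%\n", c * 100 / t
  }'
}

# Usage with the backup example above: util_progress 22782661 24303325
```

Sampling this together with the repeat option gives a rough completion rate, which helps when deciding whether throttling is worth changing.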


Monitoring Table Reorgs: db2pd -reorgs

Table reorg output reports the tablespace ID, table ID, table name, phases, counters, type (offline/online), start time, and end time

Table Reorg Stats:

TbspaceID TableID TableName MaxPhase Phase CurCount MaxCount Type

2 2 PDTEST 2 Replace 0 2 Offline

Phase field (only applies to offline table reorganization)

– The phase of the table reorganization: Sort, Build, Replace, InxRecreat

Status field (only applies to online table reorganization)

– status of an online table reorganization: Started, Paused, Stopped, Done, Truncat

– "Done" status indicates that the reorg utility has completed

Completion field

– success indicator for the table reorganization. Possible values:

– 0. The table reorganization completed successfully

– -1. The table reorganization failed


Recovery: db2pd -recovery

Recovery:
Recovery Status 0x00000401
Current Log S0000000.LOG
Current LSN 000000BB800C
Job Type ROLLFORWARD RECOVERY
Job ID 2
Job Description Database Rollforward Recovery
Invoker Type User
Total Phases 2
Current Phase 1

Progress:
PhaseNum Description StartTime               CompletedWork TotalWork
1        Forward     Wed May 5 10:48:09 2004 0 bytes       Unknown
2        Backward    NotStarted              0 bytes       Unknown

Monitoring recovery

db2pd -recovery shows several counters to make sure recovery is progressing:

– Current Log and Current LSN provide the log position

– CompletedWork counts the number of bytes completed thus far


Figuring out which application is using up your tablespace

Identify number of Inserts for table (here, temp table TEMP1)

– db2pd -tcbstats

TCB Table Stats:

Address            TbspaceID TableID TableName SchemaNm Scans Inserts ObjClass UDI DataSize
0x000000022094AA58 4         2       TEMP1     SESSION  0     124     Temp     0   1

Map to tablespace 4 in db2pd -tablespaces output:

Tablespaces:
Address            Id Type Content AS AR PageSize ExtentSize Auto Prefetch BufID BufIDDisk
0x0000000220942F80 4  DMS  UsrTmp  No No 4096     32         Yes  32       1     1

Containers:
Address            TspId ContainNum Type TotalPages UseablePgs StripeSet Container
0x0000000220377CE0 4     0          File 10000      9952       0         /export/home/jmcmahon/tempspace2a

Notice the space filling up by watching UseablePgs vs. TotalPages


Figuring out which application is using up your tablespace (2)

Identify the dynamic sql statement using a table called TEMP1

– db2pd -db sample -dyn

Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:13:06

Dynamic Cache:

Current Memory Used 1072198

Total Heap Size 1271398

Cache Overflow Flag 0

Number of References 7540

Number of Statement Inserts 3981

Number of Statement Deletes 3924

Number of Variation Inserts 2459

Number of Statements 57

Dynamic SQL Statements:

Address            AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x0000000220A08C40 78     1       2      2      3      2      declare global temporary table temp1 (c1 char(6)) not logged
0x0000000220A8D960 253    1       1      1      24     24     insert into session.temp1 values(TEST)


Figuring out which application is using up your tablespace (3)

Map this to -app output to identify the application

– db2pd -app -db sample

Applications:
Address            AppHandl [nod-index] NumAgents CoorPid Status      C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
0x0000000200661840 501      [000-00501] 1         11246   UOW-Waiting 0        0         253      1         *LOCAL.jmcmahon.050202160426

db2pd -agent output will show the number of rows written as verification

Address            AppHandl [nod-index] AgentPid Priority Type  State       ClientPid Userid   ClientNm Rowsread Rowswrtn LkTmOt DBName
0x0000000200698080 501      [000-00501] 11246    0        Coord Inst-Active 26377     jmcmahon db2bp    100999   230448   NotSet SAMPLE


Monitoring implicit temporary table space

Steps are different for the implicit temporary table

Use db2pd -tcbstats to identify tables with large numbers of inserts

TCB Table Information:
Address            TbspaceID TableID PartID MasterTbs MasterTab TableName          SchemaNm  ObjClass DataSize ...
0x0780000020CC0D30 1         2       n/a    1         2         TEMP (00001,00002) <30> <JMC Temp     2470     ...
0x0780000020CC14B0 1         3       n/a    1         3         TEMP (00001,00003) <31> <JMC Temp     2367     ...
0x0780000020CC21B0 1         4       n/a    1         4         TEMP (00001,00004) <30> <JMC Temp     1872     ...

TCB Table Stats:

Address TableName Scans UDI PgReorgs NoChgUpdts Reads FscrUpdates Inserts ...

0x0780000020CC0D30 TEMP (00001,00002) 0 0 0 0 0 0 43219 ...

0x0780000020CC14B0 TEMP (00001,00003) 0 0 0 0 0 0 42485 ...

0x0780000020CC21B0 TEMP (00001,00004) 0 0 0 0 0 0 0 ...

Notice large number of inserts for implicit temporary tables

– tables with the naming convention "TEMP (TbspaceID, TableID)"

– Identify the application doing the work

– values in the SchemaNm column have a naming convention of <AppHandl><SchemaNm>


Monitoring implicit temporary table space (cont)

Map that info to the used space for table space 1

– Use db2pd -tablespaces

– Notice the UsedPgs vs the UsablePgs in the table space statistics

Tablespace Configuration:
Address            Id Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC NumCntrs MaxStripe LastConsecPg Name
0x07800000203FB5A0 1  SMS  SysTmp  4096   32       Yes  320      1     1         On  10       0         31           TEMPSPACE1

Tablespace Statistics:
Address            Id TotalPgs UsablePgs UsedPgs PndFreePgs FreePgs HWM State      MinRecTime NQuiescers
0x07800000203FB5A0 1  6516     6516      6516    0          0       0   0x00000000 0          0

Tablespace Autoresize Statistics:

Address Id AS AR InitSize IncSize IIP MaxSize LastResize LRF

0x07800000203FB5A0 1 No No 0 0 No 0 None No

Identify the application handles 30 and 31

– Seen in the -tcbstats output

– db2pd -app

Map this to the Dynamic SQL using db2pd -dyn
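
One way to put a number on the UsedPgs vs UsablePgs comparison is to pair the header and the data row of the Tablespace Statistics output. This is an illustrative Python sketch, not a DB2 tool: the two strings are copied from the sample output on this slide, and the variable names are assumptions.

```python
# Header and data row copied from the "Tablespace Statistics" sample output.
header = ("Address Id TotalPgs UsablePgs UsedPgs PndFreePgs FreePgs "
          "HWM State MinRecTime NQuiescers").split()
row = "0x07800000203FB5A0 1 6516 6516 6516 0 0 0 0x00000000 0 0".split()

# Pair each column name with its value.
stats = dict(zip(header, row))

used = int(stats["UsedPgs"])
usable = int(stats["UsablePgs"])
pct_used = 100.0 * used / usable

print(f"table space {stats['Id']}: {pct_used:.1f}% of usable pages in use")
```

For the sample row every usable page is in use, which is why the implicit temporary tables for application handles 30 and 31 are worth tracing.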

db2pd -fmp command

Returns information about the process in which the fenced routines are executed

db2pd -fmp

FMP:
Pool Size: 1
Max Pool Size: 250
Keep FMP: YES
Initialized: YES
Trusted Path: /home/dabrashk/sqllib/function/unfenced
Fenced User: nobody

FMP Process:
Address            FmpPid  Bit Flags      ActiveThrd PooledThrd Active
0x07800000007A5FE0 6455492 64  0x00000000 0          0          No

Active Threads:
Address FmpPid EduPid ThreadId
No active threads.

Pooled Threads:
Address FmpPid ThreadId
No pooled threads.

db2pd -fmp (cont'd)

Useful flags (StateFlags field)

– 0x00000001 - JVM initialized

– 0x00000002 - Is threaded

– 0x00000004 - Used to run federated wrappers

– 0x00000008 - Used for Health Monitor

– 0x00000010 - Marked for shutdown and will not accept new tasks

– 0x00000020 - Marked for cleanup by db2sysc

– 0x00000040 - Marked for agent cleanup

– 0x00000100 - All ipcs for the process have been removed

– 0x00000200 - .NET runtime initialized

– 0x00000400 - JVM initialized for debugging

– 0x00000800 - Termination flag

ActiveThrd - Number of active threads running in the fmp process.

PooledThrd - Number of pooled threads held by the fmp process.

Active - Active state of the fmp process. Values are Yes or No.
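
The Flags column is a bit mask, so a value can combine several of the flags listed above. The following Python sketch decodes such a value. The table is transcribed from this slide; the function name decode_fmp_flags is an assumption, and this is not a DB2 utility.

```python
# Flag values and descriptions transcribed from the slide above.
FMP_FLAGS = {
    0x00000001: "JVM initialized",
    0x00000002: "Is threaded",
    0x00000004: "Used to run federated wrappers",
    0x00000008: "Used for Health Monitor",
    0x00000010: "Marked for shutdown and will not accept new tasks",
    0x00000020: "Marked for cleanup by db2sysc",
    0x00000040: "Marked for agent cleanup",
    0x00000100: "All ipcs for the process have been removed",
    0x00000200: ".NET runtime initialized",
    0x00000400: "JVM initialized for debugging",
    0x00000800: "Termination flag",
}

def decode_fmp_flags(value):
    """Return the descriptions of all flag bits set in value."""
    return [desc for bit, desc in FMP_FLAGS.items() if value & bit]

# The sample output above shows 0x00000000, i.e. no flags set.
print(decode_fmp_flags(int("0x00000000", 16)))
# A threaded process with an initialized JVM would show 0x00000003.
print(decode_fmp_flags(0x00000003))
```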

-fcm command improvements in v9.1

Use the new hwm option to see historical information about applications that consume large amounts of fast communication manager (FCM) resources

– Retrieve high-watermark consumptions of FCM buffers and channels by applications since the start of the DB2 instance

– The high-watermark consumption values of applications are retained even if they have disconnected from the database already

The output will now contain FCM channel usage statistics, including the high and low water mark values with respect to the number of channels used.

DB2 Troubleshooting and Problem Determination

Resources

– DB2 Monitoring and Troubleshooting: db2pd tool

– http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.admin.doc/doc/r0011729.htm

– Problem Determination Guide

– http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.rn.doc/doc/c0023244.htm

– What's new for V9.1: Troubleshooting and problem determination enhancements

– http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.rn.doc/doc/c0023244.htm

– DB2 Product Support site

– http://www-306.ibm.com/software/data/db2/udb/support/index.html

– DB2 APARs (Authorized Program Analysis Reports)

– http://www-306.ibm.com/software/data/db2/udb/support/apars.html

Summary

Understanding requirements for PD and monitoring

Different monitoring scenarios

Tools available in v8.2 and v9.1

– db2pd - Monitor and troubleshoot DB2 Universal Database Command

– Options either added or extended for each release/fixpack

– “Proactive” FFDC: catching errors

Questions

Appendix

Miscellaneous monitoring tools

Lock Timeout Report Tool

db2mc: The DB2 Monitoring Console - Open Source Project

– Light-weight, web-based console for DB2 for Linux, UNIX and Windows

– http://sourceforge.net/projects/db2mc

– Help: http://sourceforge.net/docman/?group_id=211760

db2top: Single System View Monitor for DB2

– Specifically designed for DPF environments

– The user interface is character-based and built using the curses library

– Unix only

– http://dl.alphaworks.ibm.com/technologies/db2top/db2top.pdf

Troubleshooting lock timeouts

Lock Timeout Report Tool

– Introduced in v9.5. Will be backported to v9.1 FP4 and v8.2 FP16

– Lock timeouts and deadlocks happen very frequently in customer situations

– The end result is that some application (unit of work) fails and is rolled back, forcing the user to resubmit the work.

– It is very desirable to avoid lock timeouts and deadlocks in a production environment.

– The Lock Timeout Report Tool makes it possible to debug these and to make the application changes needed to avoid them.

Enabled by setting registry variable DB2_CAPTURE_LOCKTIMEOUT

– To enable/disable lock timeout reporting

db2set DB2_CAPTURE_LOCKTIMEOUT=ON

db2set DB2_CAPTURE_LOCKTIMEOUT=

Troubleshooting lock timeouts (cont)

Lock timeout report

– Lock in contention

• Lock name and type

• Lock specifics, including row ID, table space ID, and table ID. Use this information to query the SYSCAT.TABLES system catalog view to identify the name of the table.

– Lock Requestor

– Lock owner or representative

– There can be more than one lock owner: the first lock owner is the representative for other lock owners

Lock timeout report files

– Report file is generated by the agent receiving the lock timeout error

– When the lock timeout reporting function is active and a lock timeout occurs

– The report is stored in a file using the following name format: db2locktimeout.par.AGENTID.yyyy-mm-dd-hh-mm-ss, where

– par is the database partition number. In non-partitioned database environments, par is set to 0.

– AGENTID is the agent ID.

– yyyy-mm-dd-hh-mm-ss is the time stamp, consisting of the year, month, day, hour, minute, and second.

– An example of a lock timeout report file name is /home/juntang/sqllib/db2dump/db2locktimeout.000.4944050.2006-08-11-11-09-43.
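
When many report files accumulate in the dump directory, the name format described above can be taken apart in a script. This is a minimal Python sketch under the assumption that file names follow exactly the db2locktimeout.par.AGENTID.yyyy-mm-dd-hh-mm-ss pattern; the function name parse_locktimeout_name is hypothetical.

```python
from datetime import datetime

def parse_locktimeout_name(path):
    """Split a lock timeout report file name into
    (partition number, agent ID, timestamp)."""
    # Keep only the file name; the slide's example uses '/' separators.
    name = path.rsplit("/", 1)[-1]
    # Raises ValueError if the name has fewer than four dot-separated parts.
    prefix, par, agent_id, stamp = name.split(".", 3)
    if prefix != "db2locktimeout":
        raise ValueError("not a lock timeout report file: " + name)
    when = datetime.strptime(stamp, "%Y-%m-%d-%H-%M-%S")
    return int(par), int(agent_id), when

example = ("/home/juntang/sqllib/db2dump/"
           "db2locktimeout.000.4944050.2006-08-11-11-09-43")
print(parse_locktimeout_name(example))
```

Sorting the parsed tuples by timestamp gives a quick timeline of which agents hit lock timeouts and when.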

Monitoring STMM

Tune a system from an out-of-the-box configuration to near-optimal memory usage in an hour or less

Retrieve the current size of a buffer pool set to AUTOMATIC

– db2pd -database MYDB1 -bufferpools

– See CurrentSz column

Monitor STMM bufferpool changes

db2diag -g "message:=Altering bufferpool" db2diag.log

Monitor DB configuration changes

– db2diag -node 1 -g "changeevent:=STMM CFG DB" db2diag.log

Tool to parse the STMM log files

– parseStmmLogFile.pl <log file> <database name> <options>

– http://www.ibm.com/developerworks/db2/library/techarticle/dm-0708naqvi/

Monitoring memory usage: db2mtrk

Provides a complete report of memory status for instances, databases, and agents

Outputs the following memory pool allocation information:

– Current size

– Maximum size (hard limit)

– Largest size (high water mark)

– Type (identifier indicating function for which memory will be used)

– Agent who allocated pool (only if the pool is private)

The "Other Memory" reported is the memory associated with the overhead of operating the database management system

db2top tool

db2top: monitoring tool

Single System View Monitor for DB2

– Specifically designed for DPF environments

– The user interface is character-based and built using the curses library

– Unix only

– Uses the DB2 snapshot monitoring APIs to retrieve data

– Uses both global as well as partition-specific monitoring information to provide

• aggregation

• quick drill-down capabilities

db2top info

– db2top -h

– http://dl.alphaworks.ibm.com/technologies/db2top/db2top.pdf

db2top configuration file: .db2toprc

– Used to setup parameters at initialization time

– Location is determined by $DB2TOPRC

– Default is current directory (and home directory if not found)

– Type w in db2top to generate resource configuration file

db2top Command Options

-n specifies the node to attach to.

-d specifies the database to monitor.

-u specifies the DB2 username used to access the database.

-p specifies the DB2 password.

-V specifies the default schema used in explains.

-i specifies the delay between screen updates.

-k specifies whether to display actual or delta values.

-R Reset snapshot at startup.

-P <number>. Snapshot issued on current or partition number.

-x specifies whether to display additional counters on session and appl screens (might run slower on session).

-b tells db2top to run in background mode.

-a specifies only active queries will be displayed.

-C Runs db2top in snapshot collector mode, raw snapshot data is saved in <db2snap-<Machine>.bin>.

-f <file> </pattern> <+offset>. Run db2top in replay mode when snapshot data has been previously collected in <file>. offset will jump to a certain point in time in the file. It can be expressed in seconds (+10s), minutes (+10m) or hours (+10h). /pattern will analyse the file and display the offsets at which matches appear. pattern can be specified as a regular expression.

-m Will limit duration of db2top in minutes for -b and -C.

-o <outfile>. Outfile for -b option.

-h short help.

db2top Interactive Commands

d Goto database screen

l Goto sessions screen

a Goto application details for agent

G Toggle between all partitions and current partitions.

P Select db partition on which to issue snapshot.

t Goto tablespaces screen.

T Goto tables screen.

b Goto bufferpools screen.

D Goto the dynamic SQL screen.

m Display memory pools.

s Goto the statements screen.

U Goto the locks screen.

p Goto the partitions screen.

H Goto the history screen (experimental).

f Freeze screen.

W Watch mode for agent_id, os_user, db_user, application or netname.

/ Enter expression to filter data.

<|> Move to left or right of screen.

z|Z Sort on ascending or descending order.

c Change the order of the columns displayed on the screen.

S Run native DB2 snapshot.

L Display the complete query text from the SQL screen.

R Reset snapshot data.

i Toggle idle sessions on/off.

k Toggle actual vs delta values.

g Toggle graph on/off.

X Toggle extended mode on/off.

C Toggle snapshot data collector on/off.

V Set default explains schema.

O Display session setup.

w Write session settings to .db2toprc.

q Quit db2top.

db2top example

Running db2top monitoring utility in interactive mode in a DPF environment

– db2top -d TEST -n mynode -u user -p passwd -V skm4 -B -i 1

– Command parameters are as follows:

– -d TEST --> database name

– -n mynode --> node name

– -u user --> user id

– -p passwd --> password

– -V skm4 --> Schema name

– -B --> Bold enabled

– -i 1 --> Screen update interval: 1 second

Bufferpool snapshot

– b command

+--------------+------------+------------+------------+-----------+
|              |         25%|         50%|         75%|       100%|
|Hit Ratio%    |--------------------------------------------------|
+--------------+--------------------------------------------------+

Bufferpool      Delta        Delta        Hit     Async   Delta
Name            l_reads/s    p_reads/s    Ratio%  Reads%  Writes/s
--------------- ------------ ------------ ------- ------- ------------
IBMDEFAULTBP    23015        241          98.95%  100.00% 212
IBMSYSTEMBP16K  0            0            0.00%   0.00%   0
IBMSYSTEMBP32K  0            0            0.00%   0.00%   0
IBMSYSTEMBP4K   0            0            0.00%   0.00%   0
IBMSYSTEMBP8K   0            0            0.00%   0.00%   0
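
The Hit Ratio% column is consistent with the conventional buffer pool hit ratio, (1 - physical reads / logical reads) x 100. The slides do not state the formula db2top uses, so the Python sketch below is an assumption checked only against the IBMDEFAULTBP row above; the function name is hypothetical.

```python
def hit_ratio_percent(logical_reads, physical_reads):
    """Conventional buffer pool hit ratio in percent.

    Returns 0.0 when there were no logical reads, matching the 0.00%
    shown for the idle system buffer pools in the sample output.
    """
    if logical_reads == 0:
        return 0.0
    return (1 - physical_reads / logical_reads) * 100

# l_reads/s and p_reads/s for IBMDEFAULTBP from the sample output.
print(f"{hit_ratio_percent(23015, 241):.2f}%")  # prints 98.95%
```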

Bufferpool performance

Determine performance of buffer pools

– In real time and over time interval

Use performance analysis option of db2top (-A) for bufferpools

– db2top -d sample -A -b b

--
-- Top twenty performance report for 'Bufferpools' between 13:19:52 and 13:20:18
-- Sort criteria 'Pages_VctIOs/s'
--

Rank  Bufferpool_Name                Percentage  fromTime toTime    sum(Pages_VctIOs/s)
----- ------------------------------ ----------- -------- --------- ------------------------------
1     IBMDEFAULTBP                   100.0000%   13:19:52 13:20:18  18
2     IBMSYSTEMBP8K                  0.0000%     13:19:52 13:20:18  0
3     IBMSYSTEMBP4K                  0.0000%     13:19:52 13:20:18  0
4     IBMSYSTEMBP16K                 0.0000%     13:19:52 13:20:18  0
5     IBMSYSTEMBP32K                 0.0000%     13:19:52 13:20:18  0

--
-- Performance report, breakdown by 300 seconds
--

fromTime sum(Pages_VctIOs/s)            Percentage Top Five in 300 seconds interval
-------- ------------------------------ ---------- +----------------------------------------------+
         18                             100.0000%  |Rank|Percentage|Bufferpool_Name               |
-        -                                         |   1| 100.0000%|IBMDEFAULTBP                  |
-        -                                         |   2|   0.0000%|IBMSYSTEMBP8K                 |
-        -                                         |   3|   0.0000%|IBMSYSTEMBP4K                 |
-        -                                         |   4|   0.0000%|IBMSYSTEMBP16K                |
-        -                                         |   5|   0.0000%|IBMSYSTEMBP32K                |
                                                   +----------------------------------------------+

--
-- Performance report, breakdown by 0.5 hour
--

fromTime sum(Pages_VctIOs/s)            Percentage Top Five in 0.5 hour interval
-------- ------------------------------ ---------- +----------------------------------------------+
         18                             100.0000%  |Rank|Percentage|Bufferpool_Name               |
-        -                                         |   1| 100.0000%|IBMDEFAULTBP                  |
-        -                                         |   2|   0.0000%|IBMSYSTEMBP8K                 |
-        -                                         |   3|   0.0000%|IBMSYSTEMBP4K                 |

Bufferpool performance (cont)

db2top will report in real time

– data hit ratio for each bufferpool

– hit ratio for all tablespaces

– logical/physical reads/writes per second per bufferpool

– efficiency of the prefetch (% of unread prefetch pages)

– whether block based bufferpools are used

– overall temp/data/index hit ratio

db2top collector mode

– Gather a lot of data at once and replay it back later

– db2top -C -d sample

– Start collection

– Answer N to 'create named pipe' question

– db2top -f db2snap-sample-AIX64.bin -d sample

– Replay collection in another window

– db2top does not need to attach to the DB2 instance in replay mode

– Convenient for remote monitoring

– Limit the content and size of the stream file

– specify any of the suboptions available to the -C switch

– db2top -C b -d sample

db2top: Bottleneck analysis

Type 'B' when in interactive mode

+--------------+------------+------------+------------+-----------+
|              |         25%|         50%|         75%|       100%|
|wait lock ms  |                                                  |
|sort ms       |                                                  |
|bp r/w ms     | ------------------------------------------------ |
|async r/w ms  | -----------------------                          |
|pref wait ms  | ----------                                       |
|dir r/w ms    |                                                  |
+--------------+--------------------------------------------------+

Server Top Resource Resource Application

Resource Agent Usage Value Name

-- ------------ -------- -------- ---------------- --------------------

=> Cpu N/A 0% 0 N/A

=> SessionCpu 1223 99.87% 4.844 db2bp

=> IO r/w 1223 100.00% 4894 db2bp

=> Memory 1223 50.00% 384.0K db2bp

=> Locks N/A 0% 0 N/A

=> Sorts N/A 0% 0 N/A

=> Sort Time N/A 0% 0 N/A

=> Log Used 1223 100.00% 2.1M db2bp

=> Overflows N/A 0% 0 N/A

=> RowsRead 1223 100.00% 8960 db2bp

=> RowsWritten 1223 100.00% 8960 db2bp

=> TQ r/w N/A 0% 0 N/A

=> MaxQueryCost N/A 0% 0 N/A

Overview of new db2pd features in v9.1

db2pdcfg tool

db2pdcfg tool – new in v9.1

Configure DB2 database for problem determination behavior command

Sets flags in the DB2 database memory sets to influence the database system behavior for problem determination purposes

db2pdcfg -help

Clear separation between db2pd and db2pdcfg tools

– db2pd is the “read-only” tool

– It will never write to DB2 shared memory or change anything

– db2pdcfg is the “read-write” tool

– Sets and displays parameters used

db2pdcfg will be substantially extended in post-v9.1 releases

db2cos tool in v9.1

DB2 call-out script improvements in v9.1

In v9.1, importance of db2cos for PD is significantly increased

New cases of automatic invocation

– Panic, trap, segmentation violation or exception

– Configurable by db2pdcfg

– ON by default

– Diagnostic dumps

– Configurable by db2pdcfg

– OFF by default

Addition of Windows OS support: db2cos.bat

Standardized place for db2cos script

– No need to copy file from sqllib/cfg

– $HOME/sqllib/bin on Unix

– %DB2PATH%\bin on Windows

Standardized place for db2cos output files

– directory specified by the DIAGPATH database manager configuration parameter

DB2 call-out script improvements in v9.1

Standardized header for db2cos report files

2006-09-07-01.32.32.481578

PID : 7057616 TID : 1 PROC : db2cos

INSTANCE: dabrashk NODE : 0 DB : LDSORTDB

APPHDL : APPID: *LOCAL.dabrashk.060907053224

FUNCTION: oper system services, sqloEDUCodeTrapHandler, probe:999

EVENT : Invoking /home/dabrashk/sqllib/bin/db2cos from oper system services sqloEDUCodeTrapHandler

Trap Caught

Instance dabrashk uses 64 bits and DB2 code release SQL09010

<detailed db2pd output follows>

Speed and scalability improvement for large customer systems

– db2cos output files are named db2cosXXXYYY.ZZZ, where XXX is the process ID (PID), YYY is the thread ID (TID) and ZZZ is the database partition number (or 000 for single partition databases)

– Necessitated by db2pdcfg -catch feature (especially for catching errors)

Diagnosing traps

Call-out script (db2cos) trap output (v9.1)

Increased control over the set of diagnostic information produced when the database manager encounters a panic, trap, exception, or segmentation violation

– In such situations, the db2cos script is now automatically run

DB2 call-out script is executed on traps

– search for "TRAP" in db2cos script

To disable generation of db2cos report in trap scenarios

– db2pdcfg -cos off

Output goes to db2cos<pid><tid>.<node> file

– db2cos48621401.0

To print the status

– db2pdcfg -cos status

Contains output of db2pd command (all options)

– db2pd -inst OR db2pd -db $database -inst

– You can edit the db2cos script to collect more or less information

Call-out script (db2cos) trap output (v9.1)

START and STOP are logged into db2diag.log

– db2diag -g "funcname=pdInvokeCalloutScript"

– START : Invoking /home/dabrashk/sqllib/bin/db2cos from oper system services sqloEDUCodeTrapHandler

To specify the number of times to execute db2cos during a trap

– db2pdcfg -cos count=<count>

– default is 255

Specify how long to sleep between checks of the size of the output file generated by db2cos

– db2pdcfg -cos sleep=<numsec>

– default is 3 seconds

Specify the timeout to use when checking whether the output file generated by db2cos is still growing in size

– db2pdcfg -cos timeout=<numsec>

– default is 30 seconds

Instruct the database manager to execute db2cos when receiving SQLO_SIG_DUMP signal (db2pd -dump; kill -36 <agentPID> on AIX)

– db2pdcfg -cos SQLO_SIG_DUMP

db2pd additions in v9.1

Overview of new db2pd features for DB2 v9

PD Infrastructure enhancements in v9.1

– db2pdcfg tool

– db2cos tool improvements

Improvements for diagnosing traps

v9.1 db2pd additions

Troubleshooting and monitoring using db2pd

db2pd additions in v9.1

New commands

– -latches

– -fmp

– -pages

– -memblocks

– -dump

New options

– -bufferpools command: Bufferpool ID that contains the page

– -fcm command: high water mark option (hwm)

db2pd usability improvements

db2pd usability improvements

Agents

– v8.2: -agents [db=<database>] [ [agent=<agentid>] | [application=<appid>] ]

– v9.1: -agents [db=<database>] [[<AgentId> | [app=<AppHandl>]]

Applications

– v8.2: -applications [ [application=<appid>] | [agent=<agentid>] ]

– v9.1: -applications [[<AppHandl> | [agent=<AgentId>]]

Transactions

– v8.2: -transactions [tran=<tranhdl>] [app=<apphdl>]

– v9.1: -transactions [<TranHdl> | [app=<AppHandl>]]

Locks

– v8.2: -locks [tran=<tranhdl>] [showlocks] [wait]

– v9.1: -locks [<TranHdl>] [showlocks] [wait]

Table Control Block Stats

– v8.2: -tcbstats [all|index] [tbspaceid=<tbspaceid> [tableid=<tableid>]]

– v9.1: -tcbstats [all|index] [<TbspaceID> [<TableID>]]

Tablespaces/Containers

– v8.2: -tablespaces [group] [tablespace=<tablespace id>]

– v9.1: -tablespaces [<Tablespace ID>] [group]

Backup slides

db2pd -transactions states

-transactions "State" field

FREE - Free state

READ - Read, no log record written

WRITE - Log record written

COMMIT - In committing state

ABORT - In aborting state

ABORTDL - In aborting state, needs to be aborted on datalink file servers

SAVEPNT - In rollback to savepoint

PREP - Prepared transaction

HCOMT - Heuristically committed

HABRT - Heuristically rolled back

HAING - Heuristically rolling back, used by recovery if further rollback processing is required

FRG - The transaction is forgotten

REPAIR - Transaction needs repair due to I/O errors on tablespaces it uses or PIT tablespace rollforward

REUSE - Federated transaction has reached beginning of federated two-phase-commit processing. If no other log record follows, then this signals prepare is not fully finished and transaction needs to be rolled back at resync time.

Mapping Application ID to IP address and port

A TCP/IP-generated application ID is composed of three sections (v8.2)

– IP address represented as a 32-bit number (8 hexadecimal chars)

– port number (4 hexadecimal characters)

– unique identifier for the instance of this application

When the IP address or port number begins with a digit 0-9, that first character is changed to the corresponding letter G-P. For example, "0" is mapped to "G", "1" is mapped to "H", and so on.

The application ID AC10150C.NA04.006D07064947 is interpreted as follows:

– The IP address remains AC10150C, which translates to 172.16.21.12.

– The port number is NA04. The first character is "N", which maps to "7". Therefore, port number is 7A04, which translates to 31236 in decimal form.

The IP address and port can be used with lsof to find out which remote application is using the application ID

AC10150C.NA04.006D07064947

IP Address Port Generated ID
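
The decoding steps above can be sketched in Python. This is an illustration of the v8.2 rules as described on this slide, not a DB2 utility; the names decode_appid and LETTER_TO_DIGIT are assumptions.

```python
# Leading digits 0-9 are stored as letters G-P; build the reverse mapping.
LETTER_TO_DIGIT = {chr(ord("G") + i): str(i) for i in range(10)}

def _undo_leading_letter(field):
    """Map a leading G-P back to 0-9; leave hex digits A-F untouched."""
    return LETTER_TO_DIGIT.get(field[0], field[0]) + field[1:]

def decode_appid(appid):
    """Return (dotted IP address, decimal port, unique identifier)."""
    ip_hex, port_hex, unique = appid.split(".", 2)
    ip_hex = _undo_leading_letter(ip_hex)
    port_hex = _undo_leading_letter(port_hex)
    # Eight hex characters are four octets of two characters each.
    octets = [int(ip_hex[i:i + 2], 16) for i in range(0, 8, 2)]
    return ".".join(str(o) for o in octets), int(port_hex, 16), unique

print(decode_appid("AC10150C.NA04.006D07064947"))
# prints ('172.16.21.12', 31236, '006D07064947')
```

The resulting address and port are the values to search for in lsof output on the server.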