1

How Healthy is Your Progress System?

(Progress DB Best Practices)

Dan Foreman

BravePoint, Inc.

[email protected]@prodb.com

2

Introduction - Dan Foreman

Progress User since 1984

Guest Speaker at USA & European Progress Users Conferences since 1988

3

Introduction - Dan Foreman

Author of:

Progress Performance Tuning Guide

Progress Database Admin Guide

Progress System Tables Guide

V10 Database Admin Jumpstart

Online Access (free with paper book)

ProMonitor - Performance Monitoring Tool

Pro D&L - Dump/Load with very short downtime regardless of DB size

Balanced Benchmark – Load Testing Tool

4

Introduction - BravePoint

The Largest(?) Progress consulting group in the world (managing one of the world’s largest databases)

Three have used Progress since 1984

Database Group:

Managed DBA Services

Performance Tuning

Database Repair (and proactive protection)

Load Testing

Much more

5

Introduction – Who Are You?

Largest Single DB?

Largest Concurrent DB Connections?

Progress Version?

Database Operating System?

6

Best Practices - Recovery

Test your Entire Recovery Plan at least once a year

Verify Progress backups with prorest and -vp or -vf

Log all activities related to backups, AI maintenance, and other automated activities

Generate an Alert (e.g. email or SMS) if any activity related to backup/AI fails
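The "log everything, alert on failure" rule can be sketched as a small wrapper around each scheduled step. This is only an illustration: `run_logged`, the logger name `db_maintenance`, and the `alert` callback are hypothetical names, not part of any Progress tool, and the command to run and the alert transport (email, SMS) are left to the caller.

```python
import logging
import subprocess
import sys
from typing import Callable, Sequence

# Hypothetical logger name; point it at whatever log destination you already use.
logger = logging.getLogger("db_maintenance")


def run_logged(name: str, cmd: Sequence[str], alert: Callable[[str], None]) -> bool:
    """Run one maintenance command, log the outcome, and alert on failure."""
    logger.info("starting %s: %s", name, " ".join(cmd))
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # The command could not even be started (missing executable, permissions).
        message = f"{name} could not be started: {exc}"
        logger.error(message)
        alert(message)
        return False
    if result.returncode != 0:
        message = (
            f"{name} failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
        logger.error(message)
        alert(message)
        return False
    logger.info("%s finished successfully", name)
    return True


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Demo only: a child Python process that exits with code 3 stands in for a
    # failing backup or AI step; print() stands in for an email or SMS sender.
    run_logged("demo-failure", [sys.executable, "-c", "import sys; sys.exit(3)"], print)
```

Passing the alert function in as a parameter keeps the wrapper independent of how alerts are delivered, so the same code can be reused for backup, AI archiving, and other automated jobs.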

7

Best Practices – After Imaging

Enable After Imaging (AI)

Verify After Image Logs with rfutil aiverify

Minimum: Copy AI Logs to a non-production server frequently (every few minutes)

Best: Use the AI logs to replicate the production DB on another server as a Hot or Warm Standby

8

Best Practices – After Imaging

Keep archived AI logs in a separate location from the backups

Keep archived AI logs as long as you keep the backups

Keep the live AI extents as far “away” from the DB/BI files as possible

Separate physical disk

Separate LUN (SAN)

Separate Volume Group

Separate Logical Volume/File System

9

Best Practices – Unix/Linux

Unix/Linux: DO NOT log on as root unless you really need to

Use sudo

Use a root equivalent account

Use O/S security to protect the DB, BI, and AI files from accidental/casual/intentional deletion

Run proutil EnableLargeFiles on each database and make sure all file systems support large files

10

Best Practices – Unix/Linux

Don’t use kill -9 to terminate a Self Service Progress session; you might bring the database DOWN if you happen to kill a session that is holding a Latch

11

Best Practices – DB Maint

Always have an up-to-date Structure (.st) file available

Run proutil dbanalys periodically

Can find certain errors such as #1124

Scatter and Fragmentation Information indicates if a Dump&Load is needed

Monitor Table growth rates

Elapsed time to run the utility is a performance indicator

12

Best Practices – DB Monitoring

Check the Database log (.lg) file for errors DAILY. Look for words such as:

kill* drastic warn* error system dead fatal abnormal exceed* fail* wrong unexpected* invalid died damage* overflow* violation insufficient missing disappear* corrupt* allow* attempt* cannot enough illegal beyond impossible increase unknown unable stop* (and many more)

Use OpenEdge Management or ProMonitor to assist with log file monitoring or write your own (not so easy)
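For the "write your own" route, the keyword check itself might be sketched as below. This is a minimal illustration, not a replacement for the tools named above: it assumes a trailing `*` means "any word ending", matches whole words case-insensitively, and the names `KEYWORDS`, `build_pattern`, and `suspicious_lines` are made up for this example. A real monitor would also need to remember where the last scan ended, suppress known-harmless messages, and send alerts.

```python
import re
from typing import Iterable, Iterator, Tuple

# Keywords taken from the slide; "(and many more)" is left for you to extend.
KEYWORDS = (
    "kill* drastic warn* error system dead fatal abnormal exceed* fail* wrong "
    "unexpected* invalid died damage* overflow* violation insufficient missing "
    "disappear* corrupt* allow* attempt* cannot enough illegal beyond impossible "
    "increase unknown unable stop*"
).split()


def build_pattern(keywords: Iterable[str]) -> re.Pattern:
    """Compile one case-insensitive, whole-word regex for all keywords."""
    parts = []
    for word in keywords:
        if word.endswith("*"):
            # Assumption: a trailing * stands for any run of word characters.
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            parts.append(re.escape(word))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


PATTERN = build_pattern(KEYWORDS)


def suspicious_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str, str]]:
    """Yield (line number, first matched word, line) for lines with a keyword."""
    for number, line in enumerate(lines, start=1):
        match = PATTERN.search(line)
        if match:
            yield number, match.group(0).lower(), line.rstrip("\n")
```

To scan a file, pass an open file object, for example `suspicious_lines(open(path, errors="replace"))`, where `path` is the location of your own .lg file.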

13

Best Practices – DB Monitoring

Important because promon & Virtual System Tables don’t show history & trends

ProMonitor

ProTop

OpenEdge Management

Build your own

14

Best Practices – DB Safety

Use the -bithold parameter as an extra safeguard; set to 50% of available BI Disk Space

Crash recovery causes the BI file to grow

Crash recovery causes the AI files to grow

AI extents cannot be emptied during crash recovery

bigrow size < BI Size Alert Threshold < (-bithold value = (available BI disk space / 2))
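The sizing rule above is simple arithmetic, sketched here so it can be dropped into a health-check script. The function names are hypothetical, and the sketch assumes all three quantities are expressed in the same unit; check the unit your Progress version expects for -bithold before using the result.

```python
def bithold_value(available_bi_disk_space: float) -> float:
    """Slide rule: -bithold is half of the available BI disk space."""
    return available_bi_disk_space / 2


def thresholds_in_order(
    bigrow_size: float, alert_threshold: float, bithold: float
) -> bool:
    """True when bigrow size < BI size alert threshold < -bithold value."""
    return bigrow_size < alert_threshold < bithold
```

For example, with 40000 units of BI disk space available the rule gives a -bithold of 20000, so a monitoring alert threshold of 10000 and a pre-formatted (bigrow) size of 512 satisfy the ordering. These numbers are illustrative only.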

15

Best Practices – DB Tuning

Spin Locks (-spin) between 1000 and 100000

Why such a wide range?

BI Buffers (-bibufs) 32-64

AI Buffers (-aibufs) exactly equal to BI Buffers

BI Block Size (-biblocksize) 16

AI Block Size (-aiblocksize) exactly equal to BI Block Size
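One way to keep these starting points from drifting is to check recorded settings against them. The sketch below assumes you have already collected the current values into a dictionary by some means of your own. The dictionary keys simply mirror the parameter names on the slide, and `check_tuning` is a hypothetical helper, not a Progress utility.

```python
from typing import Dict, List


def check_tuning(params: Dict[str, int]) -> List[str]:
    """Return deviations from the slide's suggested starting values."""
    findings = []
    if not 1000 <= params["spin"] <= 100000:
        findings.append("-spin is outside 1000-100000")
    if not 32 <= params["bibufs"] <= 64:
        findings.append("-bibufs is outside 32-64")
    if params["aibufs"] != params["bibufs"]:
        findings.append("-aibufs differs from -bibufs")
    if params["biblocksize"] != 16:
        findings.append("BI block size is not 16")
    if params["aiblocksize"] != params["biblocksize"]:
        findings.append("AI block size differs from BI block size")
    return findings
```

An empty list means the settings match the slide. A non-empty list is a prompt to review, not proof of a problem, since the right -spin value in particular varies widely between systems.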

16


Page Writers

DB Writers (APWs): 2-4

BI Writer (BIW): 1

AI Writer (AIW): 1

Before Image Cluster Size: 16-32 MB

Pre-Formatting BI Clusters if BI truncated

proutil bigrow

17


Database Buffers (-B) - lots

Don’t use the promon ‘Buffer Hits %’ to monitor; prior to V10.1B it is buggy and frequently wrong

V10.2B SP04 Alternate Buffer Cache -B2

For heavy read-mostly tables (and associated indexes) that fit completely in the memory allocated

18


Use Buffer Hit Ratio

Ratio of: DB Requests / DB Reads

3 digits:1 is usually excellent

Higher than that usually indicates bad code

Lower than 20:1 is usually poor performance

A Ratio is a better indicator especially if the percentage is approaching 100%
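A short calculation shows why the ratio separates cases that the percentage squeezes together. In this sketch the percentage is assumed to be the share of requests satisfied without a disk read, and the function names and rating labels are illustrative. The slide does not rate ratios between 20:1 and 99:1, so that band is reported as unrated.

```python
def hit_ratio(db_requests: int, db_reads: int) -> float:
    """Buffer hit ratio as N in 'N:1' (DB Requests / DB Reads)."""
    if db_reads == 0:
        return float("inf")
    return db_requests / db_reads


def hit_percentage(db_requests: int, db_reads: int) -> float:
    """Assumed definition: percent of requests that needed no disk read."""
    if db_requests == 0:
        return 0.0
    return 100.0 * (db_requests - db_reads) / db_requests


def rate(ratio: float) -> str:
    """Apply the slide's rule of thumb to a hit ratio."""
    if ratio < 20:
        return "poor"
    if ratio < 100:
        return "unrated on the slide"
    if ratio < 1000:
        return "excellent"
    return "suspiciously high (check the code)"
```

With 1,000,000 requests, 10,000 reads gives 100:1 (99%) while 1,000 reads gives 1000:1 (99.9%). The percentages differ by less than one point, but the second system is doing one tenth of the disk reads.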

19


Possible Reasons for a Poor Hit Ratio

One report looking at ‘old’ data (e.g. YTD data) can kill a good Hit Ratio although the “dip” is usually temporary

Database needs a dump/load (poor Scatter and/or Fragmentation Factors)

See more on the next slide

20


Possible Reasons for a Poor Hit Ratio

-B is too small

Online utilities (dbanalys, probkup, etc.); use -Bp to reduce the impact

Reports with indexing problems

Reports run “wide open”

The Hit Ratio was checked soon after the DB Broker started

21


Lots of misinformation & opinions about Direct I/O (-directio)

Added in V6 but only applied to Data General and Sequent Platforms

Starting in V8 applies to all platforms but the Progress Documentation wasn’t updated right away

Database Startup Option

22


But -directio isn’t a good idea for all platforms

Don’t use on:

Windows

Linux

23

Best Practices – DB Structure

Database Block Size 4k-8k

General: Match DB Block Size to File System Block Size

Set the File System Block Size as large as possible

Increase in DB Block Size may mean a reduction in -B

Dump/Load is required to change the Block Size
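The link between block size and -B is arithmetic: -B counts buffers, and each buffer holds one database block, so the same -B costs twice the memory after a move from 4k to 8k blocks. The sketch below ignores per-buffer overhead, and its function names are illustrative.

```python
def buffer_pool_bytes(b_buffers: int, db_block_size: int) -> int:
    """Approximate buffer pool memory: number of buffers times block size."""
    return b_buffers * db_block_size


def rescale_b(b_buffers: int, old_block_size: int, new_block_size: int) -> int:
    """-B value that keeps roughly the same memory after a block size change."""
    return (b_buffers * old_block_size) // new_block_size
```

For instance, -B 100000 with 4096-byte blocks is about 410 MB of buffers; keeping that footprint with 8192-byte blocks means a -B of 50000.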

24


Fixed Size Data Extents

Don’t grow into the Variable Extent

Not as crucial as it was in the 90’s when Storage was slower

Large Extents (as large as triple digit gigabytes) are not a performance problem if you are using a “modern” OS and Storage Device

25


V10

Type 2 (AKA T2) Storage Areas

General recommendations:

T2 for ALL Areas

Large Cluster Size (512) for Tables with a large number of records

Smallest Cluster Size (8) for Tables with few records

26

Best Practices – Dump & Load

To Fix Scatter Factor

Not as big an issue with T2 Areas

To Fix Fragmentation

To Change T2 DCS, RPB, DB Blk Size

To verify no DB Corruption Exists

So that if you need to do one in an Emergency, it won’t be your first time

Usually much more effective than idxbuild or idxcompact

27

Best Practices - Disk

Disks are the Slowest Server Component

We recommend Lots of Striped Database Disks

1999: 9 GB & 9-14 ms Average Access

2009: 144 GB & 6-9 ms Average Access

2012: SSDs are < 0.2 ms Average Access

28


Separation of After Image, Before Image, and Database Disks

Mainly for Integrity (especially AI)

Secondarily for Performance (maybe)

Try to not Stripe DB/BI on the same Volume

29

Worst Practice – RAID 5 (and Variants)

RAID 5 is (almost) always EVIL!

RAID Levels are not precisely crafted standards (like USB 3.0, etc.)

SANs are very complex devices

RAID 10 requires more disk space than RAID 5

Hybrids may be acceptable (RAID 5 for DB, RAID 10 for AI/BI)

YMMV

30


Stripe Size for RAID 0, 5, 6, or 10

The Largest Stripe Size usually produces the best Performance

YMMV (or YKMV for some of our international audience)

31

Best Practices – What you don’t know can hurt you

Have a third party look at your system once a year

Doesn’t need to be me - competition is good

You may be surprised at what you’ve missed or what has slipped through the cracks

It’s like car insurance…

32

Conclusion

If you need further assistance:

Progress Performance Tuning Guide

Progress Database Administration Guide

Progress System Tables

V10 Database Administration Jumpstart

ProMonitor - performance monitoring tool

Pro Dump/Load

Balanced Benchmark

[email protected] or [email protected]

Thank You for Coming!
