
  • IBM Information Management

    IBM Informix on POWER7 Best Practices

    A Technical White Paper


    Contents

    Introduction
    What is an LPAR
    PowerVM
    Database workload
        Mission critical
        High throughput
        Low throughput
    Simultaneous multithreading (SMT)
        Current SMT mode
        Enabling SMT
    SMT1 vs SMT2 vs SMT4
        One core SMT testing
        Multiple core SMT testing
        Recommendation
    Dedicated LPAR vs shared LPAR
        Dedicated LPAR
        Shared LPAR
            Capped or uncapped shared LPAR
            Virtual processors
            Processor folding
        Recommendation
    Virtual I/O Server (VIOS) LPAR
        Recommendation
    Additional LPAR recommendations
        Recommendation
    Memory considerations
        RESIDENT parameter
        4 KB memory page size
        16 MB memory page size
        64 KB memory page size
        Recommendation
    Feedback Directed Program Restructuring (FDPR) Tool
        Recommendation
    I/O subsystem
        Read/Write access times
        KAIO/DIRECT_IO
        Queue depth
        AIO servers
        Recommendation
    Network subsystem
        TCP traffic
        Local loopback
        Recommendation
    Number of CPU virtual processors
        CPU-intensive workload
        I/O-intensive workload
        Recommendation
    Affinity
        Recommendation
    Understanding onstat -g glo
        Recommendation
    Starting LPARs
        Recommendation
    Appendix A: Recommendations summary
    Appendix B: Useful commands
        amepat
        bosboot
        chdev
        ifconfig
        ioo
        iostat
        lparstat
        lsattr
        schedo
        smtctl
        vmo
        vmstat
    Appendix C: Additional reading
        DeveloperWorks: AIX Virtual Processor Folding is misunderstood
        IBM Systems: Understanding Micro-Partitioning
        IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
        YouTube: Power7 Performance Entitlement, VPs, Affinity, Memory
        Feedback Directed Program Restructuring (FDPR)
        Developer Works: VIOS Advisor
        IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
        IBM Redbooks Publication: AIX 5L Performance Tools Handbook
    Appendix D: References


    Introduction

    This document describes best practices for using IBM Informix on AIX POWER7 processor-based servers. Topics of discussion include logical partitions (LPARs), dedicated and shared resources, capped LPARs compared with uncapped configurations, and I/O configuration.

    It is assumed that you have a working knowledge of Informix and are familiar with physical and logical database design for Informix. You also need to have skills in Informix server administration, and be familiar with configuration and tuning of the Informix server.

    You should have basic skills in working with LPARs and system administration on AIX POWER7 systems.

    What is an LPAR

    An LPAR, short for logical partition, is the division of a computer's processors, memory, and storage into smaller units. Each unit can run its own instance of the operating system and applications. This concept was introduced with the POWER5 processor.

    PowerVM

    IBM PowerVM provides a secure, stable, and sophisticated virtualization environment for IBM Power Systems. A single physical server can be divided into multiple virtual servers, each using anything from a fraction of a processor to all the processors on the physical machine. POWER7 systems support up to 1000 LPARs on a single server.

    Businesses can deploy an appropriate mix of LPARs to meet their needs, sharing resources where applicable, or using fully dedicated resources as needed. With PowerVM, you have the flexibility of a heterogeneous environment with the LPARs running a combination of AIX and Linux operating systems.

    Through the use of virtualization, PowerVM can respond to business needs faster through dynamic resource allocation. The Power architecture also provides simultaneous multithreading (SMT), which allows increased throughput on your Power system.


    Database workload

    Before we address best practices for Informix on the POWER7 architecture, you must understand the type of workload that you expect to have. A mission critical database server that runs a consistent workload with a large number of users has different requirements than a database server that has a light workload where that work is sparse throughout the day.

    [Figure: A POWER7 server divided by the Power Hypervisor into dedicated partitions (2, 10, and 1 CPU) and micro-partitions of 0.5 to 2.5 CPUs drawn from Shared Process Pool 0 and Shared Process Pool 1. Workload labels: OLTP workload, primary database server (24 x 7 x 365), batch loads, secondary database server, sales reports, business analytics, month-end processing.]


    Mission critical

    A mission critical database is one that your business relies on, where uptime and performance are most important. This type of database server typically requires dedicated resources, with service level agreements (SLAs) in place regarding throughput (transactions per second) and uptime.

    High throughput

    If your system has a high throughput (transactions per second), it is important to understand how and when that workload uses the Informix database server. Is the work a consistent workload that uses the database server 24 hours a day, or does the work come in bursts or at specific times throughout the day? For instance, you might have a database server that has a consistent, high workload from 8:30 a.m. to 5:30 p.m. during the business day, and a different database server that experiences a heavy workload from midnight to 4 a.m. Using a dedicated system for each of these database servers wastes resources when one database server or the other is idle.

    Low throughput

    Some database servers have a very low throughput on a regular basis, requiring very little processing power and resources. Perhaps data is loaded into the database server once a day or at scheduled intervals, and otherwise the database server is idle. The data might be used to run end-of-day processing reports or possibly month-end processing reports. Placing a database server like that onto a large dedicated server with many CPUs would be a waste of processing power.

    To configure a POWER7 system to handle the workload, it is important to know which database servers will require more resources than others, and when the workload will run on each database server. Knowing this information will help you decide how to configure LPARs. For example, a dedicated LPAR might be a good choice for a mission critical database server with a consistent, high throughput. However, a shared-resource LPAR might be a good option for a database server where the workload occurs at a specific time of day or night, and the server is idle the rest of the time.

    Simultaneous multithreading (SMT)

    Simultaneous multithreading is the ability of a single processor to simultaneously dispatch instructions from more than one hardware thread context. The Power architecture uses SMT to provide multiple streams of hardware execution, and the POWER7 processor can be configured to run in SMT4, SMT2, or SMT1 (single-threaded mode). By using multiple SMT threads, a workload can take advantage of more of the hardware features that are provided on the Power system. POWER6 and POWER5 support SMT2 or SMT1.

    IBM Informix has performed a series of benchmarks comparing SMT4 with SMT2 and SMT1. The results of these tests fall in line with industry benchmarks on POWER7 and with SMT testing.


    Current SMT mode

    To determine the current SMT mode, you can use one of the following AIX commands: lparstat, amepat, smtctl.

    # lparstat

    System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB

    %user  %sys  %wait  %idle
    -----  ----- ------ ------
      1.0    0.5    0.0   98.5

    The preceding sample output from the lparstat command shows that SMT4 is being used and that there are 128 logical CPUs. With four hardware threads per core, that means there are 128 / 4 = 32 physical CPUs.
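    The arithmetic above can be checked with a short script. The following is a minimal sketch, not an IBM tool: the function names are illustrative, and it assumes the "System configuration" line has the key=value layout shown in the sample, with a numeric smt value.

```python
import re


def parse_system_configuration(line):
    """Return the key=value pairs found on an lparstat 'System configuration' line."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def physical_cores(config):
    """Logical CPUs divided by SMT threads per core gives the physical core count."""
    lcpu = int(config["lcpu"])
    smt = int(config["smt"])  # assumes a numeric value such as 4, as in the sample
    if lcpu % smt != 0:
        raise ValueError("lcpu is not a multiple of the SMT thread count")
    return lcpu // smt


# Sample line copied from the lparstat output in this section.
sample = "System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513536MB"
config = parse_system_configuration(sample)
cores = physical_cores(config)
print(f"SMT{config['smt']}, {config['lcpu']} logical CPUs, {cores} physical cores")
```

    On a live system the line would come from the first lines of the lparstat output rather than from a string literal.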

    [Figure: Logical CPU layout of a 2-CPU LPAR (Core 1 and Core 2) in each SMT mode. SMT1 exposes lcpu0 and lcpu1, SMT2 exposes lcpu0 through lcpu3, and SMT4 exposes lcpu0 through lcpu7.]


    # amepat

    Command Invoked : amepat
    Date/Time of invocation : Mon Sep 23 13:07:51 CDT 2013
    Total Monitored time : NA
    Total Samples Collected : NA

    System Configuration:
    ---------------------
    Partition Name : shake
    Processor Implementation Mode : POWER7
    Number of Logical CPUs : 128
    Processor Entitled Capacity : 32.00
    Processor Max. Capacity : 32.00
    True Memory : 501.50 GB
    SMT Threads : 4
    Shared Processor Mode : Disabled
    Active Memory Sharing : Disabled
    Active Memory Expansion : Disabled

    ..

    # smtctl

    This system is SMT capable.
    This system supports up to 4 SMT threads per processor.
    SMT is currently enabled.
    SMT boot mode is not set.
    SMT threads are bound to the same physical processor.

    proc0 has 4 SMT threads.
    Bind processor 0 is bound with proc0
    Bind processor 1 is bound with proc0
    Bind processor 2 is bound with proc0
    Bind processor 3 is bound with proc0

    proc4 has 4 SMT threads.
    Bind processor 4 is bound with proc4
    Bind processor 5 is bound with proc4
    Bind processor 6 is bound with proc4
    Bind processor 7 is bound with proc4

    .

    proc124 has 4 SMT threads.
    Bind processor 124 is bound with proc124
    Bind processor 125 is bound with proc124
    Bind processor 126 is bound with proc124
    Bind processor 127 is bound with proc124


    Enabling SMT

    Simultaneous multithreading is set at the LPAR level. An SMT setting for a particular LPAR will not affect the settings for another LPAR.

    SMT can be enabled or disabled with the following smtctl command.

    smtctl -m {off|on}

    To set the SMT threads to 4, the following command can be used. This command affects the current LPAR only, and the change is immediate.

    smtctl -t 4

    By default, the SMT change does not persist after the LPAR is rebooted. For an SMT change to persist after a reboot, the boot image must be remade with the bosboot command. See the full man pages for the bosboot and smtctl commands.


    SMT1 vs SMT2 vs SMT4

    IBM Informix has performed extensive benchmark tests comparing results when using multiple threads on POWER7. This type of testing is possible even when SMT4 is configured, because the first thread on a core is used first, and as it nears full consumption, the second thread is used, and so on. When all four threads are in use on a core, we see an increase in overall throughput of approximately 60%. While the overall throughput increases, it is important to note that single-thread response time does not scale linearly as more threads are used per core.

    One core SMT testing

    The following graph shows transaction throughput for a single core when 1, 2, 3, and 4 threads were used. The transactions per minute (TPM) increased when each additional thread was used.

    H/W threads   Throughput (TPM)   Diff %
    1             32893.67
    2             46267.33           +40.7%
    3             51181.67           +10.6%
    4             53884.00           +5.3%
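    The per-thread differences in the table, and the overall gain of roughly 60% quoted earlier, can be reproduced from the TPM figures. The following sketch uses only the numbers from the table; the function names are illustrative.

```python
# TPM measured on one core with 1 to 4 hardware threads (values from the table above).
tpm_by_threads = {1: 32893.67, 2: 46267.33, 3: 51181.67, 4: 53884.00}


def percent_gain(new, old):
    """Percentage increase of new over old."""
    return (new - old) / old * 100.0


def incremental_gains(tpm):
    """Gain contributed by each additional thread, keyed by the new thread count."""
    threads = sorted(tpm)
    return {t: percent_gain(tpm[t], tpm[p]) for p, t in zip(threads, threads[1:])}


gains = incremental_gains(tpm_by_threads)
total = percent_gain(tpm_by_threads[4], tpm_by_threads[1])

for t, gain in gains.items():
    print(f"{t} threads: +{gain:.1f}% over {t - 1}")
print(f"4 threads vs 1 thread: +{total:.1f}%")
```

    The incremental gains shrink with each added thread, which is why throughput improves while single-thread response time does not scale linearly.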


    Multiple core SMT testing

    IBM Informix performed additional tests to measure throughput for SMT on multiple cores. The following graph shows the transaction throughput of 1, 2, 3, and 4 threads on 1 through 8 cores.

    Recommendation

    If you are most concerned about overall throughput for your Informix server, use SMT4, because using SMT4 can more fully utilize the core. While tests showed a 60% increase in throughput, keep in mind that single-thread response time does not scale linearly as more SMT threads are used. If you want to optimize for response time, you can start with SMT4 for the increased throughput, but if you see single-thread response time suffer, move to SMT2.


    Dedicated LPAR vs shared LPAR

    Virtualized environments offer many choices for deployment, such as dedicated or non-dedicated processor cores and memory, and micro-partitioning, which uses a fraction of a physical processor core. There are pros and cons to each approach, and it is important to understand how the LPAR is to be used and its expected workload.

    Dedicated LPAR

    A dedicated LPAR is one that gets a specific set of resources. It will not grab additional resources or release any of its resources. The following graphic shows three dedicated LPARs: LPAR #1 has 2 CPUs, LPAR #2 has 10 CPUs, and LPAR #3 has 1 CPU.

    One of the drawbacks of dedicated LPARs is that, if resources are over-allocated, you can have a situation where the CPUs are idle. At the same time, there could be another LPAR that has used all of its resources and could benefit from increased resources. For example, in the following graphic, LPAR #1 might be running at full capacity with CPU utilization near 100%, while LPAR #2 is running at 10% utilization. In that situation, LPAR #1 would benefit from using resources that LPAR #2 is not using.

    [Figure: Dedicated partitions of 2, 10, and 1 CPU alongside micro-partitions in two shared processor pools, all running on the Power Hypervisor; LPAR #1 and LPAR #2 are labeled.]


    Shared LPAR

    A shared LPAR, sometimes referred to as a non-dedicated LPAR, is an LPAR that is assigned a minimum set of resources, and that may use more resources from a shared pool, as needed, if the additional resources are available. This method also has pros and cons.

    For example, assume that you defined a set of shared LPARs as shown in the following graphic. If LPAR #1 consumes 100% of its 2.5 CPUs, it can use additional resources from the shared pool to allow increased throughput on that LPAR. However, if LPAR #2 pulls all the available resources from the shared pool, and LPAR #1 becomes 100% consumed, LPAR #1 will not be able to use additional resources, and its performance will suffer.

    The question is: Should you use shared LPARs or dedicated LPARs? IBM Informix tests show that a properly configured shared LPAR, in ideal conditions, can perform nearly as well as a dedicated LPAR (see the following graph). However, one of the benefits that you get with a dedicated LPAR is that the LPAR is much easier to configure and monitor, and it will give you consistent results. A shared LPAR has factors that are out of your control, and that might cause variations in throughput results.

    [Figure: Dedicated partitions and micro-partitions in two shared processor pools on the Power Hypervisor; LPAR #1 and LPAR #2 are labeled.]


    Capped or uncapped shared LPAR

    For a capped shared LPAR, the entitled capacity is the maximum number of processor cycles that can be used. An example would be creating a capped shared LPAR with an entitled capacity of 8 CPUs. That LPAR will not use more than 8 CPUs. If that LPAR uses only 2, the other 6 CPUs can be used by other uncapped shared LPARs.

    If an uncapped shared LPAR that has 8 CPUs entitled consumes all 8 CPUs, it can acquire more resources from the shared pool, and use more than its entitled capacity. It can use up to the number of online virtual processors that are defined for the LPAR.

    There are obvious throughput benefits to using an uncapped shared LPAR that can access the shared pool of processors. The LPAR must have enough virtual processors defined to take advantage of the idle processors in the shared pool.

    The following graph shows test results of two shared LPARs. One is a capped shared LPAR with an entitlement of 8 CPUs. The other is an uncapped shared LPAR with an entitlement of 8 CPUs, and 16 virtual processors defined to take advantage of the 8 CPUs that are in the shared processor pool.


    If you are using a dedicated LPAR, the simplest way to test a shared LPAR is to change the LPAR mode from dedicated to shared uncapped, and make the number of virtual processors and the entitled capacity equal to the number of CPUs that were allocated to the dedicated LPAR. This test provides the immediate advantage of allowing unused CPU cycles to be used by the shared processor pool.

    Virtual processors

    Virtual processors are similar to CPUs from an AIX operating system standpoint. That is, a virtual processor is a logical entity that is backed by physical processor cycles. The number of online virtual processors dictates the absolute maximum CPU consumption that an LPAR can achieve. If an LPAR has an entitlement of 2 CPUs, and you set up 4 virtual processors, the LPAR could consume up to 4 physical processors, in which case it would report a 200% CPU utilization.
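    The ceiling rules described for capped and uncapped shared LPARs can be written down as a small model. This is a sketch under simplifying assumptions: it ignores pool contention and capacity weights, and the function names are illustrative rather than part of any AIX interface.

```python
def max_physical_processors(entitled, online_vps, capped):
    """Upper bound on the physical processors a shared LPAR can consume.

    A capped LPAR stops at its entitled capacity; an uncapped LPAR can grow
    up to its number of online virtual processors if the pool has idle cycles.
    """
    if entitled <= 0 or online_vps <= 0:
        raise ValueError("entitled capacity and virtual processors must be positive")
    if online_vps < entitled:
        raise ValueError("an LPAR needs at least one virtual processor per entitled CPU")
    return float(entitled) if capped else float(online_vps)


def utilization_vs_entitlement(consumed, entitled):
    """CPU utilization expressed as a percentage of entitled capacity."""
    return consumed / entitled * 100.0


# Example from the text: 2 CPUs entitled, 4 virtual processors, uncapped.
ceiling = max_physical_processors(entitled=2, online_vps=4, capped=False)
print(ceiling, utilization_vs_entitlement(ceiling, 2))

# The two test LPARs described earlier: 8 entitled, capped vs uncapped with 16 VPs.
print(max_physical_processors(8, 8, capped=True))
print(max_physical_processors(8, 16, capped=False))
```

    The model makes the sizing point explicit: an uncapped LPAR with too few virtual processors cannot use idle pool capacity, no matter how much is available.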

    You can use the lparstat command to check the entitled capacity, the number of online virtual CPUs, and other parameters for an LPAR. The following examples show three lparstat command outputs. The first output is for a dedicated LPAR with 8 CPUs assigned to it. The second output is for a capped shared LPAR with 8 CPUs entitled, and the third output is for the same shared LPAR changed to an uncapped shared LPAR with 8 CPUs entitled and 16 virtual processors.


    Use the following command:

    # lparstat -i

    Dedicated (lparstat -i)

    Node Name : v1009h01
    Partition Name : v1009h01-b3p019-informix
    Partition Number : 4
    Type : Dedicated-SMT-4
    Mode : Capped
    Entitled Capacity : 8.00
    Partition Group-ID : 32772
    Shared Pool ID : -
    Online Virtual CPUs : 8
    Maximum Virtual CPUs : 12
    Minimum Virtual CPUs : 1
    Online Memory : 32768 MB
    Maximum Memory : 49152 MB
    Minimum Memory : 16384 MB
    Variable Capacity Weight : -
    Minimum Capacity : 1.00
    Maximum Capacity : 12.00
    Capacity Increment : 1.00
    Maximum Physical CPUs in system : 128
    Active Physical CPUs in system : 128
    Active CPUs in Pool : -
    Shared Physical CPUs in system : 0
    Maximum Capacity of Pool : 0
    Entitled Capacity of Pool : 0
    Unallocated Capacity : -
    Physical CPU Percentage : 100.00%
    Unallocated Weight : -
    Memory Mode : Dedicated

    Capped Shared LPAR (lparstat -i)

    Node Name : v1009h02
    Partition Name : v1009h02-b3p019-informix
    Partition Number : 6
    Type : Shared-SMT-4
    Mode : Capped
    Entitled Capacity : 8.00
    Partition Group-ID : 32774
    Shared Pool ID : 1
    Online Virtual CPUs : 8
    Maximum Virtual CPUs : 12
    Minimum Virtual CPUs : 1
    Online Memory : 32768 MB
    Maximum Memory : 49152 MB
    Minimum Memory : 16384 MB
    Variable Capacity Weight : 0
    Minimum Capacity : 0.10
    Maximum Capacity : 12.00
    Capacity Increment : 0.01
    Maximum Physical CPUs in system : 128
    Active Physical CPUs in system : 128
    Active CPUs in Pool : 16
    Shared Physical CPUs in system : 32
    Maximum Capacity of Pool : 1600
    Entitled Capacity of Pool : 1600
    Unallocated Capacity : 0.00
    Physical CPU Percentage : 100.00%
    Unallocated Weight : 0
    Memory Mode : Dedicated

    Uncapped Shared LPAR (lparstat -i)

    Node Name : v1009h02
    Partition Name : v1009h02-b3p019-informix
    Partition Number : 6
    Type : Shared-SMT-4
    Mode : Uncapped
    Entitled Capacity : 8.00
    Partition Group-ID : 32774
    Shared Pool ID : 1
    Online Virtual CPUs : 16
    Maximum Virtual CPUs : 16
    Minimum Virtual CPUs : 1
    Online Memory : 32768 MB
    Maximum Memory : 49152 MB
    Minimum Memory : 16384 MB
    Variable Capacity Weight : 128
    Minimum Capacity : 0.10
    Maximum Capacity : 16.00
    Capacity Increment : 0.01
    Maximum Physical CPUs in system : 128
    Active Physical CPUs in system : 128
    Active CPUs in Pool : 20
    Shared Physical CPUs in system : 41
    Maximum Capacity of Pool : 2000
    Entitled Capacity of Pool : 1600
    Unallocated Capacity : 0.00
    Physical CPU Percentage : 50.00%
    Unallocated Weight : 0
    Memory Mode : Dedicated
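    Output in this "name : value" layout is easy to parse when you want to compare LPAR settings in a script. The sketch below is illustrative only: the parser and the sample excerpts are ours, and the relation Physical CPU Percentage = Entitled Capacity / Online Virtual CPUs is our reading of the field, which matches all three outputs above (8/8 = 100%, 8/16 = 50%).

```python
def parse_name_value_report(text):
    """Parse 'Name : value' lines, as in the partition listings above, into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that carry no 'name : value' pair
            fields[key.strip()] = value.strip()
    return fields


def physical_cpu_percentage(fields):
    """Entitled capacity as a percentage of online virtual CPUs (assumed relation)."""
    entitled = float(fields["Entitled Capacity"])
    online_vps = int(fields["Online Virtual CPUs"])
    return entitled / online_vps * 100.0


# Excerpts of the capped and uncapped shared LPAR outputs shown above.
capped_report = """\
Mode : Capped
Entitled Capacity : 8.00
Online Virtual CPUs : 8
"""
uncapped_report = """\
Mode : Uncapped
Entitled Capacity : 8.00
Online Virtual CPUs : 16
"""

capped = parse_name_value_report(capped_report)
uncapped = parse_name_value_report(uncapped_report)
print(capped["Mode"], physical_cpu_percentage(capped))
print(uncapped["Mode"], physical_cpu_percentage(uncapped))
```

    A low percentage means that each virtual processor is backed by only a fraction of a physical core at entitlement, so the LPAR depends on idle pool capacity to run all of its virtual processors at full speed.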

    Processor folding

    Processor folding is a method of turning off unused virtual processors so that they are not scheduled to run and consume CPU cycles. If an LPAR has 8 CPUs entitled and 10 virtual processors, but the LPAR only requires 2.5 CPUs for the current workload, it will run on just 3 CPUs. The other 7 virtual processors are folded away, and when the workload dictates, the virtual processors are used again (unfolded).
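    The folding example can be expressed as a rounding rule. This is a deliberately simplified model of the behavior described above, not the actual AIX scheduler algorithm, which works from measured utilization; the function name is illustrative.

```python
import math


def unfolded_virtual_processors(required_cpus, online_vps):
    """Virtual processors left running when the rest are folded.

    Simplified model: round the required capacity up to whole virtual
    processors, keep at least one, and never exceed the online count.
    """
    needed = max(1, math.ceil(required_cpus))
    return min(needed, online_vps)


# Example from the text: 10 virtual processors, workload needs 2.5 CPUs.
online_vps = 10
active = unfolded_virtual_processors(2.5, online_vps)
folded = online_vps - active
print(f"{active} virtual processors active, {folded} folded")
```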

    To determine if processor folding is enabled, use the schedo command, and look for the current setting of the vpm_fold_policy parameter.


    schedo -L

    NAME                      CUR    DEF    BOOT   MIN    MAX    UNIT           TYPE
         DEPENDENCIES

    . . . .

    --------------------------------------------------------------------------------
    vpm_fold_policy           1      1      1      0      15                    D
    --------------------------------------------------------------------------------
    vpm_xvcpus                0      0      0      -1     2G-1   processors     D
    --------------------------------------------------------------------------------

    We see in the output that the vpm_fold_policy current value is set to 1. This is a bitmask setting, so a value of 1 indicates that folding is enabled if the LPAR is using shared processors. See the AIX documentation for all possible settings for this parameter.

    To disable processor folding regardless of the type of LPAR or power saving mode, set the vpm_fold_policy to 4 as shown in the following example.

    Example: schedo -p -o vpm_fold_policy=4
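    Because vpm_fold_policy is a bitmask, it can help to decode a value before changing it. The sketch below is illustrative. Only the meaning of the value 1 (folding on shared-processor LPARs) and the effect of the value 4 come from this paper; the meanings assigned to bits 0x2 and 0x4 are our assumptions about the AIX documentation and should be verified there before you rely on them.

```python
FOLD_ON_SHARED = 0x1       # from the text: folding enabled on shared-processor LPARs
FOLD_ON_DEDICATED = 0x2    # assumption: folding enabled on dedicated-processor LPARs
NO_AUTO_POWER_SAVE = 0x4   # assumption: no automatic folding in static power saving mode


def decode_vpm_fold_policy(value):
    """Split a vpm_fold_policy value into the flags modeled above."""
    if not 0 <= value <= 15:  # MIN and MAX columns of the schedo -L output
        raise ValueError("vpm_fold_policy must be between 0 and 15")
    return {
        "fold_on_shared": bool(value & FOLD_ON_SHARED),
        "fold_on_dedicated": bool(value & FOLD_ON_DEDICATED),
        "auto_fold_in_power_save_disabled": bool(value & NO_AUTO_POWER_SAVE),
    }


default_policy = decode_vpm_fold_policy(1)   # current value in the sample output
disabled_policy = decode_vpm_fold_policy(4)  # value used to disable folding
print(default_policy)
print(disabled_policy)
```

    Under these assumptions, the value 4 clears both folding bits and also prevents folding from being switched on automatically, which is consistent with the statement that it disables folding regardless of LPAR type or power saving mode.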

    The vpm_xvcpus parameter is used to determine the number of extra virtual processors to unfold when the system determines it needs to unfold a processor. For example, when the operating system needs to unfold a processor, if vpm_xvcpus is set to 3, the operating system unfolds 4 virtual processors.

    Example: schedo -p -o vpm_xvcpus=3

    Recommendation

    If the workload requires consistent performance with stringent latency requirements, it is best deployed on a dedicated partition rather than on a shared LPAR. In IBM Informix tests, a dedicated LPAR provided the most consistent performance.

    The exception to using a dedicated LPAR would be when the shared processor pool is neither over-committed nor over-utilized.

    Use processor folding for more efficient use of the cores. However, disable folding if you see a problem with processor folding, or if you see an excessive amount of folding. Processor folding is dynamically configurable, so test it in peak-load and low-load scenarios. If you choose to use processor folding, set the vpm_xvcpus parameter to 3. That setting helps avoid any penalties from unfolding one virtual processor at a time.


    Virtual I/O Server (VIOS) LPAR

    A Virtual I/O Server (VIOS) is a special LPAR that has additional software installed for the purpose of managing the I/O for other LPARs. Instead of the individual network and disk resources being carved out on an LPAR-by-LPAR basis, the VIOS manages the disk and network resources on behalf of the other LPARs. The size of the VIOS is important.

    Recommendation

    The VIOS must be a dedicated LPAR, and it is recommended that you disable processor folding for the VIOS. The size of the VIOS is important. Refer to the AIX documentation for proper configuration requirements. There is also a VIOS advisor that you can use, which can provide recommendations for your VIOS configuration.

    http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor


    Additional LPAR recommendations

    Shared resource LPARs are very dynamic in nature. There are various performance tools that can be used to help improve allocation and placement of resources on the physical machine and within the LPARs.

    The Dynamic Platform Optimizer (DPO) is a PowerVM feature that you can use to improve partition memory and processor affinity across the logical partitions in a Power Server. DPO is a feature that can help you reap performance gains for the IBM Informix server.

    Active System Optimizer (ASO) is a subsystem that is designed to automatically improve the performance of AIX workloads running on POWER7. Dynamic System Optimizer (DSO) is built on the ASO framework and provides additional optimizations.

    Recommendation

    We recommend that the System Administrator work with AIX and use DPO and ASO/DSO to optimize workloads for Informix. See the following IBM Redbooks publication, IBM PowerVM Virtualization Managing and Monitoring, for details.

    http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html


    Memory considerations

    Using larger virtual memory page sizes for an application's memory can significantly improve the application's performance and throughput. The improvement in system performance stems from the reduction of Translation Lookaside Buffer (TLB) misses, because the TLB can map a larger virtual memory range. Starting with the POWER4 processor, support for 16 MB large pages was introduced in addition to the default 4 KB pages. To use large pages on hardware where multiple page sizes are supported, run AIX 5L™ Version 5.3 updated with the 5300-04 Maintenance Package (or later).

    Starting with version 11.50.xC4, Informix supports 16 MB pages. AIX does not automatically configure large pages in the environment. The system administrator must configure AIX to use these page sizes, and must specify the number of pages to be reserved. The number of configured large pages will not be automatically changed by the operating system based on demand.

    We will look at the 64 KB pages, which are dynamically allocated by the operating system on an as-needed basis, making them simpler to use than the 16 MB large page size. (Starting with POWER5+™ hardware, huge 16 GB pages are also supported.)

    IBM Informix performed tests to compare the results of 4 KB page sizes, 64 KB page sizes, and 16 MB page sizes. The results of these tests are discussed later in this section.

    RESIDENT parameter

    The RESIDENT parameter in the Informix configuration file ($ONCONFIG) must be taken into account when you plan memory usage on AIX. For reference, here are the comments from the onconfig.std file.

    ###################################################################
    # Shared Memory Configuration Parameters
    ###################################################################
    # RESIDENT - Controls whether shared memory is resident.
    # Acceptable values are:
    # 0 off (default)
    # 1 lock the resident segment only
    # n lock the resident segment and the next n-1
    # virtual segments, where n < 100
    # -1 lock all resident and virtual segments

    On AIX systems with a lot of allocated pinned resident memory, when Informix uses kernel asynchronous I/O (KAIO) or direct I/O, Informix might experience KAIO read or write failures with errno 22 (EINVAL).


    Example:

    04:30:40 KAIO: error in kaio_WRITE, kaiocbp = 0x22b620d0, errno= 22
    04:30:40 fildes = 258 (gfd 3), buf = 0x700000122b64000, nbytes= 4096, offset = 130785280

    Setting the RESIDENT configuration parameter to -1 is not recommended on AIX. As Informix allocates more memory, the database server attempts to pin that memory, and that results in a higher likelihood of seeing an error. With Informix 11.70.FC6 and later, a warning message is displayed in the database server message log (online.log) if resident memory is used with KAIO or direct I/O.

    4 KB memory page size

    The default page size on AIX is 4 KB. The results of the IBM Informix tests with 64 KB and 16 MB page sizes are compared against that default.

    16 MB memory page size

    Before Informix can start to use large pages, the pages must be allocated by the System Administrator. The following command example allocates 3072 large pages.

    vmo -p -o lgpg_regions=3072 -o lgpg_size=16777216
    vmo -p -o v_pinshm=1

    This command can take a while to process. Monitor the number of allocated 16 MB pages with the vmstat command. In the following output, there are 0 (avm) 16 MB pages active.

    vmstat -P ALL 5

    System configuration: mem=513536MB

    pgsz            memory                           page
    ----- -------------------------- ------------------------------------
               siz      avm      fre   re   pi   po   fr   sr   cy
       4K 54844752  9497936   799539    0    0    0    0    0    0
      64K   594475   526063    68412    0    0    0    0    0    0
      16M    16384        0    16384    0    0    0    0    0    0

       4K 54844752  9497948   789981    0    0    0    0    0    0
      64K   594475   526062    68413    0    0    0    0    0    0
      16M    16384        0    16384    0    0    0    0    0    0

    After the 16 MB pages are allocated, you must bring the Informix server offline, set the IFX_LARGE_PAGES environment variable, and then bring the instance back online.

    export IFX_LARGE_PAGES=1
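As a rough sizing aid for the lgpg_regions value used in the vmo example earlier in this section, the following Python sketch converts a memory size into a count of 16 MB pages. The function name and the 48 GiB figure are illustrative assumptions, not values from the IBM tests; the only fact taken from the text is the 16 MB (16777216 byte) page size.

```python
LARGE_PAGE_BYTES = 16 * 1024 * 1024  # same value as lgpg_size=16777216


def large_pages_needed(shared_memory_bytes: int) -> int:
    """Return how many 16 MB pages are needed to cover the given size.

    Hypothetical helper: rounds up so that a partial page still counts.
    """
    if shared_memory_bytes < 0:
        raise ValueError("memory size must not be negative")
    return (shared_memory_bytes + LARGE_PAGE_BYTES - 1) // LARGE_PAGE_BYTES


if __name__ == "__main__":
    gib = 1024 ** 3
    # Assumed example: 48 GiB of Informix shared memory.
    print(large_pages_needed(48 * gib))
```

Under this assumption, 48 GiB corresponds to the 3072 regions shown in the vmo example. How much Informix shared memory to back with large pages remains a site-specific decision.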


    The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with 16 MB AIX page size. The 4 KB test resulted in 255,491 TPM. The 16 MB test resulted in 270,254 TPM. The use of 16 MB page size produced a 5.77% gain in performance.

    64 KB memory page size

    Starting with POWER5+ hardware, there is also support for 64 KB page sizes. The 64 KB pages are dynamically allocated by the operating system on an as-needed basis, making the use of 64 KB pages simpler because no pre-allocation has to occur.

    Take the following steps to enable 64 KB page sizes for Informix.

    1. Bring the Informix instance offline.

    2. Set the LDR_CNTRL environment variable.

    export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K

    3. Bring the Informix instance online.

    4. Unset the LDR_CNTRL environment variable.

    unset LDR_CNTRL

    The reason for unsetting the LDR_CNTRL environment variable is to avoid the unintended use of 64 KB pages for applications that might start from the same terminal.


    The following data is from a test performed by IBM Informix comparing 4 KB AIX page size with 64 KB AIX page size. The 4 KB test resulted in 255,491 TPM. The 64 KB test resulted in 271,052 TPM. The use of 64 KB page size produced a 6.09% gain in performance.

    The 64 KB test results show a better performance gain than the test results for 16 MB large pages. The benefits of the 16 MB large pages can become more evident as the size of the database and memory usage grows for that database.
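The percentage gains quoted for the page-size tests follow from simple arithmetic on the TPM figures. The short Python sketch below reproduces that calculation; the function name is illustrative, and the only inputs are the three TPM results reported in this section.

```python
def percent_gain(baseline_tpm: float, test_tpm: float) -> float:
    """Return the relative gain of test_tpm over baseline_tpm, in percent."""
    if baseline_tpm <= 0:
        raise ValueError("baseline must be positive")
    return (test_tpm - baseline_tpm) / baseline_tpm * 100.0


# TPM results reported in this section, keyed by AIX page size.
RESULTS_TPM = {"4 KB": 255_491, "16 MB": 270_254, "64 KB": 271_052}

if __name__ == "__main__":
    baseline = RESULTS_TPM["4 KB"]
    for page_size in ("16 MB", "64 KB"):
        gain = percent_gain(baseline, RESULTS_TPM[page_size])
        print(f"{page_size}: {gain:.2f}% over 4 KB")
```

The same helper can be applied to your own benchmark numbers when you compare page sizes on an individual system.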

    Recommendation

    IBM Informix recommends using 64 KB page sizes. The simplicity of use, the dynamic nature, and results that are similar to that of 16 MB large pages drive this recommendation. In a very large database the larger 16 MB page sizes might produce better performance gains, but this needs to be tested on an individual basis.

    When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Set it to 0. Setting it to 1 or 2 might also be in order, but the System Administrator will need to monitor the pinned memory to make sure that it does not exceed 80% of the physical memory on the computer or LPAR.
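The 80% guideline for pinned memory can be expressed as a simple check. The following Python sketch is only an illustration: the function name and the sample byte counts are assumptions, and obtaining the pinned and physical memory figures from AIX is left to the monitoring tool of your choice.

```python
def pinned_memory_ok(pinned_bytes: int, physical_bytes: int,
                     limit: float = 0.80) -> bool:
    """Return True if pinned memory is within the given fraction of physical memory."""
    if physical_bytes <= 0:
        raise ValueError("physical memory must be positive")
    return pinned_bytes <= limit * physical_bytes


if __name__ == "__main__":
    gib = 1024 ** 3
    # Hypothetical LPAR: 512 GiB physical memory, 300 GiB pinned.
    print(pinned_memory_ok(300 * gib, 512 * gib))
```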


    Feedback Directed Program Restructuring (FDPR) Tool

    The FDPR tool for AIX is included with AIX 5L operating system V5.2 and later. FDPR is used as a post-link utility for improving the performance of binaries that were compiled on the Power family platform. It optimizes the binary to achieve a better hit/miss i-cache ratio, reduces the number of branches, and reduces TLB misses and page faults. This tool is useful for very large programs or those that use dynamically linked libraries.

    From the man page:

    The fdpr command (Feedback Directed Program Restructuring) is a performance-tuning utility that may help improve the execution time and the real memory utilization of user-level application programs. The fdpr program optimizes the executable image of a program by collecting information on the behavior of the program while the program is used for some typical workload, and then creating a new version of the program that is optimized for that workload. The new program generated by fdpr typically runs faster and uses less real memory. Attention: The fdpr command applies advanced optimization techniques to a program which may result in programs that do not behave as expected; programs which are optimized using this tool should be used with due caution and should be rigorously retested with, at a minimum, the same test suite used to test the original program in order to verify expected functionality. The optimized program is not supported.

    The following steps outline how to determine if the FDPR tool can optimize the oninit executable program that runs Informix. These steps do not describe the complete optimization process; they only outline steps that you can use to test the optimization.

    1. Create a script to set the correct environment, run the oninit command, and run the workload. FDPR expects to find the executable not running and in fact replaces it with an instrumented version before startup. Because oninit is a setuid executable, and the SUID information is not in the replacement executable, you must set the SUID and correct owner mask before the script starts oninit:

    chown root:informix ${INFORMIXDIR}/bin/oninit
    chmod 6755 ${INFORMIXDIR}/bin/oninit

    2. Make sure that Informix is fully configured for the benchmark, and that the data is already loaded. The script should execute only the workload portion of the benchmark.

    3. Keep in mind that immediately after Informix starts, no data is cached, and so I/O activity is higher than normal. Make sure that the run time is adjusted so that FDPR can see the product performing for a considerable amount of time after the cache is warmed up.


    [Figure: Using FDPR. Chart of throughput (tpm, axis from 0 to 325,000) for the baseline and FDPR-optimized oninit binaries.]

    4. Run FDPR in the same directory where the oninit executable file resides, or else FDPR could experience problems renaming or re-linking the executable. If the workload script uses external drivers to run the benchmark, make sure that the drivers are in the execution path, and that all output goes to the specified location.

    Next, run the following command, where "/tpcc/fdpr-workload" is the script that loads the RDBMS:

    ( cd $INFORMIXDIR/bin ; timex fdpr -p oninit -x /tpcc/fdpr-workload )

    5. Some versions of FDPR might lose information about the optimization level that is used during product build. The workaround for this is simple: Pass the optimization level to FDPR on the command line (see man page for details).

    Below are the results of our benchmark comparing the original Informix (baseline) with the results obtained after the oninit binary had been optimized by FDPR. The following steps were taken. Points were selected from two sets (baseline, optimized) where throughput was measured as a function of the number of user terminals. Both the baseline set and the one for the FDPR-optimized binary were collected in a configuration that used 64 KB pages. Maximum throughput in both sets was achieved with 32 active terminals.

    The following data is from IBM Informix test results using 64 KB page size and testing the results on non-optimized versus FDPR optimized binary. The non-optimized binary produced 270,526 TPM and the FDPR optimized binary produced throughput of over 300,000 TPM. This amounted to a 13% increase in performance.


    Recommendation

    IBM Informix recommends testing FDPR in a non-production environment using a production load. Using FDPR can reap performance gains, and that tool should be tested thoroughly before being used in a production environment.

    For more information, see the following two documents: Feedback Directed Program Restructuring (FDPR) and AIX 5L Performance Tools Handbook (Redbook).

    https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/

    http://www.redbooks.ibm.com/abstracts/sg246039.html


    I/O subsystem

    The I/O subsystem is a key factor in a well-performing database server. A properly configured I/O subsystem will allow maximum I/O throughput by the database server. A poorly configured I/O subsystem can have major negative impacts on the database server. It might or might not be obvious where a problem resides in a poorly performing database server.

    Read/Write access times

    A good read-and-write access time is determined in part by the technology of the storage that is being used. A typical I/O should take from 0 milliseconds (ms) to 15 ms. I/Os that take longer than 15 ms might indicate a problem with the I/O system or a busy device. When a solid-state drive (SSD) is used, the I/O will typically be less than 2-3 ms. Again, I/Os taking longer might indicate a problem or a busy device. SSD might even produce results less than 1 ms, moving into the microsecond range.

    You can monitor the access times from Informix or at the operating system level.

    In Informix, you would review the onstat -g iof data and the onstat -g ioh data. The onstat -g iof data will show the chunks and a summary of the response times since the instance has been online. In the following output, we see the average read service time for KAIO is 9.8 ms.

    onstat -g iof

    IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 23:09:01 -- 590776 Kbytes

    AIO global files:
    gfd pathname     bytes read   page reads  bytes write  page writes  io/s
    3   rootdbs.1    6187008      3021        4007526400   1956800      823.1

        op type      count        avg. time
        seeks        0            N/A
        reads        0            N/A
        writes       0            N/A
        kaio_reads   2360         0.0098
        kaio_writes  212009       0.0011
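To make the avg. time column easier to compare against the millisecond guidelines given earlier, the following Python sketch parses the op type rows of an onstat -g iof report and converts the averages to milliseconds. It is a minimal illustration, assuming the three-column row layout shown in the sample output and that the averages are reported in seconds (which matches the 9.8 ms reading of 0.0098); the function name is hypothetical.

```python
SAMPLE = """\
op type      count        avg. time
seeks        0            N/A
reads        0            N/A
writes       0            N/A
kaio_reads   2360         0.0098
kaio_writes  212009       0.0011
"""


def parse_op_times(text: str) -> dict:
    """Map each op type to (count, average time in ms or None for N/A)."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric count;
        # the header and any other lines are skipped.
        if len(parts) != 3 or not parts[1].isdigit():
            continue
        op, count, avg = parts
        avg_ms = None if avg == "N/A" else float(avg) * 1000.0
        result[op] = (int(count), avg_ms)
    return result


if __name__ == "__main__":
    for op, (count, avg_ms) in parse_op_times(SAMPLE).items():
        shown = "N/A" if avg_ms is None else f"{avg_ms:.1f} ms"
        print(f"{op}: {count} operations, average {shown}")
```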

    Informix also provides this output with a historical viewpoint going back one hour. This output is a better way to monitor I/O because it summarizes the data for the past hour on a per-minute basis.

    onstat -g ioh

    IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 00:03:56 -- 525240 Kbytes

    AIO global files:
    gfd pathname     bytes read   page reads  bytes write  page writes  io/s
    3   rootdbs.1    1073152      524         386072576    188512       457.1

                        avg read                  avg write
        time      reads io/s  op time     writes  io/s  op time
        13:54:14  6     0.1   0.02417     1772    29.5  0.00185
        13:53:14  47    0.8   0.08274     2010    33.5  0.00141
        13:52:14  216   3.6   0.00214     378     6.3   0.00097

    From the operating system standpoint, you can use the iostat command to monitor I/O throughput. The following iostat command will take two samples 5 seconds apart.

    Example: iostat -D 5 2

    hdisk79  xfer:   %tm_act      bps      tps    bread    bwrtn
                        13.3     1.6K      0.4     1.6K      0.0
             read:       rps  avgserv  minserv  maxserv  timeouts  fails
                         0.4    331.6     27.0    836.7         0      0
             write:      wps  avgserv  minserv  maxserv  timeouts  fails
                         0.0      0.0      0.0      0.0         0      0
             queue:  avgtime  mintime  maxtime  avgwqsz   avgsqsz  sqfull
                         0.0      0.0      0.0      0.0       0.0     0.0

    KAIO/DIRECT_IO

    Kernel Asynchronous I/O (KAIO) is enabled by default and will be used for raw disk space. It provides performance gains over regular I/O. AIX also supports direct I/O and concurrent I/O for file system access. Informix supports those types of file system access with the DIRECT_IO configuration parameter.

    AIX only supports concurrent I/O on JFS2 file systems. Direct I/O is similar to using KAIO for a file system. Concurrent I/O adds functionality by avoiding unnecessary write serialization. For reference, here are the comments for the DIRECT_IO configuration parameter from the onconfig.std file.

    # DIRECT_IO - Specifies whether direct I/O is used for cooked
    # files used for dbspace chunks.
    # Acceptable values are:
    # 0 Disable
    # 1 Enable direct I/O
    # 2 Enable concurrent I/O

    To determine what type of I/O is being used, review the following output. To check on basic KAIO, run the onstat -g ath command, and look for kaio threads. For example, the following output shows 1 kaio thread.

    onstat -g ath | grep kaio

    18 60f0a360 0 3 IO Idle 1cpu* kaio


    To check for direct I/O or concurrent I/O, run the onstat -d command and look at the flags listed in column 5 for each chunk. A value of D represents direct I/O and C represents concurrent I/O.

    onstat -d

    IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 17 days 22:55:45 -- 590776 Kbytes

    .

    Chunks
    address  chunk/dbs offset size    free    bpages flags  pathname
    5ff081d0 1     1   0      5000000 4928731        PO-B-D /chunks/IDSPERF/rootdbs.1

    To disable KAIO completely, you can use the KAIOOFF environment variable. Prior to bringing the Informix instance online, set KAIOOFF to 1.

    export KAIOOFF=1

    Queue depth

    From an application standpoint (database server), the length of time to do an I/O equals the time to service the I/O plus the time that the I/O waits in the hard disk (hdisk) wait queue. Each hdisk has an associated queue depth setting and, if this setting is poorly configured, it can have negative impacts on I/O throughput. Use the lsattr command to check the current setting for a device.

    lsattr -El hdisk6 | grep queue
    queue_depth 16 Queue DEPTH True

    The faster the drive, the more I/O operations per second (IOPS) that a disk can handle. The maximum throughput will be limited by the queue depth divided by the average I/O service time. For example, a queue depth of 3 and an average I/O service time of 10 ms yield a maximum throughput of 300 IOPS.
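The throughput limit described above is a single division. The following Python sketch shows the calculation for the example in the text; the function name is illustrative, and the second call uses assumed values (a queue depth of 16 and a 5 ms service time) purely to show how the ceiling scales.

```python
def max_iops(queue_depth: int, avg_service_time_ms: float) -> float:
    """Upper bound on IOPS for one hdisk: queue depth / average service time."""
    if queue_depth <= 0 or avg_service_time_ms <= 0:
        raise ValueError("queue depth and service time must be positive")
    # Multiply before dividing so that millisecond inputs stay exact.
    return queue_depth * 1000.0 / avg_service_time_ms


if __name__ == "__main__":
    # Example from the text: queue depth 3, 10 ms average service time.
    print(max_iops(3, 10.0))
    # Assumed values for comparison: queue depth 16, 5 ms service time.
    print(max_iops(16, 5.0))
```

This is a ceiling, not a prediction: a device that cannot sustain the implied rate will show rising service times instead.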

    You can use the iostat -D command to monitor the service times as well as the queue times. If you start to see time spent waiting in the queue, you might want to increase the queue depth for a specific device.

    In the following iostat output, we see that we are spending an average of 3 ms in the queue, and had a max of 9 ms wait time in the queue.


    iostat -D 2 2

    System configuration: lcpu=128 drives=9 paths=8 vdisks=0

    hdisk6   xfer:   %tm_act      bps      tps    bread    bwrtn
                        63.5     5.0M    236.5      0.0     5.0M
             read:       rps  avgserv  minserv  maxserv  timeouts  fails
                         0.0      8.0      0.0     18.0         0      0
             write:      wps  avgserv  minserv  maxserv  timeouts  fails
                       236.5      5.3      1.1     71.0         0      0
             queue:  avgtime  mintime  maxtime  avgwqsz   avgsqsz  sqfull
                         3.0      0.0      9.0      0.0       1.0    12.0

    To change the queue depth, use the chdev command.

    Example: chdev -l hdisk66 -a queue_depth=32

    For more information regarding queue depth, monitoring, and configuration, see the following article: AIX Disk Queue Depth Tuning for Performance.

    https://www-03.ibm.com/support/techdocs/atsmastr.nsf/WebIndex/TD105745

    AIO servers

    AIO is an AIX software subsystem that allows processes to issue I/O operations without waiting for I/O to complete. This feature is particularly important in a database environment.

    A kernel process (kproc), called an AIO server (AIOS), is in charge of each request from the time that the request is taken off the queue until it completes. The number of servers limits the number of disk I/O operations that can be in progress in the system simultaneously. The default value of minservers is 3 and maxservers is 30. When more than 3 servers are needed, they will automatically be allocated, up to the maxservers value.

    There is also an aio_server_inactivity tunable parameter that indicates the duration of inactivity before the inactive AIO servers are stopped. The stopping of these AIO servers can occur down to the minservers value. The default value for aio_server_inactivity is 300 seconds.

    To view the current settings use the ioo command.

    # ioo -a |grep aio_

    aio_maxreqs = 8192
    aio_maxservers = 30
    aio_minservers = 3
    aio_server_inactivity = 300

    These defaults with AIX 6.1 can cause slow I/O, and the issue might not be easily identifiable. With these low defaults, a situation can occur where more than 3 AIO servers are needed to process a peak load of I/O activity. For example, during low I/O activity, the AIO servers can be taken down to the minimum of 3. But when a checkpoint occurs, more AIO servers are required. This situation can generate extra overhead each time the peak load of I/O activity occurs, because starting the AIO servers incurs an extra cost.

    To change these parameters use the ioo command.

    Example:

    ioo -p -o aio_maxreqs=65536 -o aio_minservers=100 -o aio_server_inactivity=86400

    Recommendation

    IBM Informix recommends the use of KAIO for raw devices. For file system chunks, set the DIRECT_IO configuration parameter to 2. This setting permits direct I/O on file system chunks and allows for concurrent I/O on JFS2 file systems.

    Informix also recommends monitoring the queue depth, with a minimum setting of 16. If monitoring shows wait times in the queue, increase the queue depth accordingly.

    Set the aio_minservers and aio_maxservers parameters to 100. Set the aio_server_inactivity parameter to 86400, which represents a 24-hour period of inactivity before any extra servers are taken down. Also, set the aio_maxreqs parameter to 65536.


    Network subsystem

    TCP traffic

    The network layer can have performance implications based on the amount of data that needs to be sent across the network and whether the network is poorly configured. If you suspect that a slow network is the cause of a performance problem, you can use the following methods to test network throughput.

    Run the slow-performing test or SQL locally, and compare the results against the same test or SQL that is run remotely, where the result set has to be returned over the network.

    Use scp or ftp commands to test throughput speed by sending a large file and monitoring its throughput. If the network is not producing the amount of throughput that is expected, make sure that the TCP window size is configured properly.

    The following TCP parameters can affect network performance: tcp_recvspace, tcp_sendspace, and rfc1323. The tcp_recvspace parameter specifies the number of bytes that the receiving system can buffer in the kernel. The tcp_sendspace parameter specifies the number of bytes that the sending system can buffer in the kernel. The rfc1323 parameter enables the TCP window scaling option.
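To see why the window size matters, recall the general TCP property that a single connection cannot have more than one window of unacknowledged data in flight per round trip, and that windows above 65,535 bytes require window scaling. The Python sketch below illustrates both points; the function names and the 1 ms round-trip time are assumptions for illustration, not measurements from the IBM tests.

```python
MAX_UNSCALED_WINDOW = 65535  # largest TCP window without window scaling


def max_tcp_throughput_bytes_per_s(window_bytes: int, rtt_ms: float) -> float:
    """Ceiling on single-connection throughput: one window per round trip."""
    if window_bytes <= 0 or rtt_ms <= 0:
        raise ValueError("window and round-trip time must be positive")
    return window_bytes * 1000.0 / rtt_ms


def needs_window_scaling(window_bytes: int) -> bool:
    """Return True if the window only works with window scaling (rfc1323) enabled."""
    return window_bytes > MAX_UNSCALED_WINDOW


if __name__ == "__main__":
    window = 262144   # 256 KB buffer size
    rtt_ms = 1.0      # assumed round-trip time
    print(needs_window_scaling(window))
    print(max_tcp_throughput_bytes_per_s(window, rtt_ms) / 1e6, "MB/s ceiling")
```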

    Local loopback

    The fastpath loopback option is used to achieve better performance for loopback traffic. The tcp_fastlo network parameter permits TCP loopback traffic to shorten its path through the TCP/IP stack to achieve better performance.

    To display the current setting for tcp_fastlo:

    no -a | grep tcp_fastlo

    The tcp_fastlo parameter is disabled by default (value of 0). To set the parameter, use the no command. The -p option applies the changes to both current and reboot values.

    no -p -o tcp_fastlo=1

    IBM Informix tests have shown an increase from ~350,000 TPM to ~520,000 TPM for local loopback testing. This is an increase of ~50%.


    Recommendation

    IBM Informix recommends contacting AIX Software Support to discuss any throughput issues with the network. The following changes might increase throughput but should not be tried before discussing them with AIX Software Support.

    If network throughput becomes an issue, consider increasing the TCP window size to 256 KB with the following commands. Consult AIX Software Support to discuss these changes.

    ifconfig en11 rfc1323 1 tcp_nodelay 1 tcp_sendspace 262144 tcp_recvspace 262144
    chdev -l en11 -a rfc1323=1 -a tcp_nodelay=1 -a tcp_sendspace=262144 -a tcp_recvspace=262144 -P

    If using a local loopback connection, enable tcp_fastlo for performance improvements in the TCP loopback traffic.

    no -p -o tcp_fastlo=1


    Number of CPU virtual processors

    The question of how many CPU virtual processors (VPs) should be configured has been an interesting problem on POWER7, specifically when SMT2 or SMT4 is used. The important thing to understand when sizing the number of CPU VPs on a system is that turning on SMT2 or SMT4 does not give you 2x or 4x CPU power, respectively. Configuring this number properly with Informix is important, and the number depends on the type of work that the Informix database will do. Is the workload more CPU intensive, or is the workload more I/O intensive? These are some of the questions that need to be understood to properly size the system.

    CPU-intensive workload

    If the Informix server is performing a more CPU-intensive workload, set the number of CPU VPs to 1.5x the number of physical CPUs allocated to the LPAR. On a POWER7 LPAR with 32 cores and SMT4 enabled (128 logical CPUs), a good starting point is 48 CPU VPs. Use the VPCLASS configuration parameter to specify the number of CPU VPs that Informix will use when first bringing the database server online. For reference, here are the comments for VPCLASS from the onconfig.std file.

    # VPCLASS cpu - Configures the CPU VPs. The format is:
    # VPCLASS cpu, num=,

    VPCLASS cpu,num=48,noage

    Monitor the Informix engine to make sure that all 48 CPU VPs are needed, and decrease the number if necessary. To change the number of CPU VPs, you update the value of the VPCLASS configuration parameter, and the value takes effect after you stop and then start the Informix server. Use the onstat -g glo command to determine if there are CPU VPs that are not being used. In the following example, some of the latter CPU VPs have only clocked 5-6 minutes of CPU time over nearly 58 days of being online. In that case, consider decreasing the number of CPU VPs.

    IBM Informix Dynamic Server Version 11.70.FC2 -- On-Line -- Up 57 days 23:42:02 -- 28512384 Kbytes

    Virtual processor summary:
    class  vps  usercpu      syscpu      total
    cpu    72   20609362.66  1415471.47  22024834.13

    Individual virtual processors:
    vp  pid       class  usercpu     syscpu     total       Thread      Eff
    1   9044394   cpu    1318982.55  126036.12  1445018.67  3471009.49  41%
    2   4980998   adm    2021.36     1038.05    3059.41     0.00        0%

    19  18808928  cpu    1513914.43  106858.67  1620773.10  3952966.80  41%
    20  6750704   cpu    1479419.71  106683.61  1586103.32  3858124.02  41%

    87  16515902  cpu    273.03      93.09      366.12      4967.90     7%
    88  11075584  cpu    251.17      91.13      342.30      4876.44     7%

    If, however, you determine that the Informix server is constantly seeing threads in the ready queue waiting to run, consider increasing the number of CPU VPs.

    onstat -g rea

    IBM Informix Dynamic Server Version 12.10.FC1 -- On-Line -- Up 3 days 01:55:42 -- 558008 Kbytes

    Ready threads:
    tid       tcb             rstcb           prty status vp-class name
    194655974 700000abab6b028 700000b7e0d9ae0 1    ready  32cpu    sqlexec
    195234123 700000a71e78568 700000af4e67920 1    ready  36cpu    sqlexec
    195317254 700000b2516e028 700000bd5045b70 1    ready  31cpu    srvinfx
    195372610 700000ac7f422a0 700000af4e90bb0 1    ready  34cpu    sqlexec
    195425354 700000aadeb4d20 700000a9015b8a0 1    ready  1cpu     srvinfx
    195426222 700000b53cae0d0 700000b7e0d3a80 1    ready  32cpu    scan_3.0

    Adding CPU VPs can be done dynamically with the onmode command. The following command adds 5 CPU VPs.

    onmode -p +5 cpu

    Keep in mind that threads 2-4 in SMT do not scale linearly, so although the total throughput will increase as threads 2-4 are used, single-thread response time might suffer. For this reason, when increasing the number of CPU VPs, test to find the best setting for your specific environment.

    I/O-intensive workload

    If the Informix server is performing a workload heavy on I/O, overload the CPU VPs a bit more. In the previous example with 32 cores (SMT4), a good place to start is 3x the number of physical cores, or 96 CPU VPs.

    As described earlier, monitor the CPU clock time to determine if the VPs are over-configured, and monitor the ready queue to see if more are needed. Also, monitor the user threads, and check to see if there are a lot of threads consistently waiting on I/O.


    onstat -g ath | grep "IO Wait"

    194806496 700000aa4534568 700000af4e751f8 1 IO Wait 48cpu sqlexec
    194812270 700000af6fe3c68 700000b88f07108 1 IO Wait 49cpu sqlexec
    195228271 700000b21d4eb18 700000a9016d1b8 1 IO Wait 24cpu sqlexec
    195234152 700000ac13eab20 700000b88f17a10 1 IO Wait 26cpu sqlexec
    195288699 700000abfe7cb10 700000b7e0d3a80 1 IO Wait 39cpu sqlexec
    195293668 700000adf34d930 700000b848189d0 1 IO Wait 37cpu sqlexec

    If this is a consistent characteristic, increasing the number of CPU VPs might help performance throughput.

    Recommendation

    Monitor the system workload:

    For CPU-intensive workloads, use a starting point for the number of CPU VPs at 1.5x the number of physical CPUs in the LPAR.

    For I/O-intensive workloads, use a starting point for the number of CPU VPs at 3x the number of physical CPUs in the LPAR.
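The two starting points above can be captured in a small helper. The Python sketch below is an illustration only: the function names and the workload labels "cpu" and "io" are hypothetical, rounding up for odd core counts is an assumption, and the generated line simply mirrors the VPCLASS example shown earlier in this section.

```python
import math

# Multipliers taken from the recommendation above.
VP_FACTORS = {"cpu": 1.5, "io": 3.0}


def starting_cpu_vps(physical_cores: int, workload: str) -> int:
    """Return a starting CPU VP count for the given core count and workload type."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    if workload not in VP_FACTORS:
        raise ValueError("workload must be 'cpu' or 'io'")
    # Assumption: round up when the product is not a whole number.
    return math.ceil(physical_cores * VP_FACTORS[workload])


def vpclass_line(num_vps: int) -> str:
    """Format a VPCLASS entry in the same form as the example in this section."""
    return f"VPCLASS cpu,num={num_vps},noage"


if __name__ == "__main__":
    for workload in ("cpu", "io"):
        vps = starting_cpu_vps(32, workload)
        print(workload, vps, vpclass_line(vps))
```

These values are starting points; the onstat -g glo, onstat -g rea, and onstat -g ath checks described above determine whether to move the number up or down.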


    Affinity

    The database server supports automatic binding of CPU virtual processors to a processor in a multiprocessor environment. The default behavior of affinity in Informix is to affinitize starting with cpu0, then cpu1, cpu2, etc. On the POWER7 architecture, this is not necessarily the most beneficial behavior as this equates to logical cpu0, logical cpu1, logical cpu2, etc. With POWER7 architecture and SMT, it is more beneficial to use the first thread of each physical CPU before using the 2nd threads of each physical CPU.

    By disabling affinity in Informix and allowing the operating system to schedule the CPU virtual processors, you will get the behavior of using the first thread for each core. This behavior is the most advantageous due to the throughput that is obtained by the first thread of each core. As stated earlier in this paper, using threads 2-4 for a core will gain an additional 40% to 60% improvement in overall throughput.

    Recommendation

    On the POWER7 architecture, if SMT2 or SMT4 is used, disable affinity. Using affinity can degrade performance due to the usage of threads 2-4 before using all the first threads for each core in the LPAR.


    Understanding onstat -g glo

    The onstat -g glo command is used to display information about the virtual processors within Informix. One of the values that the command displays is a virtual processor efficiency value. This value is the ratio of the total CPU time to the total time that the threads ran on the virtual processor. This value shows how efficiently the CPU virtual processor is utilized.

    Example:

    Individual virtual processors:
    vp  pid       class  usercpu     syscpu     total       Thread      Eff
    19  18808928  cpu    1513914.43  106858.67  1620773.10  3952966.80  41%

    Threads were scheduled to run on this CPU VP for 3,952,966 seconds, but the CPU VP only ran on the CPU for 1,620,773 seconds. The efficiency rating of 41% is derived by dividing 1,620,773 by 3,952,966.
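The efficiency calculation can be reproduced directly from the total and Thread columns of the onstat -g glo output. The following Python sketch shows the division; the function name is illustrative, and returning 0 when the Thread value is zero is an assumption that matches the 0% shown for the adm VP in the earlier output.

```python
def vp_efficiency(total_cpu_seconds: float, thread_seconds: float) -> float:
    """Return the efficiency in percent: CPU time divided by scheduled thread time."""
    if thread_seconds <= 0:
        # Assumption: a VP with no scheduled thread time is reported as 0%.
        return 0.0
    return total_cpu_seconds / thread_seconds * 100.0


if __name__ == "__main__":
    # VP 19 from the example output.
    print(f"{vp_efficiency(1620773.10, 3952966.80):.0f}%")
    # VP 87 from the earlier output, one of the lightly used CPU VPs.
    print(f"{vp_efficiency(366.12, 4967.90):.0f}%")
```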

    To understand the efficiency rating on POWER7, it is necessary to understand the load on theserver and the number of physical and logical CPUs allocated to an LPAR. For example, anLPAR with 1 CPU allocated to it, with SMT4 with a load that would keep 4 CPU VPs maxed out,would show an efficiency rating of 25% for each of the 4 CPU VPs. Or all 4 CPU VPs added upwould approach 100%.

    This would not be a typical setup, and it is not recommended to have a 1-to-1 relation of CPU VPs with logical CPUs in an LPAR. The efficiency measurement is more meaningful on systems that do not use SMT. The DBA would use the onstat -g glo command along with mpstat and lparstat data to understand CPU utilization.

    Monitoring lparstat can give you a general idea of how busy or idle the LPAR is.

    System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513279MB

    %user %sys   %wait  %idle
    ----- ----- ------ ------
     61.7  10.3    0.1   28.0
     58.2  10.9    0.0   30.9
     56.3   9.5    0.1   34.1

    In this output, the system is roughly 30% idle. The output also contains other information about the LPAR. The LPAR is dedicated, using SMT4, and it has 128 logical CPUs.
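
    As a rough illustration, the sample output above can be summarized with a few lines of code. The helper names are hypothetical, and deriving the core count as lcpu divided by smt assumes that every core in the LPAR runs in the same SMT mode.

```python
def parse_lpar_config(line):
    """Turn 'System configuration: k=v k=v ...' into a dict of strings."""
    _, _, pairs = line.partition(":")
    return dict(item.split("=", 1) for item in pairs.split())


def average_idle(rows):
    """Average the %idle column (last field) over the sample rows."""
    idle = [float(r.split()[-1]) for r in rows]
    return sum(idle) / len(idle)


config = parse_lpar_config(
    "System configuration: type=Dedicated mode=Capped smt=4 lcpu=128 mem=513279MB")
samples = ["61.7 10.3 0.1 28.0", "58.2 10.9 0.0 30.9", "56.3 9.5 0.1 34.1"]

# Assumption: all cores use the same SMT mode, so cores = lcpu / smt.
cores = int(config["lcpu"]) // int(config["smt"])
print(config["type"], cores, round(average_idle(samples), 1))
```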

    The mpstat data is less straightforward to read and understand.

    cpu    min  maj  mpc    int     cs    ics  rq     mig  lpa    sysc  us  sy  wa  id     pc
      0   1510    0    0    607   1720    280   1    4518  100   12509  74  16   0   9   0.46
      1    273    0    0    391    656     60   0     556  100    4988  65  10   0  25   0.31
      2      5    0    0    309     62      4   0      68  100     432  12   6   0  82   0.12
      3      8    0    0    258     21      1   0      27  100     216   5   5   0  90   0.11
      4   1438    0    0    444   1459    262   1    3772  100   11519  72  18   0  10   0.38
      5    539    0    0    352    342     44   0     344  100    4875  83   5   0  11   0.42
      6      1    0    0    263     46      3   0      41  100     218  17   6   0  77   0.10
      7      0    0    0    265     49      2   0      32  100     340  14   8   0  78   0.10

    ALL  30648    0    0  49044  86261  12756  18  155654  100  641630  56   9   0  34  31.98

    The summary in the ALL line at the end of the mpstat data looks more like the lparstat data. Monitoring this output might show that thread #1 for each core is quite active. Logical CPU 0 is thread #1 for the first core in this LPAR. Logical CPU 4 is thread #1 for the second core in the LPAR.

    The output also shows that threads 2-4 are in use, but that they are not as heavily utilized as thread #1 for each core. As the number of CPU VPs becomes greater than the number of cores in use, threads 2-4 will begin to show more utilization.

    When threads 2-4 are being used, monitor throughput and response times to verify that the number of CPU VPs is properly configured with respect to the number of cores and logical CPUs for the LPAR.
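
    One way to read the mpstat sample is to group the logical CPUs by SMT thread position. The sketch below does this for the eight sample rows shown earlier, using only the cpu, us, sy, and pc columns. It assumes SMT4 and the logical CPU numbering described in this section; the function names are hypothetical.

```python
SMT = 4  # threads per core; matches smt=4 in the lparstat sample

# Columns: cpu, us, sy, pc (copied from the mpstat sample rows)
ROWS = [
    (0, 74, 16, 0.46), (1, 65, 10, 0.31), (2, 12, 6, 0.12), (3, 5, 5, 0.11),
    (4, 72, 18, 0.38), (5, 83, 5, 0.42), (6, 17, 6, 0.10), (7, 14, 8, 0.10),
]


def busy_by_thread(rows, smt=SMT):
    """Average us+sy (%) for each SMT thread position across cores."""
    totals = {t: [] for t in range(1, smt + 1)}
    for cpu, us, sy, _pc in rows:
        totals[cpu % smt + 1].append(us + sy)
    return {t: sum(v) / len(v) for t, v in totals.items()}


def pc_by_core(rows, smt=SMT):
    """Sum the pc column over the logical CPUs of each core."""
    cores = {}
    for cpu, _us, _sy, pc in rows:
        cores[cpu // smt] = cores.get(cpu // smt, 0.0) + pc
    return cores


print(busy_by_thread(ROWS))
print(pc_by_core(ROWS))
```

    For these rows, the average busy percentage falls from thread #1 through thread #4, and the pc values of the four logical CPUs of each core add up to about one physical processor.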

    Recommendation

    If the number of CPU VPs is greater than the number of cores in an LPAR, monitor closely for total throughput compared to single-user response time. If response time degrades to unacceptable levels, test and monitor decreasing the number of CPU VPs to ensure better response times. As is noted in other areas of this document, threads 2-4 for a core do not scale linearly.

    Starting LPARs

    The order in which LPARs are started can affect the physical resource allocation within the LPARs and can have performance implications within a system. The first LPAR that is started gets the most optimal resources, the second LPAR started gets the next best resources, and so on. The last LPAR that is started gets whatever is left.

    For example, assume that you have a 32-core system with four chips, each with eight cores. If five partitions are configured, each with six cores, each of the first four LPARs would be placed on its own chip. The fifth LPAR would be spread across three chips, because only two cores remain free on each chip.

    Start order matters most when the resources allocated to each LPAR do not exceed the cores on a single chip, because such LPARs have a better opportunity to obtain good affinity characteristics in their core and memory allocations.
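
    The example can be worked through with a simplified placement model. This paper does not describe the hypervisor's actual placement algorithm; the greedy rule below (take cores from the chip with the most free cores first) and the function name are assumptions made only to reproduce the arithmetic of the example.

```python
def place_lpars(chips, cores_per_chip, lpar_sizes):
    """Greedy placement: keep an LPAR on one chip if it fits, else spread it.

    Returns one dict per LPAR mapping chip index -> cores taken from that chip.
    """
    free = [cores_per_chip] * chips
    placements = []
    for need in lpar_sizes:
        taken = {}
        # Visit chips with the most free cores first (ties: lowest index).
        for chip in sorted(range(chips), key=lambda c: -free[c]):
            if need == 0:
                break
            grab = min(need, free[chip])
            if grab:
                taken[chip] = grab
                free[chip] -= grab
                need -= grab
        if need:
            raise ValueError("not enough free cores for this LPAR")
        placements.append(taken)
    return placements


# LPARs are listed in the order in which they are started.
layout = place_lpars(chips=4, cores_per_chip=8, lpar_sizes=[6] * 5)
for number, taken in enumerate(layout, start=1):
    print(f"LPAR {number}: {taken}")
```

    In this model, each of the first four LPARs receives six cores from a single chip, and the fifth LPAR receives two cores from each of three chips.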

    Recommendation

    The order in which LPARs are started should be considered in obtaining the best performance for high-priority workloads. Start the most important partitions first to obtain the best resources from a single chip.

    Appendix A: Recommendations summary

    The following table summarizes the recommendations made throughout this white paper. For complete recommendations, see the appropriate section in the white paper.

    SMT settings: Use SMT4 for increased overall throughput. Use SMT2 when single-thread response time is more important. See the SMT1 vs SMT2 vs SMT4 section for details.

    LPAR type: Use a dedicated LPAR where possible. See the Dedicated LPAR vs shared LPAR section for details.

    VIOS: Use a dedicated LPAR. See the Virtual I/O Server (VIOS) LPAR section for details.

    Additional LPAR improvements: When possible, use DPO and ASO/DSO to optimize workloads for Informix. See the Additional LPAR recommendations section for details.

    Memory page sizes: When using KAIO or direct I/O, do not set the RESIDENT configuration parameter to -1. Use 64 KB large pages for performance improvements. See the Memory considerations section for details.

    FDPR: Use FDPR in a non-production environment to test for performance improvements. See the Feedback Directed Program Restructuring (FDPR) section for more details.

    I/O subsystem: Use KAIO or direct I/O where possible. For disks, set the queue depth to a minimum of 16. For AIO servers, set the min and max aio servers to 100 and aio_server_inactivity to 86400. See the I/O subsystem section for details.

    Network subsystem: Test network throughput and tune as needed. Increase the tcp_recvspace and tcp_sendspace AIX parameters up to 256 KB. See the Network Subsystem section for details.

    Number of CPU VPs: For CPU-intensive workloads, set the CPU VPs to 1.5x the number of physical CPUs in the LPAR. For I/O-intensive workloads, set the CPU VPs to 3x the number of CPUs in the LPAR. See the Number of CPU virtual processors section for details.

    Affinity: Disable affinity when SMT2 or SMT4 is being used. See the Affinity section for details.

    Interpreting onstat -g glo: If the number of CPU VPs is greater than the number of cores in an LPAR, monitor the throughput closely, and adjust the number of CPU VPs as appropriate. See the Interpreting onstat -g glo section for details.

    Starting LPARs: Start the most critical LPARs first. See the Starting LPARs section for details.
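
    As a worked example of the CPU VP sizing guidance in the table, the sketch below computes a starting value from the number of physical CPUs. The function name and the workload labels are hypothetical, rounding up is an assumption, and the result is only a starting point to be validated by the monitoring described earlier.

```python
import math


def suggested_cpu_vps(physical_cpus, workload):
    """Starting point for the number of CPU VPs, per the summary table."""
    # 1.5 comes from the table; 3.0 for I/O-intensive workloads reads the
    # table's multiplier as a plain 3x (an assumption).
    multipliers = {"cpu": 1.5, "io": 3.0}
    if workload not in multipliers:
        raise ValueError("workload must be 'cpu' or 'io'")
    return math.ceil(physical_cpus * multipliers[workload])


print(suggested_cpu_vps(32, "cpu"), suggested_cpu_vps(32, "io"))
```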

    Appendix B: Useful commands

    amepat

    Active Memory™ Expansion Planning and Advisory Tool. The amepat command reports Active Memory Expansion information and statistics and provides an advisory report that assists in planning the use of Active Memory Expansion for an existing workload. This document used this tool to show statistics for an LPAR.

    bosboot

    Creates a boot image. This document used this tool to rebuild the boot image so that setting changes take effect when the LPAR is rebooted.

    chdev

    Changes the characteristics of a device. This document used this tool to modify the queue depth for a hard disk, as well as to change TCP settings for a network interface.

    ifconfig

    Configures or displays network interface parameters for a network using TCP/IP. This document used this tool to make modifications to TCP parameters.

    ioo

    Manages Input/Output tunable parameters. This document used this tool to view the AIO server settings as well as to modify some of the parameters.

    iostat

    Reports Central Processing Unit (CPU) statistics, asynchronous input/output (AIO) and input/output statistics for the entire system, adapters, TTY devices, disks, CD-ROMs, tapes, and file systems. This document used this tool to monitor I/O statistics.

    lparstat

    Reports logical partition (LPAR) related information and statistics. This document used this tool to gather and show statistics for an LPAR.

    lsattr

    Displays attribute characteristics and possible values of attributes for devices in the system. This document used this tool to monitor the queue depth for a hard disk.

    schedo

    Manages processor scheduler tunable parameters. This document used this tool to check the processor folding settings and modify the settings as needed.

    smtctl

    Controls the enabling and disabling of processor simultaneous multithreading mode. This document used this tool to enable or disable SMT, and to set the mode accordingly.

    vmo

    Manages Virtual Memory Manager tunable parameters. This document used this tool to set up and pre-allocate memory for 16 MB large pages.

    vmstat

    Reports virtual memory statistics. This document used vmstat to monitor the creation of 16 MB large pages.

    Appendix C: Additional reading

    The following links provide useful background information for PowerVM and POWER7 systems. Third-party material is also listed; its inclusion is not an endorsement by IBM, but is meant to offer the reader a variety of information and viewpoints.

    developerWorks: AIX Virtual Processor Folding is misunderstood
    https://www.ibm.com/developerworks/community/blogs/aixpert/entry/aix_virtual_processor_folding_in_misunderstood110?lang=en

    IBM Systems: Understanding Micro-Partitioning
    http://www.ibmsystemsmag.com/aix/tipstechniques/systemsmanagement/Understanding-Micro-Partitioning/?page=1

    IBM Systems: Getting a handle on Entitled Capacity & Virtual Processors
    http://www.ibmsystemsmag.com/aix/administrator/systemsmanagement/entitled_capacity/

    YouTube: Power7 Performance Entitlement, VPs, Affinity, Memory
    http://www.youtube.com/watch?v=1W1M114ppHQ

    Feedback Directed Program Restructuring (FDPR)
    https://www.research.ibm.com/haifa/projects/systems/cot/fdpr/

    developerWorks: VIOS Advisor
    http://www.ibm.com/developerworks/wikis/display/WikiPtype/VIOS+Advisor

    IBM Redbooks Publication: IBM PowerVM Virtualization Managing and Monitoring
    http://www.redbooks.ibm.com/redpieces/abstracts/sg247590.html

    IBM Redbooks Publication: AIX 5L Performance Tools Handbook
    http://www.redbooks.ibm.com/abstracts/sg246039.html

    Appendix D: References

    IBM Informix database server 12.10.xC1: A Technical White Paper
    Release Notes for IBM Informix 12.10.xC2
    IBM Informix 12.10 .NET Provider Reference Guide
    IBM Informix 12.10 Administrator's Reference
    IBM Informix 12.10 Backup and Restore Guide
    IBM Informix 12.10 Database Extensions User's Guide
    IBM Informix 12.10 Enterprise Replication
    IBM Informix 12.10 GLS User's Guide
    IBM Informix 12.10 Guide to SQL: Reference
    IBM Informix 12.10 Guide to SQL: Syntax
    IBM Informix 12.10 Migrating and upgrading
    IBM Informix 12.10 Performance Guide
    IBM Informix 12.10 Security
    IBM Informix 12.10 TimeSeries Data User's Guide
    IBM Informix 12.10 Warehouse Accelerator Administration Guide

    For more information

    To learn more about the Informix features, contact your IBM representative or IBM Business Partner, or visit ibm.com/software/data/informix

    IBM Informix on POWER7 Best Practices
    A Technical White Paper

    December 2013
    Darin Tracy
    Monish Gupta
    Vladimir Kolobrodov

    Copyright 2013 IBM Corporation

    IBM Corporation
    Software Group
    Route 100
    Somers, NY 10589
    U.S.A.

    The information contained in this publication is provided for informational purposes only. While efforts were made to verify the completeness and accuracy of the information contained in this publication, it is provided AS IS without warranty of any kind, express or implied. In addition, this information is based on IBM's current product plans and strategy, which are subject to change by IBM without notice.

    IBM shall not be responsible for any damages arising out of the use of, or otherwise related to, this publication or any other materials. Nothing contained in this publication is intended to, nor shall have the effect of, creating any warranties or representations from IBM or its suppliers or licensors, or altering the terms and conditions of the applicable license agreement governing the use of IBM software.

    References in this publication to IBM products, programs, or services do not imply that they will be available in all countries in which IBM operates. Product release dates and/or capabilities referenced in this presentation may change at any time at IBM's sole discretion based on market opportunities or other factors, and are not intended to be a commitment to future product or feature availability in any way. Nothing contained in these materials is intended to, nor shall have the effect of, stating or implying that any activities undertaken by you will result in any specific sales, revenue growth, savings or other results.

    IBM, the IBM logo, ibm.com, and Informix are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml. Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.