
Technical white paper

Running Linux on HP Integrity Superdome X Delivering HP Project Odyssey with a scalable x86 server

Table of contents

HP Integrity Superdome X with HP BL920s Gen8 Server Blades
  Benchmark results
Hardware overview
  Blade building block
  16-socket configuration
  Hardware differentiators
  Maximum configurations
Software overview
  Operating systems
  HP BL920s support bundle
  Software Delivery Repository
  Virtualization solutions
  Linux Serviceguard feature
Recommended hardware configuration
  Server
Recommended software configuration and tuning
  Operating system
  SAP HANA
Software advisory notes
  Finding Customer Advisories
  General known issues
  SLES 11 SP3 known issues
  RHEL 6.5/6.6 and RHEL 7.0 known issues
Summary
Resources


HP Integrity Superdome X delivers the promise of Project Odyssey

HP Project Odyssey revolutionizes mission-critical computing with industry-leading availability, exceptional performance, uncompromising reliability, and investment protection centered on HP Converged Infrastructure. The Integrity Superdome X is a scalable platform of HP BL920s Gen8 Server Blades based on the x86 architecture that delivers on that promise. This paper explains the best ways to run Linux® on the BL920s Gen8 server blades.

HP Integrity Superdome X with HP BL920s Gen8 Server Blades

As a general-purpose server, the BL920s Gen8 server blade supports the following Linux distributions:

• Red Hat® Enterprise Linux (RHEL) 6.5 with the required maintenance update kernel (June 24, 2014 or later)

• RHEL 6.6 and RHEL 7.0 releases

• SUSE Linux Enterprise Server (SLES) 11 SP3 with the required BL920s Gen8 server blade BigSMP Bootable Driver Kit

These distributions take maximum advantage of the BL920s Gen8 server blade as part of the HP Integrity Superdome X, because HP has worked with Red Hat and SUSE to improve the reliability and scalability aspects of the Linux kernel. We have also tested complete solution stacks on top of those distributions and have characterized the performance of a variety of workloads.

The BL920s Gen8 server blade is also at the core of the HP ConvergedSystem 900 for SAP® HANA appliance. This appliance is a complete solution for running SAP HANA, the in-memory data platform for performing advanced, real-time analytics while simultaneously handling real-time transaction workloads. The scalable HP hardware exploits the full power of the HANA software—it is the largest SAP-certified solution for SAP HANA platform-based workloads, helping clients reduce costs and improve response time to business demands.

Benchmark results

The significant power of the BL920s Gen8 server blade platform is proven through remarkable benchmark results. In June 2014 the HP ConvergedSystem 900 for SAP HANA captured the top two spots for SPECjbb2013-MultiJVM (Java server benchmark)—the Standard Performance Evaluation Corporation (SPEC) benchmark for evaluating the performance of server-side Java. The 16-socket, 12 TB configuration achieved a max-jOPS number more than twice as large as that of any other published system. In November 2014 these results were refreshed, and the 16-socket system became the first x86 server to break 1 million max-jOPS. In December 2014 the HP Integrity Superdome X SPECjbb2013-MultiJVM results were published, capturing the #1 max-jOPS spots for x86 4-socket, 8-socket, and 16-socket servers. HP Integrity Superdome X also captured the #1 8-socket x86 server position and the #1 overall 16-socket leadership for the SPECcpu2006 benchmarks SPECfp_rate_base2006, SPECfp_rate2006, SPECint_rate_base2006, and SPECint_rate2006. The following table gives pointers to the leadership numbers published on the spec.org website.


Table 1. Pointers to leadership SPECjbb2013-MultiJVM and SPECcpu2006 benchmark results

Configuration                          Benchmark              Date      Link
HP ConvergedSystem 900 for SAP HANA    SPECjbb2013-MultiJVM   Jun 2014  hp.com/V2/GetDocument.aspx?docname=4AA5-3288ENW&cc=us&lc=en
HP ConvergedSystem 900 for SAP HANA    SPECjbb2013-MultiJVM   Nov 2014  hp.com/V2/GetDocument.aspx?docname=4AA5-5983ENW&cc=us&lc=en
HP Integrity Superdome X               SPECjbb2013-MultiJVM   Dec 2014  hp.com/V2/GetDocument.aspx?docname=4AA5-6149ENW&cc=us&lc=en
HP Integrity Superdome X               SPECcpu2006            Dec 2014  hp.com/V2/GetDocument.aspx?docname=4AA5-6142ENW&cc=us&lc=en

The BL920s Gen8 server blade powers the HP ConvergedSystem 900 for SAP HANA referenced in the above benchmark results.

SPEC and the benchmark names SPECjbb and SPECcpu are registered trademarks of the Standard Performance Evaluation Corporation. See spec.org/jbb2013/ and spec.org/cpu2006. The stated HP Results are published on spec.org as of December 1, 2014.

Hardware overview

The basic building block of the HP Integrity Superdome X is the two-socket BL920s Gen8 server blade, which installs in an HP BladeSystem Superdome enclosure. That enclosure can accommodate eight blades. Those eight blades can optionally be partitioned into two servers of four blades each. The two servers are logically and electrically isolated from each other, so they function independently. That partitioning flexibility allows the server to be right-sized for any workload.

Note See the QuickSpecs for HP Integrity Superdome X for more detailed information about hardware configurations and specifications.


Blade building block

The conceptual block diagram of the computing resources in the BL920s Gen8 server blade is shown in Figure 1. This simplified diagram shows the blade I/O, the processor sockets, and the memory.

Figure 1. Conceptual block diagram of the BL920s Gen8 server blade

Each blade supports two different types of removable I/O cards: mezzanine cards and LAN on motherboard (LOM) modules. The three mezzanine slots accept PCI host bus adapters, and two LOM modules can be installed on each blade.

Processors

BL920s Gen8 server blades support Intel® Xeon® Processor E7 v2 Family devices as specified in Table 2.

Table 2. BL920s Gen8 server blade supported CPU matrix (Intel® Xeon® Processor E7 v2 Family)

Processor                          Cores per processor  Frequency                 Cache    Power
Intel Xeon Processor E7-2890 v2    15                   2.8 GHz (3.4 GHz Turbo)   37.5 MB  155 W
Intel Xeon Processor E7-2880 v2    15                   2.5 GHz (3.1 GHz Turbo)   37.5 MB  130 W
Intel Xeon Processor E7-4830 v2    10                   2.2 GHz (2.7 GHz Turbo)   20 MB    105 W
Intel Xeon Processor E7-8891 v2    10                   3.2 GHz (3.7 GHz Turbo)   37.5 MB  155 W


Memory

There are 24 DIMM slots associated with each socket, for a total of 48 slots per blade. The 48 slots are arranged as 16 channels of three DIMM positions each. DIMMs must be installed in groups of 16, so a blade can be populated with 16, 32, or 48 DIMMs. The DIMM slots accept DDR3 memory DIMMs with a capacity of 16 GB or 32 GB; thus, the maximum memory configuration is 1.5 TB per blade using 32 GB DIMMs.

I/O adapters

The I/O adapters in Table 3 have been qualified to HP's rigorous quality standards. More adapters will be available in the future.

Table 3. I/O adapters supported in the BL920s Gen8 server blade

Type                               Product                                   Part number
Mezzanine Fibre Channel adapter    HP QMH2672 16Gb FC HBA                    710608-B21
Mezzanine Ethernet adapter         HP Ethernet 10Gb 2-port 560M Adapter      665246-B21
Mezzanine Ethernet adapter         HP FlexFabric 10Gb 2P 534M Adapter        700748-B21
LOM Ethernet adapter               HP Ethernet 10Gb 2-port 560FLB Adapter    655639-B21
FlexLOM Ethernet adapter           HP FlexFabric 10Gb 2P 534FLB Adapter      700741-B21

Note The HP FlexFabric 10Gb 2P 534M and 534FLB Adapters (mezzanine and FlexLOM) support networking only.

A 16-socket system can be configured with between one and eight Fibre Channel (FC) cards; the adapters must be installed in mezzanine slot 2. There can be between eight and 16 network LOM adapters—one is required in LOM slot 1 and another is optional in LOM slot 2. It is also possible to add up to eight additional network mezzanine cards, installed in mezzanine slot 1, if desired.

An eight-socket system (four-blade) has similar configuration constraints.

Table 4. Interconnect modules supported in the BL920s Gen8 server blade

Type                          Product                                                          Part number
Fibre Channel interconnect    Brocade 16Gb/16 SAN Switch for BladeSystem c-Class               C8S45A
Fibre Channel interconnect    Brocade 16Gb/28 SAN Switch Power Pack+ for BladeSystem c-Class   C8S47A
Ethernet interconnect         HP 10GbE Ethernet Pass-Thru Module for c-Class BladeSystem       538113-B21
Ethernet interconnect         HP 6125XLG Ethernet Blade Switch                                 711307-B21

One or two Fibre Channel interconnects may be installed in bays 5 and 6 in the back of the enclosure. One of the Ethernet interconnects must be installed in bay 1; a second Ethernet interconnect of the same type may be installed in bay 2, if desired. If mezzanine adapters are used, then the corresponding interconnect modules of the same type can be installed in bays 3 and 4.

See the configuration options table in the BL920s Gen8 server blade system QuickSpecs document for more information: hp.com/h20195/v2/GetDocument.aspx?docname=c04383189.


16-socket configuration

Eight blades installed into an HP Integrity Superdome X enclosure result in the server structure shown in Figure 2.

Figure 2. Conceptual block diagram of eight BL920s Gen8 server blades

The 16-socket configuration can be partitioned into a pair of eight-socket servers. For best performance, configure blades 1/1, 1/3, 1/5, and 1/7 as one server, and blades 1/2, 1/4, 1/6, and 1/8 as another independent server.

You can see that the BL920s Gen8 server blade has a modular structure. The most basic building block is the processor and its associated socket local memory. Each processor has an embedded memory controller, through which it has extremely fast access to its local memory. The processors are combined in pairs, connected by Intel® QuickPath Interconnect (QPI) links. Communication among blades is facilitated through the HP Interconnect Fabric. The fabric is built upon the HP 3000 chipset, a new version of the chipset that formed the backbone of Superdome 2 servers.

This structure gives the BL920s Gen8 server blade a non-uniform memory access (NUMA) architecture—the latency for any given processor to access memory depends on the relative positioning of the processor socket and the memory DIMM. The time it takes for a memory transaction to traverse the interconnect fabric is somewhat longer than the fast access to socket-local memory. This fact is important when tuning for optimal memory placement, as described in the NUMA considerations section.


Hardware differentiators

The BL920s Gen8 server blade platform offers a variety of features that make it superior to other server platforms in terms of reliability, availability, and serviceability (RAS), as well as scalability.

Flexible scalability

The BL920s Gen8 server blade can be ordered from the factory as a system with eight sockets (four blades) or 16 sockets (eight blades).

Unified Extensible Firmware Interface

The BL920s Gen8 server blade is a pure Unified Extensible Firmware Interface (UEFI) implementation. As such, it overcomes the limitations of the older BIOS approach—it is able to boot from extremely large disks, and that boot process is faster due to the larger block transfers.

HP Onboard Administrator

Onboard Administrator (OA) is a standard component of HP systems that provides health monitoring and remote manageability for the complex. OA includes an intelligent microprocessor, secure memory, and a dedicated network interface, and it interfaces with the HP Integrated Lights-Out (iLO) subsystem on each server blade. This design makes OA independent of the host blade and operating system. OA provides remote access to any authorized network client, sends alerts, and provides other blade management functions.

Using Onboard Administrator, you can:

• Remotely power up, power down, or reboot partitions

• Send alerts from the OA regardless of the state of the complex

• Access advanced troubleshooting features through the OA interface

• Diagnose the complex through a Web browser

For more information about OA features, see the BL920s Gen8 server blade Onboard Administrator user guide.

Firmware-First PCIe error containment and recovery

BL920s Gen8 server blade platforms facilitate the containment of errors in the I/O subsystem through the Firmware-First PCIe error handling, PCI Advanced Error Reporting (AER), and Live Error Recovery (LER) mechanisms. I/O errors that might cause data corruption on other systems are safely isolated by BL920s Gen8 server blade hardware and firmware. In cases of transient I/O errors, the error containment feature allows Linux to continue operating. HP is continually working with I/O adapter vendors to increase the number of error cases that can be successfully recovered.

LER is a hardware feature of Intel high-end x86 processors to enable error containment and recovery of PCIe errors that would otherwise cause system crashes and potential data corruption. Recovery from PCIe uncorrectable errors is important on BL920s Gen8 server blade because it has many more PCIe devices than most Linux servers. The potential for hardware errors increases proportionally to the number of PCIe devices. Online recovery from such errors is increasingly critical to HP’s enterprise customers, especially those who run large I/O deployments.

The Linux kernel running on BL920s Gen8 server blade uses the AER subsystem to manage error reporting and I/O device recovery. The BL920s Gen8 server blade firmware enhances AER with better error containment and higher chance of recovery by leveraging the LER processor feature and robust I/O driver recovery support.

Containment

The biggest concern when an I/O error occurs is preventing data corruption. Data corruption can occur whether the device recovers or not. If a device fails in the middle of a large file transfer and an error occurs, the device could potentially write corrupt values to the file before the driver is notified of the problem. Using the Firmware-First mechanism and Intel's LER processor feature, errors are contained immediately before corruption can occur, because all reads and writes are halted until the driver is notified and can take the appropriate action.

Recovery

Our goal is for the system to handle PCIe I/O errors with no intervention from the user and to allow workloads to continue without interruption. Suppose a large file is being transferred over the network device via FTP and an error occurs on the network device. The file transfer will pause momentarily as the device recovers from the error and then resume. The completed file transfer will be intact and free of corruption.

Recovery of a specific I/O device will be largely dependent on the capability of its driver. Note that all supported drivers on the BL920s Gen8 server blade have some level of PCIe I/O error recovery enabled. A large percentage of PCIe I/O errors will be recovered and normal operation will resume. There is a subset of errors and configurations where recovery will not occur; in these cases, the system may panic or the I/O device will become de-configured. Because PCIe I/O errors are generally unlikely, and situations where they are not recoverable are even less likely, the odds of the server going down or losing functionality are relatively low. HP is working closely with I/O vendors and the Linux community to continue PCIe I/O error recovery enablement, and as drivers mature, more PCIe I/O errors will be recoverable.

Error logging

Even though PCIe I/O errors rarely happen, it is important for the user to be notified when they occur. If a specific device continually experiences errors, it might be an indication that the device is faulty and needs to be replaced.

There are a number of logging mechanisms that can be used to identify I/O errors.

Core Analysis Engine (CAE)—CAE runs on the Onboard Administrator, which moves platform diagnostic capabilities to the firmware level, so it can drive self-healing actions and report failures even when the OS is unable to boot. It also provides possible reasons for failures as well as possible actions for repair and detailed replacement part numbers. This speeds recovery to reduce downtime.

Linux AER trace events—HP has worked with the Linux open source community to provide an enhanced mechanism to report errors using kernel trace events. When a PCI error occurs, an event is triggered to notify any user space applications programmed to intercept the event. Trace events also provide a user-readable log with details about the error.

Syslog—The user can examine Linux system logs to determine that an error occurred and diagnose the problem. The syslog has output from the kernel and drivers about the error.

OA live logs—If the system goes down and the OS is not available, the user can use the OA system event log (SEL) and forward progress log (FPL) to see if an error occurred. These errors are also sent to remote management systems like HP Insight Remote Support (IRS). See hp.com/us/en/business-services/it-services.html?compURI=1078312.

Double Device Data Correction

Double Device Data Correction + 1 "bit" (DDDC + 1), also known as double-chip sparing, is a robust and efficient method of memory sparing.

It can detect and correct single- and double-DRAM device errors for every x4 DIMM in the server. By reserving one DRAM device in each rank as a spare, it enables data availability after hardware failures with any x4 DRAM devices on any DIMM. In the unlikely occurrence of a second DRAM failure within a DIMM, the platform firmware alerts you to replace that DIMM gracefully without a system crash. The additional 1-bit protection further helps correct single-bit errors (e.g., due to cosmic rays) while a DIMM has entered dual-device correction mode and is tagged for replacement. DDDC+1 provides the highest level of memory RAS protection with no performance impact and no reserved memory in Intel Xeon E7-based servers; the full system memory is available for use.

Maximum configurations

Table 5. Maximum configurations for BL920s Gen8 server blades

                                           16-socket  12-socket  8-socket  4-socket  2-socket
Processor cores (15 cores per socket)      240        180        120       60        30
Logical processors (Hyper-Threading on)    480        360        240       120       60
Memory capacity (32 GB DIMMs)              12 TB      9 TB       6 TB      3 TB      1.5 TB
Mezzanine slots (current supported limit)  24 (16)    18 (9)     12 (8)    6 (4)     3 (2)
LOM                                        16         12         8         4         2


Software overview

Operating systems

The BL920s Gen8 server blade supports these Linux distributions:

• RHEL 6.5 with maintenance update kernel (June 24, 2014 or later). It is required to use RHEL 6.5.z kernel-2.6.32-431.20.3.el6 (or any later version). This update is available from the Red Hat Network. Please see RHSA-2014-0771 for details.

• RHEL 6.6 and RHEL 7.0

• SUSE Linux Enterprise Server (SLES) 11 SP3 with ProLiant Gen8 BigSMP Bootable Driver Kit. It is required to use kernel-bigsmp-3.0.101-0.30.1.x86_64 (or any later version). Information about this Bootable Driver Kit is available at drivers.suse.com/hp/HP-ProLiant-Gen8-BigSMP/1.0/sle-11-sp3-x86_64/install-readme.html.

HP BL920s support bundle

The BL920s Gen8 server blade support bundle contains an ISO image that includes the system I/O firmware for BL920s Gen8 server blades running SLES 11 SP3, along with the HP Smart Update Manager (SUM) 6.4.1 application that can be used to apply those firmware updates. This support bundle is used only in offline mode for updating I/O card firmware. It does not contain the System Management Homepage (SMH) or the Web-Based Enterprise Management (WBEM) providers; install the current versions of those components from the HP Software Delivery Repository (SDR).

Software Delivery Repository

The SDR provides yum and apt repositories for Linux-related software packages. See the white paper "Using HP Service Pack for ProLiant (SPP) and Software Delivery Repository (SDR)" for more information.

RHEL

Download the WBEM providers using the directory listed for each RHEL distribution:

• RHEL 6.5 WBEM Providers: downloads.linux.hp.com/SDR/repo/bl920-wbem/rhel/6.5/x86_64/current/

• RHEL 6.6 WBEM Providers: downloads.linux.hp.com/SDR/repo/bl920-wbem/rhel/6.6/x86_64/current/

• RHEL 7.0 WBEM Providers: downloads.linux.hp.com/SDR/repo/bl920-wbem/rhel/7/x86_64/current/

Follow directions in the install guide for RHEL: downloads.linux.hp.com/SDR/repo/bl920-wbem/rhel/install_guide_rhel.txt

SLES

Download the necessary packages of WBEM providers from downloads.linux.hp.com/repo/bl920-wbem/suse/11/x86_64/current.

Then install these packages—follow the guidance under downloads.linux.hp.com/repo/bl920-wbem/suse/install_guide_sles.txt.

Virtualization solutions

RHEL 6.5 KVM

The BL920s Gen8 server blade has passed the required Kernel-based Virtual Machine (KVM) tests for RHEL 6.5 certification. See the virtualization section in the RHEL 6.5 release notes for more information: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html-single/6.5_Release_Notes.

RHEL 7.0 KVM

The BL920s Gen8 server blade has passed the required KVM tests for RHEL 7.0 certification. Please see the Virtualization section in the RHEL 7.0 Release Notes for more information: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/7.0_Release_Notes/chap-Red_Hat_Enterprise_Linux-7.0_Release_Notes-Virtualization.html.

SLES 11 SP3 KVM

The BL920s Gen8 server blade has passed the required KVM tests for SLES 11 SP3 certification. Please see the Virtualization section in the SLES 11 SP3 Release Notes for more information: suse.com/documentation/sles11/singlehtml/book_kvm/book_kvm.html.


SLES 11 SP3 Xen

SLES 11 SP3 Xen is not currently supported on the BL920s Gen8 server blade.

SR-IOV

SR-IOV technology reduces the overhead of I/O processing and I/O latencies by allowing each VM to access a portion of the physical I/O device directly. This reduces the need for software emulation of I/O devices in memory. The SR-IOV specification defines the architecture for a new generation of I/O devices that have multiple Virtual Functions (VFs), each of which shares common properties of a PCI physical function (PF). This approach allows each VM to see its own separate virtual device while multiple VMs share the actual physical device.

See the HP SR-IOV Technology Overview for more details: hp.com/h20195/v2/GetDocument.aspx?docname=4AA4-7189ENW.
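The exact steps for enabling VFs depend on the adapter driver and the distribution. As a hedged sketch only (the ixgbe driver name and the max_vfs value below are assumptions; consult your adapter's driver documentation), VFs for a 10Gb Ethernet adapter might be enabled on the host like this:

# Reload the NIC driver with SR-IOV Virtual Functions enabled
# (max_vfs is a driver module parameter; the exact name and supported
# VF count depend on the driver and adapter firmware)
modprobe -r ixgbe
modprobe ixgbe max_vfs=4

# Verify that the Virtual Functions appear as PCI functions
lspci | grep -i "virtual function"

The resulting VFs can then be assigned to KVM guests as PCI devices so that each guest performs I/O without software device emulation.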

Linux Serviceguard feature

Serviceguard Solutions for Linux, using an improved GUI, monitors the availability and accessibility of critical IT services such as applications and databases (DBs)—and everything they rely on. These services are meticulously monitored for faults in hardware, software, the OS, virtualization, storage, or the network. When a failure or threshold violation is detected, HP Serviceguard Solutions for Linux automatically and transparently fails over and resumes normal operations in mere seconds, without compromising data integrity or performance.

By utilizing multiple copies of data and multiple data centers separated by any distance, you can maintain access to critical data and applications without impacting data integrity and performance even if a data center fails. The toolkits and extensions for Linux simplify and quickly integrate complex applications into a standardized and proven framework.

The Serviceguard extension for the SAP HANA database product in particular simplifies the administration of large and complex HANA database systems, increases the availability and operational efficiency of databases, and provides automatic disaster recovery (DR) capability, with no manual intervention required to switch operations from the primary to the secondary site.

Advanced features such as Live Application Detach (LAD) and Rolling Upgrades allow you to perform maintenance on a cluster or install upgrades for applications and the OS with zero planned downtime.

To learn more about Serviceguard Solutions for Linux, see hp.com/go/sglx.

Recommended hardware configuration

Server

Hyper-Threading

The Xeon Processor E7-2890 v2 implements the Intel Hyper-Threading Technology feature. When enabled, Hyper-Threading allows a single physical processor core to run two independent threads of program execution. The operating system can use both logical processors, which often results in an increase in system performance. However, the two logical processors incur some overhead while sharing common physical resources, so the pair does not have the same processing power, nor consume the same electrical power, as two physical cores. Beyond that, the increased parallelism can cause increased contention for shared software resources, so some workloads can experience a degradation when Hyper-Threading is enabled.

On the HP BL920s Gen8, the Hyper-Threading state is an attribute of the partition, and therefore is controlled through interaction with the Onboard Administrator. By default, Hyper-Threading is enabled. The current state can be ascertained through the Onboard Administrator parstatus command:

parstatus -p <partition identifier> -V
Hyper Threading at activation: Enabled
Hyper Threading at next boot: Enabled

Note Both the current state and the state after the next boot are shown, because any change to the Hyper-Threading state will not take effect until the partition is rebooted.

The Onboard Administrator parmodify command can be used to change the Hyper-Threading state:

parmodify -p <partition identifier> -T [y|n]


As a general rule of thumb, it may be desirable to change Hyper-Threading from the disabled to the enabled state if the workload that is running is fully utilizing all of the physical cores, as evidenced by a high value for CPU utilization. Another case where Hyper-Threading would be beneficial is if the physical cores are spending a relatively large amount of time waiting for cache misses to be satisfied.

If Hyper-Threading is enabled, it may help to disable it if the applications that are running tend to consume the entire capacity of the processor cache. The two logical processors share a common physical cache, so they may interfere with each other if the application cache footprint is large.

Performance analysis tools available through HP support teams can be used to gain insight into the benefit or detriment of using Hyper-Threading. You may just want to experiment running your workload in both states to see which one works better for your particular application.
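From within the running Linux partition, the effect of the Hyper-Threading setting can also be confirmed with standard tools; a minimal check, assuming the util-linux lscpu utility is installed:

# Threads per core is 2 when Hyper-Threading is enabled, 1 when it is disabled
lscpu | grep -i "thread(s) per core"

# Total number of logical processors visible to the kernel
grep -c ^processor /proc/cpuinfo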

Memory

For performance reasons, it is highly recommended to have the same amount of memory on each blade in a partition.

Power management

The BL920s Gen8 server blade implements a variety of mechanisms to govern the trade-off between power savings and peak performance. A more thorough discussion of these power-versus-performance mechanisms appears in these references:

• Red Hat Enterprise Linux Power Management Guide: RHEL 6.5: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/index.html

• RHEL 7.0: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Power_Management_Guide/index.html

• SUSE Linux Enterprise Server 11 SP3 System Analysis and Tuning Guide—Chapter 11: Power management: suse.com/documentation/sles11/book_sle_tuning/data/cha_tuning_power.html

I/O

It is recommended that all disks connected to the BL920s Gen8 server blade over the two-port QLogic 16Gb FC adapters be set up multipathed for high reliability and hardware availability.

Recommended software configuration and tuning

Operating system

Linux will run perfectly well out of the box on most small servers, but some minor administrative adjustments are recommended to get the highest value from the considerable hardware resources in the HP BL920s Gen8.

Installing on a QLogic Fibre Channel device

To enable boot from a QLogic Fibre Channel device, the boot controller needs to be enabled from the HP Device Manager. HP recommends enabling only the primary and secondary boot controllers to avoid delays during system initialization.

1. When boot reaches the EFI Boot Manager menu, press U for System Utilities.

2. Select Device Manager.

3. For each HP QMH2672 16Gb 2P FC HBA - FC:

A. Select the device.

B. Select Boot Settings.

C. Toggle Adapter Driver to <Enabled>.

D. Toggle Selective Login to <Disabled>.

E. Toggle Selective Lun Login to <Disabled>.

F. Ctrl-W to save.

G. Ctrl-X, Ctrl-X to exit the Device Manager.

Always enable multipath during the install

HP recommends enabling the Linux device mapper (DM) multipath during the Linux operating system installation. The DM multipath provides redundancy, reliability, and improved I/O performance.
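After installation, you can confirm that multipath is active and that each LUN has more than one path; a quick check, assuming the device-mapper-multipath (RHEL) or multipath-tools (SLES) package is installed:

# Confirm the multipath daemon is running
service multipathd status

# List multipath devices and the state of each path
multipath -ll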

Don't use IPv6

Configuration of IPv6 networks on the BL920s Gen8 server blade is not supported, and it is recommended that IPv6 be disabled during installation.


Trim EFI boot entries

Customers are advised to remove stale or unwanted boot options to save boot time. Use efibootmgr to remove the preboot execution environment (PXE) boot options once the OS is installed, and remove the boot options for an old OS after installing a new one.
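As a hedged sketch (the entry number Boot0003 below is hypothetical; check the listing first), stale entries can be pruned like this:

# List the current boot entries and boot order
efibootmgr -v

# Delete a stale entry; -b selects the entry number, -B deletes it
efibootmgr -b 0003 -B

# Optionally set the preferred boot order
efibootmgr -o 0000,0001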

Configure serial console

To enable the serial console, add the following to the kernel command line. (Enabling the serial console results in a small boot-time performance penalty.)

console=ttyS0,115200n8

Enable timestamps for debugging

For RHEL 6.5, add the kernel parameter printk.time=y to the kernel command line in order to add kernel timestamps, which are useful in debugging. (This is unnecessary on SLES, as it is already enabled by default.)

Adjust EFI memory map

For RHEL 6.5, add add_efi_memmap to the kernel command line.
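The parameters from the three items above all belong on the same kernel command line. A sketch of how a RHEL 6.5 kernel line in /boot/efi/efi/redhat/grub.conf might look with them applied (the kernel version matches the required maintenance kernel; the root device shown is hypothetical):

# Example kernel line in /boot/efi/efi/redhat/grub.conf (single line)
kernel /vmlinuz-2.6.32-431.20.3.el6.x86_64 ro root=/dev/mapper/mpatha2 console=ttyS0,115200n8 printk.time=y add_efi_memmap crashkernel=512M

On SLES 11 SP3 the equivalent parameters go on the append line in elilo.conf; the kdump section later in this paper recommends making such changes through YaST rather than editing elilo.conf directly.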

Maintaining the real-time clock

The real-time clock (RTC) in a partition is synced to OA time when the partition is rebooted. If you change the time on the OA, it may affect the RTC on the partition.

It is recommended to configure network time protocol (NTP) on both the OA (as described in the OA user's guide) and the Linux partition. By configuring NTP on both, the time should be correct on initial boot of the npar, because the firmware sets it correctly, and it should stay correct as Linux runs by virtue of ongoing synchronization with the configured NTP servers.

For information on configuring NTP for RHEL 6.5: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sect-Date_and_Time_Configuration-Command_Line_Configuration-Network_Time_Protocol.html

For information on configuring NTP for SLES 11: suse.com/documentation/sles11/singlehtml/book_sle_admin
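A minimal sketch of enabling ntpd on the Linux partition (the server names are placeholders; point them at your site's time sources, ideally the same ones configured on the OA):

# /etc/ntp.conf (placeholder servers)
server ntp1.example.com iburst
server ntp2.example.com iburst

# Start the daemon and enable it at boot
service ntpd start && chkconfig ntpd on        # RHEL 6 (on SLES 11: service ntp start)
systemctl start ntpd && systemctl enable ntpd  # RHEL 7, if the ntp package is installed (chronyd is the RHEL 7 default)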

Kernel tuning

Specific recommendations for tuning SLES 11 can be found in the following reference: suse.com/documentation/sles11/book_sle_tuning/data/book_sle_tuning.html

The RHEL 6.5 Performance Tuning Guide is found at the following reference: access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/index.html

NUMA considerations

The Linux operating system includes algorithms to place memory close to the processor that accesses it. If the memory reference patterns of a particular application are well characterized, it may be possible to use specific tuning to achieve memory placement that is superior to the operating system's default behavior.

Given the NUMA characteristics of the BL920s Gen8 server blade platform, the latency of accessing memory that is local to a given processor's NUMA node is the lowest, followed by accessing memory on a buddy NUMA node on the same blade, followed by accessing memory on any NUMA node on a remote blade. It is important for applications to have their tasks and memory placed close to one another; doing so enables lower latency for memory accesses and results in better, more predictable performance.

Although Linux tries to place memory close to the accessing processor, an application's tasks or memory can migrate over time to other NUMA nodes, resulting in a drop (or variability) in performance. Linux provides tools such as numactl(8), taskset(1), and cpusets(7) to help the user bind a given workload's tasks or memory to desired NUMA nodes. Tools such as numastat(8) can be used to monitor a given process or workload's memory usage across NUMA nodes. For longer-running workloads, user-level tools such as numad (available in RHEL 6.5) can be used either as an advisory service or as an executable to monitor a workload's characteristics and try to move its tasks and memory closer to one another.

NUMA locality of I/O adapters plays a crucial role in I/O performance. The irqbalance daemon, which runs by default, tends to spread interrupt requests (IRQs) without paying attention to the NUMA locality of the I/O adapters. Depending on the workload's I/O characteristics, this may lead to poor performance. For the highest I/O performance, it is important to keep all the IRQs associated with a given I/O adapter distributed among the CPUs belonging to the local NUMA node. The irqbalance daemon should be disabled.
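A hedged sketch of the tools mentioned above (the node number and the workload command are hypothetical):

# Show the NUMA topology: nodes, their CPUs, memory sizes, and relative distances
numactl --hardware

# Run a workload with its tasks and memory bound to NUMA node 0
numactl --cpunodebind=0 --membind=0 ./my_workload

# Watch the per-node memory usage of a running process
numastat -p <pid>

# Disable irqbalance so IRQ affinity can be set manually to each adapter's local node
service irqbalance stop
chkconfig irqbalance off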


From the Linux host OS point of view, KVM virtual machines are an example of a long-running, multi-threaded application. Specifying bindings for a virtual machine's resources to a specific set of host resources helps enable better and more predictable performance for the workloads in the virtual machine. Support is available in libvirt/virsh (and numad) to help accomplish this.

Scale-up workloads need to pay special attention to their data structure design or layout, granularity of locking, and the design of any application-specific locking primitives. Care must be taken to avoid too much bouncing of cache lines that need to be accessed exclusively. Too much cache line bouncing on any large symmetric multiprocessing (SMP) or NUMA system can severely impact application scaling.

Dump and crash dump analysis tools

BL920s Gen8 server blade hardware and software are highly robust, but sometimes failures occur. In that case, it is important to recover from the failure completely and resume operation as soon as possible. Having a good crash dump can help in getting data to diagnose the problem. Here are the recommended configuration steps to enable crash dump:

General guidelines

1. Make sure there is enough space in the file system containing /var/crash to save a dump. Ideally, this file system should be equal to the memory size of the system, if a full dump is specified. But, in practice for the default dump type that excludes unneeded pages and with compression, 500 GB to 1 TB is generally enough to store a compressed dump from a system with 12 TB of memory. For idle systems, even 100 GB of free space is generally enough, but it is not possible to guarantee that a given size will be sufficient.

2. The time for the kdump utility to copy a dump of a 12 TB system, with default dump settings (dump level 31) and lzo compression, averages three hours, and times can vary from one to five hours on SLES 11 SP3, RHEL 6.5, and RHEL 6.6. On RHEL 7.0, there have been substantial performance improvements in dump, and a 12 TB dump completes in an average of one hour.

SLES 11 SP3 guidelines

1. Set the crashkernel size correctly. The BL920s Gen8 requires a larger crashkernel size than the default to produce a successful dump. Always set crashkernel to 512 MB on all configurations of the BL920s Gen8. On a system installed with defaults, use yast2 kdump to set the crashkernel to 512 MB, and then reboot for this new value to take effect. It is best not to edit the kernel command line directly in the elilo.conf file, but if you do, the kernel parameter should be crashkernel=1024M-:512M.

2. After the reboot, to check whether dump is enabled with the new crashkernel size, issue the commands chkconfig boot.kdump and service boot.kdump status. The following dump-related RPMs need to be installed after the initial install to fix several dump-related problems. Assuming you have registered for support or updates from SUSE with the purchase of the appropriate SLES 11 SP3 software for the BL920s Gen8, go to the suse.com website and use Patch Finder to download the crash 9183 patch release from May 20, 2013. The following RPMs need to be downloaded and installed (these are the earliest revisions that fix the problems; later versions are also acceptable):

crash-6.0.7-0.16.1.x86_64.rpm

kdump-0.8.4-0.39.2.x86_64.rpm

makedumpfile-1.5.1-0.15.1.x86_64.rpm

3. Modify the kdump configuration files to boot the crashkernel with multiple CPUs. To make sure dumps will work on all BL920s Gen8 configurations, edit the /etc/sysconfig/kdump file to set:

KDUMP_CPUS="4"
KDUMPTOOL_FLAGS="NOSPLIT"

Then rebuild the kdump initrd by executing service boot.kdump restart.

4. Leave compression and dump level at their defaults, which are lzo and 31 (smallest dump size).


RHEL 6.5, 6.6, 7.0 guidelines

1. For RHEL 6.5 only, install RHEL 6.5 Maintenance Release Kernel kernel-2.6.32-431.20.3.el6.x86_64.rpm from the redhat.com website. This is needed to fix dump defects that affect the BL920s Gen8 system. Please see RHSA-2014-0771 for details.

2. Edit the /boot/efi/efi/redhat/grub.conf file to change the crashkernel=auto in the kernel command line to crashkernel=512M. RHEL 6.5 installations default to crashkernel=auto, which has a formula that sizes the crashkernel based on the memory size of the system. However, the values it chooses generally will not work with the BL920s Gen8, due to the larger I/O configurations it has.

3. To get enough file system space to save a dump, it is a good practice to create a separate file system for /var/crash, to avoid running out of space in the root file system. To save dumps in a file system other than root, one must modify /etc/kdump.conf to change this parameter:

# <fs type> <partition> - Will mount -t <fs type> <partition> /mnt and copy

# /proc/vmcore to /mnt/<path>/127.0.0.1-%DATE/.

# NOTE: <partition> can be a device node, label or uuid.

For example, if /var/crash is itself a mounted volume, one must first identify it:

[root@dhd1 sysconfig]# mount|grep /var/crash

/dev/mapper/mpathap4 on /var/crash type ext4 (rw)

[root@dhd1 sysconfig]# blkid|grep mpathap4

/dev/mapper/mpathap4: UUID="6e0f0dbd-c68b-4623-8d07-5f7e571d4f22" TYPE="ext4"

Now that you know the universally unique identifier (UUID), edit these lines into /etc/kdump.conf:

path /

ext4 UUID=6e0f0dbd-c68b-4623-8d07-5f7e571d4f22

Note Failure to allocate sufficient file system space may result in saving dumps underneath a normally mounted volume, a situation from which it may be difficult to reclaim space under the mount point.

4. Modify the kdump configuration files to boot the crashkernel with multiple CPUs. Edit the /etc/sysconfig/kdump file to change nr_cpus=1 to nr_cpus=4 in the line labeled KDUMP_COMMANDLINE_APPEND.

5. ONLY for RHEL 6.5: Edit the /etc/sysconfig/kdump file to add disable_cpu_apicid=X, where X is the apicid of CPU 0, obtained by executing cat /proc/cpuinfo | grep apicid | head -1

Unless there has been a failure in the first processor in the first socket that deconfigures it, CPU 0 on all BL920s Gen8 server blades with 15-core processors will have an apicid of "0", and CPU 0 on all BL920s Gen8 server blades with 10-core processors will have an apicid of "4".

The edited line, when done, should look like the following example for 15-core processor sockets:

KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 disable_cpu_apicid=0 reset_devices cgroup_disable=memory mce=off"

For 10-core processor sockets:

KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=4 disable_cpu_apicid=4 reset_devices cgroup_disable=memory mce=off"

Note Both the RHEL 6.6 and RHEL 7.0 kexec-tools initialization scripts automatically determine the correct CPU 0 apicid and add it to the kdump command line, so Step 5 only has to be done when running RHEL 6.5.

6. For better dump compression or copy performance, we recommend you edit the /etc/kdump.conf file to change the core collector line from its default of:

core_collector makedumpfile -c --message-level 1 -d 31


to

core_collector makedumpfile -l --message-level 1 -d 31

This switches the dump from using zlib compression to lzo compression, which is much faster. On RHEL 7, the kdump.conf file already sets the default to lzo compression.

7. After all kdump configuration changes have been made, rebuild the kdump initrd to make them take effect, with the following commands:

Execute touch /etc/kdump.conf to ensure that kdump will rebuild the kdump initrd when kdump is restarted.

RHEL 6: service kdump restart

RHEL 7: systemctl restart kdump
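Once the configuration has been rebuilt, it is prudent to verify the setup by forcing a test crash during a maintenance window; a hedged sketch using the standard sysrq mechanism (this deliberately panics the partition, so plan for the reboot and the dump-copy time):

# Confirm the crashkernel reservation and that the kdump service is loaded
grep -i crashkernel /proc/cmdline
service kdump status              # RHEL 6 (RHEL 7: systemctl status kdump)

# Force a test panic; after the partition reboots, look for the vmcore under /var/crash
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger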

SAP HANA

The BL920s Gen8 server blade is exceptionally well suited to running SAP HANA because of its impressive compute power and generous memory capacity. For HANA deployments, the required Linux version is SUSE SLES for SAP SP3. SAP HANA is supported on the BL920s Gen8 server blade configured as a ConvergedSystem 900. More information is available at hp.com/go/sap/hana.

Software advisory notes

Finding Customer Advisories

For the most up-to-date information about issues on this platform, go to hp.com/us/en/drivers.html, search for BL920s, select your product, and click "Advisories, bulletins & notices" on the left. This will provide a list of the most recent advisories.

General known issues

Boot messages

The following messages will be seen while a BL920s Gen8 server blade is booting:

lpc_ich 0000:20:1f.0: I/O space for ACPI uninitialized
lpc_ich 0000:20:1f.0: I/O space for GPIO uninitialized
lpc_ich 0000:20:1f.0: No MFD cells added

-And-

iTCO_wdt: failed to get TCOBASE address, device disabled by hardware/BIOS

These messages indicate that hardware on auxiliary blades is not initialized. Only the hardware on the monarch blade is initialized; on auxiliary blades this hardware is disabled. These messages are expected and can be ignored. There will be one set of these messages for every auxiliary blade—that is, seven sets on an eight-blade system.

Suspend/Resume

The hibernate/suspend/resume functionality is not supported on Dragonhawk. If a hibernation operation is accidentally initiated, power cycle the system from the OA and reboot the system to recover.

Boot message (ptp_clock_register failed)

The following message may be seen while a BL920s Gen8 server blade is booting or when a network driver is loaded after boot:

ixgbe 0000:85:00.0: ptp_clock_register failed

Only eight Precision Time Protocol (PTP)-capable network devices can be registered. After the limit of eight is reached, this error message is written to the message log as each additional network device is configured. Each NIC port counts as another network device.

Network devices registered for PTP allow the system to participate in IEEE 1588 Precision Time Protocol, which enables systems on the network to synchronize their clocks, thus allowing more precise time operations across systems.

The Linux ethtool utility can be used to query the status of PTP on a particular device. If the system does not participate in PTP satisfactorily, you can reconfigure the network to work around the eight-device limitation.
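A minimal query sketch (the interface name eth0 is hypothetical):

# Show time stamping capabilities; a device registered for PTP reports a
# "PTP Hardware Clock" index, while one beyond the eight-device limit reports "none"
ethtool -T eth0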


Clocksource failover

Clocksource tsc unstable / Enable clocksource failover

This message may be seen occasionally on the host when running virtual machine guests. No action is necessary. Disregard the recommendation in the message to enable clocksource failover; HP specifically recommends against enabling clocksource failover on the BL920s Gen8 server blade.

Some ipmitool commands not supported

The system firmware only includes Intelligent Platform Management Interface (IPMI) support for HP tools and applications. For example, sensor data record (SDR) status is not supported, and you will see the following error message when executing this ipmitool command:

ipmitool sdr

Error obtaining SDR info: Invalid command
Unable to open SDR for reading

SLES 11 SP3 known issues

Long boot time

At certain points while booting SLES 11 SP3, several minutes will elapse without output to the console. This is normal for configurations with larger amounts of I/O and memory.

Userspace resume

During boot, a console message like the following regarding userspace resume may be seen:

Invoking userspace resume from /dev/disk/by-id/scsi-3600c0ff0001a852504bbce5201000000-part2

One or more instances of kernel soft lockup diagnostics may be printed.

The messages may persist for approximately two minutes. They may be ignored, in which case boot will continue after the two-minute delay. The messages—and the resulting delay—can be prevented by adding the noresume kernel parameter.

Note This defect is fixed in the Oct. 22, 2014 SLES 11 SP3 maintenance update kernel (Linux kernel 9750), now available from suse.com. For the BL920s Gen8, install kernel-bigsmp-3.0.101-0.40.1.x86_64.rpm and kernel-bigsmp-base-3.0.101-0.40.1.x86_64.rpm.

Expected diagnostic message during boot

This diagnostic is expected when booting SLES 11 SP3. You can ignore this message.

FATAL: Error inserting mgag200 (/lib/modules/3.0.101-0.15-default/kernel/drivers/gpu/drm/mgag200/mgag200.ko): Invalid argument

Out-of-sync cursor when using iLO remote console

By default, SLES 11 SP3 creates an xorg.conf file that incorrectly sets up the iLO virtual mouse cursor as a relative pointing device. This results in two permanently out-of-sync cursors appearing on the screen. A fix is not available in SLES 11, but is present in SLES 12. For more information, ask your SUSE support representative to review Bug 715607. To work around this issue, the iLO virtual mouse cursor can be easily reconfigured as an absolute pointing device with the following edits to the xorg.conf file:

1. Locate the mouse InputDevice section in /etc/X11/xorg.conf.

2. Change the value of the Driver line from mouse to evdev.

3. Change the value of the Device line from /dev/input/mice to /dev/input/by-id/usb-HP_Virtual_Keyboard-event-mouse.

4. Optionally, comment out the Protocol line, since the evdev driver does not require a protocol.

A revised InputDevice section:

Section "InputDevice" Driver "evdev" Identifier "Mouse[1]" Option "Buttons" "9" Option "Device" "/dev/input/by-id/usb-HP_Virtual_Keyboard-event-mouse"

16

Technical white paper | Running Linux on HP BL920s Gen8

Option "Name" "Avocent HP 336047-B21" # Option "Protocol" "explorerps/2" Option "Vendor" "Sysp" Option "ZAxisMapping" "4 5" EndSection

5. After saving these changes, restart X:

service xdm restart

Install multipath-tools update to be able to mount virtual removable media

SUSE has released an application update for multipath-tools, available from the suse.com website (use Patch Finder) for customers who have support contracts with SUSE for updates. Without this fix, you will not be able to mount anything over the virtual USB media from the HP iLO Integrated Remote Console (IRC) while the SLES OS is booted on the HP BL920s Gen8.

The following official RPMs correct this problem and another issue involving segmentation faults running the multipath command:

kpartx-0.4.9-0.97.1.x86_64.rpm
multipath-tools-0.4.9-0.97.1.x86_64.rpm

These patch revisions or later will resolve the issues. They must be installed together, due to dependencies.

EDD not supported

During installation, the following messages may be seen, because the BL920s Gen8 does not support EDD:

[ 26.767479] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
[ 26.774131] EDD information not available.
insmod: error inserting '/modules/edd.ko': -1 No such device

Disregard these messages; they cause no problems and occur because the Linux kernel always tries to load the edd module, whether or not the hardware supports it.

RHEL 6.5/6.6 and RHEL 7.0 known issues

Configuring large KVM guests on RHEL 6.5
For large KVM guests (more than 30 virtual CPUs), the following additional setup steps help avoid soft lockups, loss of Time Stamp Counter (TSC) sync, and other issues on the host and the guests:

1. Install the tuned package and apply the virtual-host profile on the host:

tuned-adm profile virtual-host

2. Disable Pause Loop Exit (PLE) on the host by executing the following commands. (Repeat each time that the host is rebooted and prior to starting any virtual machine [VM] guests.)

modprobe -r kvm_intel

modprobe kvm_intel ple_gap=0

3. Before starting any VM guest, pin all of the VM's vCPUs to the host CPUs of your choice using the numactl utility, as sketched below.
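A minimal sketch of step 3, assuming the guest is launched directly with qemu-kvm rather than through libvirt; the CPU list, NUMA node, memory size, and disk image path are placeholders. For libvirt-managed guests, virsh vcpupin achieves the same pinning.

# Bind the qemu-kvm process, and therefore all of its vCPU threads, to host
# CPUs 0-31 and to the memory of NUMA node 0 before the guest starts.
numactl --physcpubind=0-31 --membind=0 \
    /usr/libexec/qemu-kvm -smp 32 -m 262144 \
    -drive file=/var/lib/libvirt/images/guest.img,if=virtio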

When running large KVM guests, messages like the following may be seen on the host and can be ignored:

hrtimer: interrupt took 4389495 ns

Panic with large-memory KVM guests
If a KVM guest is configured with more than 2 TB of memory, the following panic stack trace will be seen on the host:

#5 [ffff88b647cdfaf0] general_protection at ffffffff8152b705
   [exception RIP: kvm_unmap_rmapp+32]
   RIP: ffffffffa0215820  RSP: ffff88b647cdfba8  RFLAGS: 00010206
   RAX: 05b6600000000000  RBX: ffff88b647cd4000  RCX: 0000000000000000
   RDX: 0000000000000000  RSI: 0000000000000000  RDI: ffff8e1ebcc58bc8
   RBP: ffff88b647cdfbc8   R8: 0000000000000000   R9: 0000000000000000
   R10: 0000000000000000  R11: 00007f2f8f73d000  R12: ffffc9069d40e9e0
   R13: 0000000000000000  R14: 05b6600000000000  R15: ffff88b647cd4038
   ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
#6 [ffff88b647cdfbd0] kvm_handle_hva at ffffffffa0213294 [kvm]
#7 [ffff88b647cdfc40] kvm_unmap_hva at ffffffffa0213337 [kvm]
#8 [ffff88b647cdfc50] kvm_mmu_notifier_invalidate_range_start at ffffffffa01fdb32 [kvm]
#9 [ffff88b647cdfca0] __mmu_notifier_invalidate_range_start at ffffffff8116a83c
#10 [ffff88b647cdfce0] unmap_vmas at ffffffff811487af
#11 [ffff88b647cdfe20] zap_page_range at ffffffff81148f51
#12 [ffff88b647cdfe90] sys_madvise at ffffffff81144d11
#13 [ffff88b647cdff80] system_call_fastpath at ffffffff8100b072

To avoid this host panic, configure KVM guests with no more than 2 TB of memory.
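One quick way to confirm a guest's configured maximum memory before starting it (the guest name bigvm is a placeholder):

# virsh reports memory in KiB; 2 TB corresponds to 2147483648 KiB.
virsh dominfo bigvm | grep -i 'max memory'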

Crash dump will fail on large-memory systems with the crashkernel size left at the default value of auto
The symptom is that the kdump kernel cannot load, so kdump is disabled after a reboot. To avoid this, if you are installing from virtual media or DVD media, set crashkernel=512M on the kdump configuration screen during first boot. Otherwise, after the system boots, edit the /boot/efi/efi/redhat/grub.conf file and replace crashkernel=auto with crashkernel=512M on the kernel command line.
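A minimal sketch of the post-install edit described above; back up the file first, and note that the exact kernel entries in grub.conf vary by installation:

# Preserve the original configuration.
cp -p /boot/efi/efi/redhat/grub.conf /boot/efi/efi/redhat/grub.conf.bak
# Replace the default crashkernel setting on every kernel command line.
sed -i 's/crashkernel=auto/crashkernel=512M/g' /boot/efi/efi/redhat/grub.conf
# Reboot, then confirm that the reservation is in effect.
grep -o 'crashkernel=[^ ]*' /proc/cmdline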

NMI panic message
On RHEL 6.5 and RHEL 7.0, when an NMI is issued from the OA to induce a crash dump on a hung system, the panic message that is printed (Kernel panic - not syncing: An NMI occurred, please see the Integrated Management Log for details) is not correct for the BL920s Gen8 server blade, which does not have this log. Instead, for information about the NMI that caused the panic, consult the OA FPL log (OA> show FPL) or the OA syslogs (OA> show OA syslog) on the BL920s Gen8 server blade.

The panic message for the NMI prints correct information for the BL920s Gen8 server blade in RHEL 6.6 and in RHEL 7.0 with the maintenance update kernel (from September 22, 2013 or later).

GNOME system monitor
If the BL920s Gen8 server blade has more than 240 CPUs (an eight-blade system with HT on will have 480 CPUs), the GNOME System Monitor utility will report an incorrect number of CPUs in its display, generally in the range of 200–255, sometimes accompanied by glibtop errors. This is due to a scaling issue in the glibtop software and will be fixed in a future version of GNOME System Monitor and glibtop.

Boot messages
During boot on the BL920s Gen8, several messages like the following may be seen:

pci 0000:71:00.1: BAR 2: can't assign io (size 0x20)

These messages are due to the large I/O card configurations of this system, which exceed the I/O port space the firmware has reserved. Most of the cards do not need this I/O port space, and the few that do need it have their space allocated first in the initialization process. The OS nevertheless tries to allocate I/O port space for all of the cards, so these messages are repeated as the cards on the higher-numbered nodes are initialized. Ignore these messages.

During boot, the following message will be seen during Advanced Configuration and Power Interface (ACPI) thermal management initialization; it causes no issues and can be ignored:

ACPI Exception: AE_NOT_FOUND, No or invalid critical threshold (20090903/thermal-386)

Network install
If the RHEL 6.5 GRUB is used as the boot loader when booting over the LAN from a PXE server, the installation cannot be done using the Integrated Remote Console (graphical console). It appears to boot; however, after the line:

console [tty0] enabled, bootconsole disabled

Nothing shows up on the Integrated Remote Console. The workaround is to perform all PXE installs using the serial console and specify vnc=1 on the command line, or to use elilo as the boot loader on the PXE server. This issue is fixed in the RHEL 6.6 GRUB.
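A hypothetical elilo.conf entry on the PXE server illustrating this workaround, combining the elilo boot loader with a serial-console install and vnc=1; the image names, console device, and kickstart URL are placeholders only:

image=vmlinuz-rhel6.5
    label=rhel65-serial-vnc
    description="RHEL 6.5 network install, serial console with VNC"
    initrd=initrd-rhel6.5.img
    append="console=ttyS0,115200 vnc=1 ks=http://pxeserver/ks/bl920s.cfg"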

Stack trace upon reboot (RHEL 6.5 and RHEL 7.0)
When rebooting the BL920s Gen8 server blade, sometimes after the Restarting system message, the following stack trace will be seen. It is also frequently seen when the system restarts after collecting a crash dump. Please ignore this warning and stack trace; it causes no problems with the shutdown or normal operation of the system.

Restarting system.
machine restart
------------[ cut here ]------------
WARNING: at kernel/smp.c:417 smp_call_function_many+0xb9/0x260() (Not tainted)
Modules linked in: lpfc nf_conntrack_ipv6 scsi_transport_fc sd_mod dm_round_robin ext4 ses lpc_ich vfat ip6table_filter xt_state nf_defrag_ipv6 ip6t_REJECT iptable_filter nf_conntrack_ipv4 acpi_cpufreq cpufreq_ondemand scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua dm_multipath crc_t10dif scsi_tgt mbcache jbd2 be2net enclosure sg hpwdt hpilo mfd_core fat ipv6 ip6_tables nf_conntrack ip_tables nf_defrag_ipv4 ipt_REJECT mperf freq_table autofs4 dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod
Pid: 1910, comm: reboot Not tainted 2.6.32-424-jah4 #4
Call Trace:
 [<ffffffff81072027>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8104fd50>] ? do_flush_tlb_all+0x0/0x60
 [<ffffffff8104fd50>] ? do_flush_tlb_all+0x0/0x60
 [<ffffffff8107207a>] ? warn_slowpath_null+0x1a/0x20
 [<ffffffff810b3af9>] ? smp_call_function_many+0xb9/0x260
 [<ffffffff8104fd50>] ? do_flush_tlb_all+0x0/0x60
 [<ffffffff810b3cc2>] ? smp_call_function+0x22/0x30
 [<ffffffff8107a7c4>] ? on_each_cpu+0x24/0x50
 [<ffffffff8104facc>] ? flush_tlb_all+0x1c/0x20
 [<ffffffff81157bea>] ? __purge_vmap_area_lazy+0xea/0x1e0
 [<ffffffff81157d40>] ? free_vmap_area_noflush+0x60/0x70
 [<ffffffff81157df5>] ? free_unmap_vmap_area+0x25/0x30
 [<ffffffff81157e40>] ? remove_vm_area+0x40/0xa0
 [<ffffffff8104b399>] ? iounmap+0x99/0xe0
 [<ffffffff812ecbd9>] ? acpi_os_write_memory+0xaf/0xb8
 [<ffffffff81304482>] ? acpi_hw_write+0x40/0x55
 [<ffffffff813052cd>] ? acpi_reset+0x4d/0x58
 [<ffffffff812ed6c9>] ? acpi_reboot+0xbd/0xc8
 [<ffffffff81030256>] ? native_machine_emergency_restart+0x1e6/0x230
 [<ffffffff81034e2a>] ? disable_IO_APIC+0xba/0x140
 [<ffffffff8102ff37>] ? native_machine_restart+0x37/0x40
 [<ffffffff8102fe4f>] ? machine_restart+0xf/0x20
 [<ffffffff81092f3e>] ? kernel_restart+0x3e/0x60
 [<ffffffff81093117>] ? sys_reboot+0x197/0x240
 [<ffffffff8105591d>] ? check_preempt_curr+0x6d/0x90
 [<ffffffff81065e5e>] ? try_to_wake_up+0x24e/0x3e0
 [<ffffffff81066045>] ? wake_up_process+0x15/0x20
 [<ffffffff8152ddce>] ? do_page_fault+0x3e/0xa0
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
---[ end trace f473b2877388c347 ]---

Note
The stack trace upon reboot issue is fixed in RHEL 6.6, but not yet fixed in RHEL 7.0.

Soft lockup during dump
Occasionally, a dump may be lost due to soft lockups that hang the crashkernel boot. The signature stack traces are:

usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
NetLabel: Initializing
NetLabel: domain hash size = 128
NetLabel: protocols = UNLABELED CIPSOv4
NetLabel: unlabeled traffic allowed by default
HPET: 8 timers in total, 4 timers will be used for per-cpu timer
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 296, 297, 298, 299, 0, 0
hpet0: 8 comparators, 64-bit 14.318180 MHz counter
Switching to clocksource hpet
BUG: soft lockup - CPU#0 stuck for 61s! [swapper:0]
Modules linked in:
CPU 0
Modules linked in:


Pid: 0, comm: swapper Not tainted 2.6.32-431.20.1.el6.x86_64 #1 RIP: 0010:[<ffffffff810e6b59>] [<ffffffff810e6b59>] handle_IRQ_event+0x29/0x170 RSP: 0018:ffff8800032037f0 EFLAGS: 00000246 RAX: ffff8800226737ec RBX: ffff880003203830 RCX: 000000000000080b RDX: 0000000000000000 RSI: ffff880022319ec0 RDI: 0000000000000121 RBP: ffffffff8100b9d3 R08: ffffffff81a00000 R09: ffff880003203950 R10: 0000000000000000 R11: 0000000000000000 R12: ffff880003203770 R13: 0000000000000121 R14: 0000000000000121 R15: ffff8800226737ec FS: 0000000000000000(0000) GS:ffff880003200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000004a85000 CR4: 00000000001407b0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020) Stack: ffffffff81037739 0000000000000121 ffff880003203830 ffff880022673780 <d> ffff8800226737ec 0000000000000121 ffff8800032038a8 ffffffff81de2000 <d> ffff880003203870 ffffffff810e94ee ffff880003203950 0000000000000004 Call Trace: <IRQ> [<ffffffff81037739>] ? native_apic_msr_eoi_write+0x19/0x20 [<ffffffff810e94ee>] ? handle_edge_irq+0xde/0x180 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0 [<ffffffff81531f0c>] ? do_IRQ+0x6c/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 [<ffffffff810e6b59>] ? handle_IRQ_event+0x29/0x170 [<ffffffff81037739>] ? native_apic_msr_eoi_write+0x19/0x20 [<ffffffff810e94ee>] ? handle_edge_irq+0xde/0x180 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0 [<ffffffff81531f0c>] ? do_IRQ+0x6c/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 [<ffffffff810e6b59>] ? handle_IRQ_event+0x29/0x170 [<ffffffff81037739>] ? native_apic_msr_eoi_write+0x19/0x20 [<ffffffff810e94ee>] ? handle_edge_irq+0xde/0x180 [<ffffffff810a6dc9>] ? ktime_get+0x69/0xf0 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0 [<ffffffff81531f0c>] ? do_IRQ+0x6c/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 [<ffffffff810e6b59>] ? handle_IRQ_event+0x29/0x170 [<ffffffff81037739>] ? native_apic_msr_eoi_write+0x19/0x20 [<ffffffff810e94ee>] ? handle_edge_irq+0xde/0x180 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0 [<ffffffff81531f0c>] ? do_IRQ+0x6c/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 [<ffffffff810e6b59>] ? handle_IRQ_event+0x29/0x170 [<ffffffff81037739>] ? native_apic_msr_eoi_write+0x19/0x20 [<ffffffff810e94ee>] ? handle_edge_irq+0xde/0x180 [<ffffffff8105dc7e>] ? scheduler_tick+0x11e/0x260 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0 [<ffffffff81531f0c>] ? do_IRQ+0x6c/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11 [<ffffffff8107a5a3>] ? __do_softirq+0x73/0x1e0 [<ffffffff810e6b90>] ? handle_IRQ_event+0x60/0x170 [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30 [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0 [<ffffffff8107a4a5>] ? irq_exit+0x85/0x90 [<ffffffff81531f15>] ? do_IRQ+0x75/0xf0 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11<EOI> [<ffffffff81016667>] ? mwait_idle+0x77/0xd0 [<ffffffff8152e4aa>] ? atomic_notifier_call_chain+0x1a/0x20 [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110 [<ffffffff8150da5a>] ? rest_init+0x7a/0x80 [<ffffffff81c26f8f>] ? start_kernel+0x424/0x430 [<ffffffff81c2633a>] ? x86_64_start_reservations+0x125/0x129


[<ffffffff81c26453>] ? x86_64_start_kernel+0x115/0x124

Summary

Under the banner of Project Odyssey, HP is committed to providing a long roadmap of solutions for mission-critical computing based on the Linux operating system. These solutions are grounded in solid hardware platforms and augmented by HP system software, and HP is working with the open source community to make continuous improvements in the Linux operating system and application stacks. This white paper has described the BL920s Gen8 server blade solution. Further innovations will be released to the market on a regular basis, and this white paper will be updated as further enhancements are announced.


Resources

HP Integrity Superdome X QuickSpecs hp.com/h20195/v2/GetDocument.aspx?docname=c04383189

Red Hat Enterprise Linux from HP hp.com/us/en/products/operating-systems/product-detail.html?oid=5393115#!tab%3Dfeatures

RHEL 7.0 certification information hp.com/us/en/enterprise/servers/supportmatrix/redhat_linux.aspx

RHEL 6.6 Release Notes access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.6_Release_Notes/index.html

RHEL 6.6 Technical Notes access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.6_Technical_Notes/index.html

SUSE Linux Enterprise Server from HP hp.com/us/en/products/operating-systems/product-detail.html?oid=5379857#!tab=features


© Copyright 2014, 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Intel and Intel Xeon are trademarks of Intel Corporation in the U.S. and other countries. Java is a registered trademark of Oracle and/or its affiliates. Red Hat is a registered trademark of Red Hat, Inc. in the United States and other countries. SAP is the trademark or registered trademark of SAP SE in Germany and in several other countries. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.

4AA5-4775ENW, January 2015, Rev. 2