Parallel Environment for AIX and Linux
Installation
Version 5 Release 1
SC23-6666-00
Note
Before using this information and the product it supports, read the information in “Notices” on page 93.
First Edition (November 2008)
This edition applies to version 5, release 1, modification 0 of IBM Parallel Environment for AIX (product number
5765-PEA) and version 5, release 1, modification 0 of IBM Parallel Environment for Linux (product number
5765-PEL) and to all subsequent releases and modifications until otherwise indicated in new editions.
IBM welcomes your comments. A form for readers’ comments may be provided at the back of this publication, or
you can send your comments to the following address:
International Business Machines Corporation
Department 58HA, Mail Station P181
2455 South Road
Poughkeepsie, NY 12601-5400
United States of America
FAX (United States & Canada): 1+845+432-9405
FAX (Other Countries):
Your International Access Code +1+845+432-9405
IBMLink™ (United States customers only): IBMUSM10(MHVRCFS)
Internet e-mail: [email protected]
If you want a reply, be sure to include your name, address, and telephone or FAX number.
Make sure to include the following in your comment or note:
v Title and order number of this publication
v Page number or topic related to your comment
When you send information to IBM, you grant IBM a nonexclusive right to use or distribute the information in any
way it believes appropriate without incurring any obligation to you.
© Copyright International Business Machines Corporation 1993, 2008.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
Contents
Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
About this information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Who should read this information . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
How this information is organized . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Conventions and terminology used in this information . . . . . . . . . . . . . . . . . . . . . x
Abbreviated names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Prerequisite and related information . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
How to send your comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
National language support (NLS) . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Functional restrictions for PE 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Summary of changes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Changes for PE 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Chapter 1. Introducing PE 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . 1
PE components . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 2. Planning to install the PE software . . . . . . . . . . . . . . . . . . . 3
PE for AIX installation requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
PE for AIX hardware requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
PE for AIX software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
Disk space requirements for AIX installation . . . . . . . . . . . . . . . . . . . . . . . . 6
PE for Linux installation requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
PE for Linux hardware requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 6
PE for Linux software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Disk space requirements for Linux installation . . . . . . . . . . . . . . . . . . . . . . . 10
PE Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Information for the system administrator . . . . . . . . . . . . . . . . . . . . . . . . . 10
Software compatibility within workstation clusters . . . . . . . . . . . . . . . . . . . . . 11
Node resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Deciding which nodes require which PE filesets or RPMs, or additional software . . . . . . . . . . . 12
Enabling xinetd for Linux installation . . . . . . . . . . . . . . . . . . . . . . . . . . 12
File systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
User IDs on remote nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
PE for AIX user authorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
POE security method configuration . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Cluster based security configuration . . . . . . . . . . . . . . . . . . . . . . . . . . 13
AIX-based security (compatibility) . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
PE for Linux user authorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Tuning your Linux system for more efficient parallel job performance . . . . . . . . . . . . . . . . 15
Running large POE jobs and IP buffer usage (PE for AIX only) . . . . . . . . . . . . . . . . . . 16
Chapter 3. Installing the Parallel Environment . . . . . . . . . . . . . . . . . . . 17
Installing the PE for AIX software . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
About installing PE for AIX with CSM . . . . . . . . . . . . . . . . . . . . . . . . . 17
About installing PE for AIX on an IBM Power Systems cluster . . . . . . . . . . . . . . . . . 17
Migration installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
When to install the rsct.lapi.rte fileset . . . . . . . . . . . . . . . . . . . . . . . . . . 19
When to install the rsct.lapi.bsr fileset . . . . . . . . . . . . . . . . . . . . . . . . . 19
When to install the rsct.core.sec fileset . . . . . . . . . . . . . . . . . . . . . . . . . 19
When to install the loadl.so (LoadLeveler) fileset . . . . . . . . . . . . . . . . . . . . . . 19
View the readme file before installation . . . . . . . . . . . . . . . . . . . . . . . . . 19
PE for AIX installation procedure summary . . . . . . . . . . . . . . . . . . . . . . . 20
Install the PE for AIX filesets step-by-step . . . . . . . . . . . . . . . . . . . . . . . . 20
Performing PE for AIX post installation tasks (optional) . . . . . . . . . . . . . . . . . . . . 28
Enabling the barrier synchronization register (BSR) . . . . . . . . . . . . . . . . . . . . . 28
Installing the PE for Linux software . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Installing PE for Linux manually . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Installing PE for Linux using the pe_install.sh script . . . . . . . . . . . . . . . . . . . . . 31
Performing PE for Linux post installation tasks (optional) . . . . . . . . . . . . . . . . . . . 35
Resolving installation errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Chapter 4. Migrating and upgrading PE . . . . . . . . . . . . . . . . . . . . . . 39
Migrating and upgrading PE for AIX . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
PE for AIX migration overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
AIX compatibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Coexistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Migration support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
AIX support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
MPI library support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Barrier synchronization register (BSR) support . . . . . . . . . . . . . . . . . . . . . . . 41
LAPI support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Fortran 90 compile time type-checking support . . . . . . . . . . . . . . . . . . . . . . 41
Online documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Migrating and upgrading PE for Linux . . . . . . . . . . . . . . . . . . . . . . . . . . 42
PE for Linux migration overview . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Installing an upgrade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
Coexistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Migration support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
LAPI support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Fortran 90 compile time type-checking support . . . . . . . . . . . . . . . . . . . . . . 45
Chapter 5. Performing installation-related tasks . . . . . . . . . . . . . . . . . . 47
Performing PE for AIX installation-related tasks . . . . . . . . . . . . . . . . . . . . . . . 47
Removing a software component . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Recovering from a software vital product database error . . . . . . . . . . . . . . . . . . . 47
Customizing the message catalog . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Installing AFS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
Performing PE for Linux installation-related tasks . . . . . . . . . . . . . . . . . . . . . . 49
Finding installed components . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Removing a software component . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Customizing the message catalog . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Chapter 6. Understanding how installing PE alters your system . . . . . . . . . . . 53
Understanding how installing PE for AIX alters your system . . . . . . . . . . . . . . . . . . . 53
How installing the POE fileset alters your system . . . . . . . . . . . . . . . . . . . . . 53
How installing PDB alters your system . . . . . . . . . . . . . . . . . . . . . . . . . 57
How installing the online documentation alters your system . . . . . . . . . . . . . . . . . . 57
Understanding how installing PE for Linux alters your system . . . . . . . . . . . . . . . . . . 58
How installing the PE license RPM alters your system . . . . . . . . . . . . . . . . . . . . 58
How installing the PE and LAPI RPMs alters your system . . . . . . . . . . . . . . . . . . . 59
Chapter 7. Additional information for the system administrator . . . . . . . . . . . 63
Using the /etc/poe.limits file . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
Entries in the /etc/poe.limits file . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
How the Partition Manager daemon handles the /etc/poe.limits file . . . . . . . . . . . . . . . 64
Description of /etc/poe.security . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Configuring the Parallel Environment coscheduler . . . . . . . . . . . . . . . . . . . . . . 65
POE coscheduling parameters and limits . . . . . . . . . . . . . . . . . . . . . . . . 65
AIX dispatcher tuning (PE for AIX only) . . . . . . . . . . . . . . . . . . . . . . . . . 68
Enabling Remote Direct Memory Access (RDMA) . . . . . . . . . . . . . . . . . . . . . . 69
Enabling RDMA for use with the IBM High Performance Switch (PE for AIX only) . . . . . . . . . . 70
Enabling RDMA for use with the InfiniBand interconnect . . . . . . . . . . . . . . . . . . . 71
Configuring InfiniBand for User Space without LoadLeveler (PE for AIX only) . . . . . . . . . . . . . 72
Compiling and installing the NRT API samples . . . . . . . . . . . . . . . . . . . . . . 72
Chapter 8. Syntax of commands for running installation and deinstallation scripts . . . 75
PE for AIX installation script: PEinstall . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Copying the installation image . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Mounting the installation image . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
PE for AIX deinstallation script: PEdeinstall . . . . . . . . . . . . . . . . . . . . . . . . 76
PE for Linux installation script: pe_install.sh . . . . . . . . . . . . . . . . . . . . . . . . 77
PE for Linux deinstallation script: pe_deinstall.sh . . . . . . . . . . . . . . . . . . . . . . . 78
Chapter 9. Installation verification program summary . . . . . . . . . . . . . . . 81
Chapter 10. Using additional POE sample applications . . . . . . . . . . . . . . . 83
Bandwidth measurement test sample . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Verification steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Broadcast test sample . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Verification steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
MPI threads sample program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Verification steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
LAPI sample programs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Chapter 11. Parallel Environment port usage . . . . . . . . . . . . . . . . . . . 89
PE for AIX port usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
PE for Linux port usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Appendix. Accessibility features for Parallel Environment . . . . . . . . . . . . . . 91
Accessibility features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
IBM and accessibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Tables
1. Typographic conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
2. Specifying the default message catalog with the NLSPATH environment variable . . . . . . . . . . xii
3. Location of PE message catalogs . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
4. PE Fileset requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
5. Additional software requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 4
6. Disk space requirements for installation . . . . . . . . . . . . . . . . . . . . . . . . 6
7. RPMs required for installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
8. RPM and disk space requirements for installation . . . . . . . . . . . . . . . . . . . . . 10
9. Network tuning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
10. Filesets to remove before installation . . . . . . . . . . . . . . . . . . . . . . . . . 18
11. Installation procedure summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
12. Step 2 for installing with CSM . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
13. Method 1: Use the installp command . . . . . . . . . . . . . . . . . . . . . . . . . 22
14. File names for different types of installations . . . . . . . . . . . . . . . . . . . . . . 23
15. Steps to take to determine steps remaining . . . . . . . . . . . . . . . . . . . . . . . 24
16. Specify -copy and -mount . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
17. File names for different data types . . . . . . . . . . . . . . . . . . . . . . . . . . 25
18. Steps to take to determine steps remaining . . . . . . . . . . . . . . . . . . . . . . . 26
19. Space requirements for the partition manager daemon and poe components . . . . . . . . . . . . 49
20. POE directories and files installed . . . . . . . . . . . . . . . . . . . . . . . . . . 53
21. ppe.poe.post_i symbolic links . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
22. PDB directories and files installed . . . . . . . . . . . . . . . . . . . . . . . . . . 57
23. Man page directories and files installed . . . . . . . . . . . . . . . . . . . . . . . . 58
24. Directories and files associated with the PE license RPM . . . . . . . . . . . . . . . . . . 58
25. Directories and files installed as a result of accepting the license agreement . . . . . . . . . . . . 58
26. Directories and files associated with the PE RPMs . . . . . . . . . . . . . . . . . . . . 59
27. Directories and files associated with the LAPI RPMs . . . . . . . . . . . . . . . . . . . . 59
28. Symbolic links created during PE RPM installation . . . . . . . . . . . . . . . . . . . . 60
29. Symbolic links created during LAPI RPM installation . . . . . . . . . . . . . . . . . . . 61
30. PE for AIX port usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
31. PE for Linux port usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
About this information
IBM® Parallel Environment: Installation describes how to install the Parallel
Environment program product on a variety of hardware running the AIX or Linux
operating system.
This information supports the following program products:
v IBM Parallel Environment for AIX (5765-PEA), Version 5 Release 1 Modification
0
v IBM Parallel Environment for Linux® (5765-PEL) Version 5 Release 1
Modification 0
To make this information easier to read, the name IBM Parallel Environment has
been abbreviated to PE for AIX, PE for Linux, or more generally, PE throughout.
To use this information, you should be familiar with the AIX or Linux operating
system. Where necessary, background information related to AIX or Linux is
provided; more commonly, you are referred to the appropriate documentation.
For AIX users:
The PE for AIX information assumes that one of the following is already installed:
v AIX Version 5.3 Technology Level 5300-09 (AIX V5.3 TL 5300-09)
v AIX Version 6.1 (or later), either standalone or connected by way of an Ethernet
LAN supporting IP.
For information on installing AIX® see the AIX Installation Guide and Reference.
Note: AIX Version 5.3 Technology Level 5300-09 identifies the specific AIX 5.3
maintenance levels that are required to run PE 5.1.0. The name AIX 5.3 is
used in more general discussions.
For Linux users:
The PE for Linux information assumes that one of the following Linux
distributions is already installed:
v SUSE LINUX Enterprise Server (SLES) 10
v Red Hat Enterprise Linux 5, Update 2
Note that PE for Linux is based on its predecessor, PE for AIX, with which you
may be familiar.
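Before installing, it can help to confirm from the command line that one of the
supported operating system levels listed above is in place. The commands below
are standard AIX and Linux utilities, shown here only as a quick check; the
exact output format varies by system and release.

```shell
# On AIX: display the full technology/service level
# (for PE 5.1, look for 5300-09 or an AIX 6.1 level).
oslevel -s

# On Linux: identify the distribution and release
# (SLES 10 or Red Hat Enterprise Linux 5, Update 2).
cat /etc/issue               # both SLES and Red Hat populate this file
lsb_release -a 2>/dev/null   # if the LSB tools are installed
```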
Who should read this information
This information is intended for system programmers and administrators, who
plan, migrate, and install PE.
How this information is organized
This information is organized as follows:
v Introducing PE 5.1.0 is an overview of PE, describing how its various software
components work together. This introduction also describes some installation
considerations based on your system’s configuration.
v Planning to install the PE software contains the planning information you need to
consider before installing PE. Topics include the hardware and software
requirements, as well as information on node resources, file systems, and user
ID administration.
v Installing the Parallel Environment contains the step-by-step procedure you need
to follow to install PE. This chapter also lists, and describes, the product
directories created and the links established by the installation process.
v Migrating and upgrading PE contains specific information on some differences
between earlier releases that you may need to consider before installing or using
PE 5.1.0.
v Performing installation-related tasks describes additional procedures (such as
removing an installation image and customizing the message catalog) that are
related to installing PE.
v Understanding how installing PE alters your system describes how your system is
altered when you install the various PE software file sets.
v Additional information for the system administrator describes the format of PE
configuration files that are created and modified by the system administrator.
v Syntax of commands for running installation and deinstallation scripts explains the
syntax of the commands for running the installation and deinstallation scripts
provided with PE.
v Installation verification program summary explains how the POE Installation
Verification Program (IVP) works.
v Using additional POE sample applications describes how to use sample applications
for measuring the MPI point-to-point communication bandwidth between two
tasks, broadcasting from task 0 to all of the other nodes in the partition, and
for using the MPI message passing library with user-created threads.
v Parallel Environment port usage lists the ports that PE for AIX and PE for
Linux use.
Conventions and terminology used in this information
Note that in this information, LoadLeveler® is also referred to as Tivoli® Workload
Scheduler LoadLeveler and TWS LoadLeveler.
This information uses the following typographic conventions:
Table 1. Typographic conventions
Convention Usage
bold Bold words or characters represent system elements that you must
use literally, such as: command names, file names, flag names, path
names, PE component names (poe, for example), and subroutines.
constant width Examples and information that the system displays appear in
constant-width typeface.
italic Italicized words or characters represent variable values that you
must supply.
Italics are also used for book titles, for the first use of a glossary
term, and for general emphasis in text.
[item] Used to indicate optional items.
<Key> Used to indicate keys you press.
\ The continuation character is used in coding examples in this
information for formatting purposes.
In addition to the highlighting conventions, this information uses the following
conventions when describing how to perform tasks.
User actions appear in uppercase boldface type. For example, if the action is to
enter the tool command, this information presents the instruction as:
ENTER
tool
Abbreviated names
Some of the abbreviated names used in this information follow.
AIX Advanced Interactive Executive
CSM Cluster Systems Management
CSS communication subsystem
CTSEC cluster-based security
dsh distributed shell
GUI graphical user interface
HDF Hierarchical Data Format
IP Internet Protocol
LAPI Low-level Application Programming Interface
MPI Message Passing Interface
PE IBM Parallel Environment for AIX or IBM Parallel Environment for
Linux
PE MPI IBM’s implementation of the MPI standard for PE
PE MPI-IO IBM’s implementation of MPI I/O for PE
POE parallel operating environment
RSCT Reliable Scalable Cluster Technology
rsh remote shell
STDERR standard error
STDIN standard input
STDOUT standard output
System x™ IBM System x
Prerequisite and related information
The Parallel Environment library consists of:
v IBM Parallel Environment: Installation, SC23-6666
v IBM Parallel Environment: Operation and Use, SC23-6667
v IBM Parallel Environment: Messages, SC23-6669
v IBM Parallel Environment: MPI Programming Guide, SC23-6670
v IBM Parallel Environment: MPI Subroutine Reference, SC23-6671
To access the most recent Parallel Environment documentation in PDF and HTML
format, refer to the IBM Clusters Information Center, on the Web.
Both the current Parallel Environment books and earlier versions of the library are
also available in PDF format from the IBM Publications Center on the Web.
It is easiest to locate a book in the IBM Publications Center by supplying the
book’s publication number. The publication number for each of the Parallel
Environment books is listed after the book title in the preceding list.
How to send your comments
Your feedback is important in helping to provide the most accurate and
high-quality information. If you have comments about this information or other PE
documentation:
v Send your comments by e-mail to: [email protected]
Be sure to include the name of the book, the part number of the book, the
version of PE, and, if applicable, the specific location of the text you are
commenting on (for example, a page number or table number).
v Fill out one of the forms at the back of this book and return it by mail, by fax, or
by giving it to an IBM representative.
National language support (NLS)
For national language support (NLS), all PE components and tools display
messages that are located in externalized message catalogs. English versions of the
message catalogs are shipped with the PE licensed program, but your site may be
using its own translated message catalogs. The PE components use the
environment variable NLSPATH to find the appropriate message catalog.
NLSPATH specifies a list of directories to search for message catalogs. The
directories are searched, in the order listed, to locate the message catalog. In
resolving the path to the message catalog, NLSPATH is affected by the values of
the environment variables LC_MESSAGES and LANG. If you get an error saying
that a message catalog is not found and you want the default message catalog, do
the following.
Table 2. Specifying the default message catalog with the NLSPATH environment variable

If you are using PE for AIX, ENTER:
export NLSPATH=/usr/lib/nls/msg/%L/%N
export LANG=C

If you are using PE for Linux, ENTER:
export NLSPATH=/usr/share/locale/%L/%N
export LANG=en_US
The PE message catalogs are in English, and are located in the following
directories.
Table 3. Location of PE message catalogs

If you are using PE for AIX:
/usr/lib/nls/msg/C
/usr/lib/nls/msg/En_US
/usr/lib/nls/msg/en_US

If you are using PE for Linux:
/usr/share/locale/C
/usr/share/locale/En_US
/usr/share/locale/en_US
/usr/share/locale/en_US.UTF-8
If your site is using its own translations of the message catalogs, consult your
system administrator for the appropriate value of NLSPATH or LANG.
PE for AIX users can refer to AIX: General Programming Concepts: Writing and
Debugging Programs for more information on NLS and message catalogs.
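To illustrate how the %L and %N substitution fields in an NLSPATH entry
resolve, the following sketch performs the same substitution by hand. This is
an illustration only, not the actual catalog-lookup code, and the catalog name
pepoe.cat is a made-up example rather than a real PE catalog file.

```shell
# Expand an NLSPATH template the way the message-catalog lookup does:
# %L is replaced by the locale name, %N by the catalog name.
resolve_nlspath() {
    template=$1
    locale=$2
    catalog=$3
    printf '%s\n' "$template" | sed -e "s|%L|$locale|g" -e "s|%N|$catalog|g"
}

# With LANG=C on AIX, /usr/lib/nls/msg/%L/%N resolves to:
resolve_nlspath '/usr/lib/nls/msg/%L/%N' 'C' 'pepoe.cat'
# prints: /usr/lib/nls/msg/C/pepoe.cat
```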
Functional restrictions for PE 5.1
Functional restrictions for PE for AIX 5.1:
v Because PE Version 5 Release 1 exploits the barrier synchronization register
(BSR), any user attempting a read-modify-write operation on MPI library
allocated storage could inadvertently affect the memory that is mapped to the
BSR. Any such access will lead to unpredictable results.
v PE Version 5.1 requires LAPI Version 2.4.6 for AIX 5.3, and LAPI 3.1.2 for AIX
6.1. Earlier versions of LAPI are not supported.
Functional restrictions for PE for Linux 5.1:
Although many of the following functions are currently available with Parallel
Environment for AIX, they are not supported by Parallel Environment for Linux
5.1:
v Checkpoint and restart
v Lightweight core files
v Use of large memory pages
v User Space jobs on IBM System x™ hardware
v User Space jobs with Red Hat Enterprise Linux, when running on IBM Power
Systems servers
v The High Performance Computing Toolkit (HPC Toolkit) on IBM System x
hardware
Summary of changes
Changes for PE 5.1
This release of IBM Parallel Environment contains a number of functional
enhancements.
The PE for AIX 5.1 enhancements are:
v For improved performance of on-node barrier synchronization, support for the
IBM Power (POWER6) server barrier synchronization register (BSR) has been
added. Note that you must be running 64-bit programs over the AIX 6.1
operating system, on IBM Power (POWER6) servers to utilize the BSR support.
Note also that the MPI library will not use the BSR if checkpointing is enabled
with the AIX environment variable CHECKPOINT. For more information, see
IBM Parallel Environment: MPI Programming Guide.
v The default value of the MP_PRIORITY_LOG environment variable has
changed from yes to no, so that the log file is produced only when it is needed.
v Beginning with PE 5.1, support for the PE Benchmarker has been removed. This
includes the Performance Collection Tool (PCT), the Performance Visualization
Tool (PVT), and the Unified Trace Environment (UTE) utilities uteconvert,
utemerge, utestats, traceTOslog2.so, and slogmerge.
v To replace the performance analysis function of the PE Benchmarker, PE
introduces the IBM High Performance Computing (HPC) Toolkit. The IBM HPC
Toolkit is an integrated software environment that addresses the performance
analysis, tuning, and debugging of sequential and parallel scientific applications.
It consists of a collection of tools that optimize the application by monitoring its
performance on the processor, memory, and network. The IBM HPC Toolkit is
appropriate for users with varying degrees of parallel programming experience.
For more information, see IBM Parallel Environment: Operation and Use.
v Beginning with PE 5.1, the pdbx debugger function has been removed. Instead,
AIX users can now use the PDB debugger, previously available only with PE for
Linux.
v With Version 5.1, PE introduces additional type checking for Fortran 90 codes.
PE now includes a Fortran 90 module that provides type checking for MPI
programs at compile time. This allows programmers to find and resolve errors at
a much earlier stage.
v PE 5.1 enhances performance by providing a separate buffer for collective
communication early arrival messages. Similar to MP_BUFFER_MEM for
point-to-point communications, a new environment variable,
MP_CC_BUF_MEM, allows users to control the amount of memory PE MPI
allows for the buffering of early arrival message data for collective
communications.
Note: In PE 5.1, the early arrival buffer that is controlled by MP_CC_BUF_MEM
is used by MPI_Bcast only. Early arrival messages in other collective
communication operations continue to use the early arrival buffer for
point-to-point communication that is controlled by MP_BUFFER_MEM.
v PE 5.1 is compliant with the revisions listed in the Annex B Change-Log of the
MPI 2.1 standard.
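The new and changed environment variables described above are set like any
other POE environment variables, before the parallel job is launched. The
following fragment is purely illustrative: the buffer sizes shown are not
recommendations, and my_parallel_app is a placeholder program name. Consult
IBM Parallel Environment: Operation and Use for valid values and defaults.

```shell
# Illustrative settings only -- values are not recommendations.
export MP_CC_BUF_MEM=16M      # early arrival buffer for collective communication
export MP_BUFFER_MEM=64M      # early arrival buffer for point-to-point traffic
export MP_PRIORITY_LOG=yes    # re-enable the priority log (the default is now "no")

poe ./my_parallel_app -procs 4   # placeholder invocation of a parallel job
```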
The PE for Linux 5.1 enhancements are:
v The Parallel Operating Environment (POE) priority adjustment coscheduler,
previously available only for AIX users, is now supported by PE for Linux.
v With Version 5.1, PE introduces additional type checking for Fortran 90 codes.
PE now includes a Fortran 90 module that provides type checking for MPI
programs at compile time. This allows programmers to find and resolve errors at
a much earlier stage.
v PE 5.1 introduces the IBM High Performance Computing (HPC) Toolkit. The
IBM HPC Toolkit is an integrated software environment that addresses the
performance analysis, tuning, and debugging of sequential and parallel scientific
applications. It consists of a collection of tools that optimize the application by
monitoring its performance on the processor, memory, and network. The IBM
HPC Toolkit is appropriate for users with varying degrees of parallel
programming experience. For more information, see IBM Parallel Environment:
Operation and Use.
v Beginning with PE 5.1, PDB is now available with both PE for Linux and PE for
AIX.
v PE 5.1 enhances performance by providing a separate buffer for collective
communication early arrival messages. Similar to MP_BUFFER_MEM for
point-to-point communications, a new environment variable,
MP_CC_BUF_MEM, allows users to control the amount of memory PE MPI
allows for the buffering of early arrival message data for collective
communications.
Note: In PE 5.1, the early arrival buffer that is controlled by MP_CC_BUF_MEM
is used by MPI_Bcast only. Early arrival messages in other collective
communication operations continue to use the early arrival buffer for
point-to-point communication that is controlled by MP_BUFFER_MEM.
v PE 5.1 is compliant with the revisions listed in the Annex B Change-Log of the
MPI 2.1 standard.
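As a sketch of how the new buffer control might be set before invoking poe (the sizes shown are illustrative, not recommendations; see IBM Parallel Environment: Operation and Use for valid values and defaults):

```shell
# Illustrative settings only.
export MP_CC_BUF_MEM=16M   # early arrival buffer for collective communication (MPI_Bcast in PE 5.1)
export MP_BUFFER_MEM=64M   # early arrival buffer for point-to-point communication
echo "MP_CC_BUF_MEM=$MP_CC_BUF_MEM MP_BUFFER_MEM=$MP_BUFFER_MEM"
```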
Chapter 1. Introducing PE 5.1
The Parallel Environment for AIX licensed program product is a set of software
components that help you develop, debug, analyze, and run parallel Fortran, C, or
C++ programs on a cluster of networked servers. Based on the Parallel
Environment for AIX, the Parallel Environment for Linux also provides support for
parallel application development and execution on servers running Linux.
Before installing PE, you should be familiar with its software components. See IBM
Parallel Environment: Operation and Use.
For the latest information, always review the PE product readme file included with
the PE RPMs.
PE components
PE is made up of various components, including a message passing API and the
parallel operating environment (POE).
The PE components are:
Message passing and collective communication API subroutine libraries
These libraries, which contain subroutines that help application developers
parallelize their code, are described in IBM Parallel Environment: MPI
Programming Guide. For additional information about MPI, see IBM Parallel
Environment: MPI Subroutine Reference.
Parallel operating environment (POE)
This software helps ease your transition from serial to parallel processing
by hiding many of the differences and allowing you to continue using
standard AIX or Linux tools and techniques. When you start a parallel job,
the POE partition manager contacts the remote nodes, begins running your
code, and oversees the job’s operation.
For more information, refer to IBM Parallel Environment: Operation and Use.
PDB A command line debugger for parallel programs that works together with
Distributed Interactive Shell (DISH), a tool for launching and managing
distributed processes interactively, as well as GDB, the GNU project
debugger (for Linux) and dbx, a UNIX-based debugger (for AIX).
PE documentation
This component is made up of man pages for all of the MPI subroutines,
and PE commands and functions. For AIX users, these are included in the
ppe.man fileset. For Linux users, they are included as part of the PE RPM.
You can view, search, and print the most recent Parallel Environment
documentation in PDF and HTML format from the IBM Cluster
information center and the IBM Publications Center on the Web.
PE can also be used with LookAt, which is an online facility that lets you
look up explanations for most of the IBM messages you encounter, as well
as for some system abends and codes.
© Copyright IBM Corp. 1993, 2008 1
Chapter 2. Planning to install the PE software
When planning to install the Parallel Environment software, you need to ensure
that you have met all of the necessary system requirements. You also need to think
about what your programming environment will be and the strategy for using that
environment.
PE for AIX installation requirements
There are various system requirements for installing and running the PE software,
including requirements for hardware, software, disk space, and filesets.
PE for AIX hardware requirements
PE for AIX 5.1.0 is supported on the following hardware:
v IBM Power Systems servers
v IBM BladeCenter® Power Architecture® servers
Total fixed disk storage requirements for the machine are based on the licensed
programs and user applications you install. See “Disk space requirements for AIX
installation” on page 6 for more information.
PE for AIX software requirements
The software required for PE includes a variety of PE components plus, in some
cases, additional software. You need to decide which PE components to install on
your system based on the PE features you plan to use. You may also need to install
some additional products or components, based on how you plan to use PE.
AIX operating system requirements
One of the following AIX operating systems environments is required for PE
installation.
v AIX Version 5.3 (program number 5765-G03) with Recommended
Maintenance Package 5300-09
v AIX Version 6.1 (or later), either standalone or connected via an Ethernet LAN
supporting IP for IBM Power Systems servers.
The following AIX filesets are required:
v bos.adt.base
v bos.adt.syscalls
v bos.rte.libc
v bos.cpr
PE fileset requirements for AIX installation
Table 4 on page 4 lists the PE 5.1.0 filesets. Decide which of these filesets to install
on the various nodes in your system, based on the PE component options you plan
to use.
v For more information about nodes, see “Node resources” on page 11.
v For information about installing the following product options individually, see
“PE for AIX installation procedure summary” on page 20.
Table 4. PE Fileset requirements

If you plan to: Develop and run parallel applications from a node
Product option required: Parallel Operating Environment
Fileset name: ppe.poe
Things to consider: MPI is part of POE. When POE is installed, it adds entries
to the /etc/services and /etc/inetd.conf files. When POE is run, a copy of the
partition manager daemon is run on each remote node and is identified by these
files. If you are using NIS or another master server for /etc/services, you
need to update the individual files with the same information.

If you plan to: Use an interactive command line debugger
Product option required: PDB
Fileset name: ppe.shell
Things to consider: Because PDB on AIX uses dbx for interactive debugging, the
bos.adt.debug AIX fileset is required.

If you plan to: Access the online documentation in man page format
Product option required: PE man pages
Fileset name: ppe.man
Things to consider: None

If you plan to: Accept the eLicense agreement during installation of PE 5.1 filesets
Product option required: PE license
Fileset name: ppe.loc.license
Things to consider: While not required to be installed, ppe.loc.license must be
installed in the same location as the other PE install images in order to
accept the license agreement.
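The /etc/services update that POE performs can be checked, and mirrored into an NIS master, with ordinary tools. The sketch below works on a temporary copy so it is safe to experiment with; the service name pmv5 is an assumption based on the /etc/xinetd.d/pmv5 file name used on Linux, and the port number is purely illustrative:

```shell
# Work on a temporary copy, not the live /etc/services.
services=$(mktemp)
printf 'ftp 21/tcp\n' > "$services"

# Hypothetical partition manager entry; the service name and port are assumptions.
entry='pmv5 6125/tcp'
grep -q '^pmv5 ' "$services" || printf '%s\n' "$entry" >> "$services"

added=$(grep '^pmv5 ' "$services")
echo "$added"
rm -f "$services"
```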
Additional software requirements for AIX installation
Table 5 lists the additional software products or filesets that are required by PE
5.1.0. You need to decide which of these software products or filesets to install on
your system, based on how you plan to use PE.
Table 5. Additional software requirements

If you plan to: Run a parallel program on an IBM Power Systems processor-based server cluster
Software required: The rsct.lapi.rte fileset. For AIX 5.3, rsct.lapi.rte 2.4.6 is
required. For AIX 6.1, rsct.lapi.rte 3.1.2 is required. PE 5.1 does not support
earlier versions of LAPI.
Things to consider: Contains the communication protocol libraries for LAPI and
MPI. rsct.lapi.rte 3.1.2 (for AIX 6.1) is included on the PE product CD. If you
are using AIX 5.3, you must obtain rsct.lapi.rte 2.4.6 from the AIX 5.3
operating system packages. For information on installing rsct.lapi.rte, see
RSCT: LAPI Programming Guide.

If you plan to: Compile parallel executables
Software required: IBM C for AIX Version 6.0 (program number 5765-F57)
or
VisualAge® C++ Professional for AIX, Version 6.0 (program number 5765-F56).
This compiler should only be used with IBM POWER5™ processor-based servers. It
should not be used to compile programs that will be run on IBM Power
(POWER6™) servers.
or
IBM XL C/C++ Enterprise Edition for AIX compiler (formerly known as the
VisualAge C/C++ for AIX compiler), Version 8.0.0.12 (5724-M12), or later. This
compiler should only be used with IBM Power (POWER6) servers. It should not be
used to compile programs that will be run on IBM POWER5 processor-based
servers.
or
IBM XL C/C++ Enterprise Edition Version 7.0 for AIX or later (program number
5724-I11).
or
IBM XL Fortran for AIX Version 9.1 or later (program number 5724-I08). Note
that in order to use the Fortran 90 type checking module provided by PE, you
must compile your application with the XLF 12.1 (or later) compiler.
Things to consider: IBM C for AIX Version 6 is now part of VisualAge C++
Professional for AIX, Version 6.0, and is also available as a separate fileset.
VisualAge C++ Professional for AIX, Version 6.0 and IBM XL Fortran for AIX
Version 9.1 support the latest IBM POWER5 architecture. IBM XL C/C++
Enterprise Edition for AIX compiler (formerly known as the IBM VisualAge
C/C++ for AIX compiler), Version 8.0.0.12 (5724-M12) supports the latest IBM
Power (POWER6) architecture.

If you plan to: Submit a POE job from outside a LoadLeveler cluster
Software required: loadl.so on the node outside the LoadLeveler cluster
Things to consider: See “When to install the loadl.so (LoadLeveler) fileset” on
page 19 for detailed information.

If you plan to: Submit an OpenMP job and request task affinity
Software required: XL Fortran Version 11 PTF1 (or later) or XL C/C++ Version
9.0 (or later)
Things to consider: Supports the procs suboption of the OpenMP XLSMPOPTS
environment variable.

If you plan to: Use the pdb debugger
Software required: The bos.adt.debug fileset
Things to consider: None

If you plan to: Use the TWS LoadLeveler to submit interactive POE User Space jobs or allow execution of batch jobs
Software required: LoadLeveler Version 3.5, 5765-D61 (or later)
Things to consider: When TWS LoadLeveler is installed, PE 5.1.0 requires
LoadL.full 3.5 to run with the latest features. See “Coexistence” on page 40
and “Migration support” on page 40 for more information.

If you plan to: Use the IBM Power (POWER6) barrier synchronization register (BSR) with 64-bit MPI programs
Software required: The rsct.lapi.bsr fileset.
Things to consider: The BSR is only available with the IBM Power (POWER6)
servers, running the AIX 6.1 operating system. For information on installing
rsct.lapi.bsr, see RSCT: LAPI Programming Guide. For information on enabling
the PE MPI library to use the BSR, see “Enabling the barrier synchronization
register (BSR)” on page 28.

If you plan to: Analyze or tune program performance
Software required: The IBM High Performance Computing Toolkit. The required
filesets are ppe.hpct and ppe.hpct.rte, which are included on the PE
installation media.
Things to consider: The IBM High Performance Computing Toolkit is a separately
installable component of the PE product. The installation and user guide for
the IBM High Performance Computing Toolkit can be found by following the link
to the documentation section on the HPC Central Wiki home page
(http://www.ibm.com/developerworks/wikis/display/hpccentral/HPC+Central).
Disk space requirements for AIX installation
Table 6 lists the amount of disk space you need in the appropriate directories for
each of the separately-installable PE product options.
If you plan to install the PE software on an IBM Power Systems or network cluster,
each machine in the cluster must meet the disk space requirements shown in
Table 6.
Table 6. Disk space requirements for installation

PE fileset   Number of 512-byte blocks required in directory:
             /usr      /tmp            /etc
ppe.man      4500      not applicable  not applicable
ppe.poe      35000     500             30
ppe.shell    22000     not applicable  not applicable
Note: Temp space required for installation is 128MB in /tmp.
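Because Table 6 counts space in 512-byte blocks, it can help to convert the figures to megabytes when checking a file system. For example, ppe.poe's 35000 blocks in /usr:

```shell
blocks=35000                           # ppe.poe requirement in /usr (Table 6)
mb=$(( blocks * 512 / 1024 / 1024 ))   # 512-byte blocks -> whole megabytes
echo "ppe.poe needs about ${mb} MB in /usr"
```

This prints a requirement of about 17 MB, which you can compare against the free space reported by df for /usr.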
PE for Linux installation requirements
There are various system requirements for installing and running the PE software,
including requirements for hardware, software, disk space, and RPMs.
PE for Linux hardware requirements
PE for Linux 5.1.0 is supported on the following hardware:
v IBM Power Systems servers
v IBM System x servers
v IBM BladeCenter (Power Architecture and X-Architecture®) servers
PE for Linux software requirements
The software required for PE includes a variety of PE components plus, in some
cases, additional software. You need to decide which PE components to install on
your system based on the PE features you plan to use. You may also need to install
some additional products or components, based on how you plan to use PE.
Supported Linux distributions
PE 5.1 requires one of the following Linux distributions:
v SUSE LINUX Enterprise Server (SLES) 10
v Red Hat Enterprise Linux 5, Update 2 (or later).
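The RPM file names in Table 7 below encode the distribution as sles1000 or rh500. A small helper like the following sketch can map the release string reported by the system (for example, the contents of /etc/SuSE-release or /etc/redhat-release) to the right suffix; the function name is hypothetical:

```shell
# Map a distribution release string to the RPM name suffix used in Table 7.
suffix_for() {
  case "$1" in
    *'SUSE Linux Enterprise Server 10'*) echo sles1000 ;;
    *'Red Hat Enterprise Linux'*' 5'*)   echo rh500 ;;
    *)                                   echo unknown ;;
  esac
}

suffix_for 'SUSE Linux Enterprise Server 10 (ppc)'         # -> sles1000
suffix_for 'Red Hat Enterprise Linux Server release 5.2'   # -> rh500
```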
PE RPMs required for Linux installation
Table 7 lists the RPMs required for installation, based on your platform.
Note: In Table 7, the RPMs shown for the IBM Power Systems platform also
apply to the IBM BladeCenter platform.
Table 7. RPMs required for installation
Platform RPM Type
IBM Power Systems servers IBM_pe_license-5.1.1.0-BuildLevel.ppc.rpm License
IBM System x 64-bit IBM_pe_license-5.1.1.0-BuildLevel.x86.rpm License
IBM Power Systems servers with
SLES 10
ppe_ppc_base_32bit_sles1000-5.1.1.0-BuildLevel.ppc.rpm PE
ppe_ppc_64bit_sles1000-5.1.1.0-BuildLevel.ppc64.rpm
ppe_pdb_ppc-2.0.0.0-BuildLevel.rpm
ppe_hpct_sles1000-5.1.0.0.rpm
ppe_hpct_runtime_sles1000-5.1.0.0.rpm
lapi_ppc_32bit_base_IP_sles1000-3.1.2.0-BuildLevel.ppc.rpm LAPI
lapi_ppc_64bit_IP_sles1000-3.1.2.0-BuildLevel.ppc64.rpm
lapi_ppc_32bit_US_sles1000-3.1.2.0-BuildLevel.ppc.rpm
lapi_ppc_64bit_US_sles1000-3.1.2.0-BuildLevel.ppc64.rpm
IBM Power Systems servers with
RH 5
ppe_ppc_base_32bit_rh500-5.1.1.0-BuildLevel.ppc.rpm PE
ppe_ppc_64bit_rh500-5.1.1.0-BuildLevel.ppc64.rpm
ppe_pdb_ppc_rh500-2.0.0.0-BuildLevel.rpm
ppe_hpct_rh500-5.1.0.0.rpm
ppe_hpct_runtime_rh500-5.1.0.0.rpm
lapi_ppc_32bit_base_IP_rh500-3.1.2.0-BuildLevel.ppc.rpm LAPI
lapi_ppc_64bit_IP_rh500-3.1.2.0-BuildLevel.ppc64.rpm
lapi_ppc_32bit_US_rh500-3.1.2.0-BuildLevel.rpm
lapi_ppc_64bit_US_rh500-3.1.2.0-BuildLevel.rpm
IBM System x 64-bit with SLES 10 ppe_x86_base_32bit_sles1000-5.1.1.0-BuildLevel.x86.rpm PE
ppe_x86_64bit_sles1000-5.1.1.0-BuildLevel.x86_64.rpm
ppe_pdb_x86-2.0.0.0-BuildLevel.rpm
lapi_x86_32bit_base_IP_sles1000-3.1.2.0-BuildLevel.x86.rpm LAPI
lapi_x86_64bit_IP_sles1000-3.1.2.0-BuildLevel.x86_64.rpm
IBM System x 64-bit with RH 5 ppe_x86_base_32bit_rh500-5.1.1.0-BuildLevel.x86.rpm PE
ppe_x86_64bit_rh500-5.1.1.0-BuildLevel.x86_64.rpm
ppe_pdb_x86_rh500-2.0.0.0-BuildLevel.rpm
lapi_x86_32bit_base_IP_rh500-3.1.2.0-BuildLevel.x86.rpm LAPI
lapi_x86_64bit_IP_rh500-3.1.2.0-BuildLevel.x86_64.rpm
Some RPMs also include the word base in their names to indicate that they are the
major RPMs for that component. You must install a base RPM before other RPMs
of a component. However, you must install the base RPM after the license RPM,
when one exists.
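The ordering rules above (license RPM first, then the base RPM, then the rest) can be captured in a small script. The sketch below prints the rpm commands it would run rather than executing them; file names are taken from Table 7, with BuildLevel left as a placeholder:

```shell
# Dry run of the install order for a Power Systems node running SLES 10:
# license RPM first, then the base RPM, then the remaining component RPMs.
order='IBM_pe_license-5.1.1.0-BuildLevel.ppc.rpm
ppe_ppc_base_32bit_sles1000-5.1.1.0-BuildLevel.ppc.rpm
ppe_ppc_64bit_sles1000-5.1.1.0-BuildLevel.ppc64.rpm'

for rpm in $order; do
  echo "rpm -ivh $rpm"   # printed only; remove the echo to install for real
done
```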
Additional software requirements for Linux installation
PE 5.1 may also require you to install some of the software products listed below,
depending on how you plan to use PE. You need to examine the features and
functions offered by each of the products listed below, to determine if they are
required in order for you to use PE as you intended.
The additional software required for PE 5.1 includes:
v xinetd (eXtended InterNET Daemon) must be installed on the system, and when
a node is rebooted xinetd must be restarted.
v A working C compiler. Parallel Environment supports parallel program
development using the following compilers. One working C compiler is required
on each node. Without a working C compiler, Parallel Environment may not be
installed properly. See the appropriate documentation for installing and
configuring a C compiler.
– IBM compilers for Linux (available only for the supported IBM servers).
- IBM C compiler, V7.0.1-0 (vac.cmp-7.0.1-0.rpm)
- IBM C++ compiler, V7.0.1-0 (vacpp.cmp-7.0.1-0.rpm)
- IBM Fortran compiler, V9.1.1-0 (xlf.cmp-9.1.1-0.rpm), or later
– GNU compilers for Linux (available for all servers that support PE for Linux).
- C compiler, V3.4.4-2 (gcc-3.4.4-2.rpm)
- C++ compiler, V3.4.4-2 (gcc-c++-3.4.4-2.rpm)
- Fortran compiler, V3.4.4-2 (gcc-g77-3.4.4-2.rpm). Note that this compiler
only supports Fortran 77 compilation.
Note: Installing the IBM C/C++ or Fortran compilers by themselves will not
make them usable. You must also perform the configuration step, which is
described in the compiler’s readme file.
v rsct.lapi.rte 2.4.2 (or later) - contains the communication protocol libraries for
LAPI and MPI to run a parallel program. This fileset is now included on the PE
5.1 product CD.
v Tivoli® Workload Scheduler LoadLeveler® Version 3.5.0.0-0, 5765-E69 (or later)
- to submit interactive or batch applications, as well as manage network
resources.
The following is a list of the Tivoli Workload Scheduler LoadLeveler full product
RPMs. You must install one of these RPMs, based on the operating system and
hardware platform you are using:
– LoadL-full-RH5-X86-3.5.0.0-0.i386.rpm
– LoadL-full-RH5-X86_64-3.5.0.0-0.x86_64.rpm
– LoadL-full-RHEL5-PPC64-3.5.0.0-0.ppc64.rpm
– LoadL-full-SLES10-X86-3.5.0.0-0.i386.rpm
– LoadL-full-SLES10-X86_64-3.5.0.0-0.x86_64.rpm
– LoadL-full-SLES10-PPC64-3.5.0.0-0.ppc64.rpm
Note that if you run your applications over User Space, Tivoli Workload
Scheduler LoadLeveler is required.
In addition to the regular Tivoli Workload Scheduler LoadLeveler RPM, one of
the following 32-bit LoadLeveler RPMs is also needed on the IBM Power
Systems, IBM BladeCenter, and System x 64-bit platforms:
– For IBM Power Systems and IBM BladeCenter hardware with RH5:
LoadL-full-lib-RHEL5-PPC-3.5.0.0-0.ppc.rpm
– For IBM Power Systems and IBM BladeCenter hardware with SLES10:
LoadL-full-lib-SLES10-PPC-3.5.0.0-0.i386.rpm
– For System x 64-bit hardware with RH5: LoadL-full-lib-RH5-X86-3.5.0.0-0.i386.rpm
– For System x 64-bit hardware with SLES10: LoadL-full-lib-SLES10-X86-3.5.0.0-0.i386.rpm
See the Tivoli Workload Scheduler LoadLeveler documentation for more
information.
v Compilers (as listed in “PE for Linux software requirements” on page 6)
v Parallel Debugging Tool (PDB) -
PDB is required if you plan to debug parallel programs. You can install PDB
after you have installed the required PE RPMs. PDB is packaged in its own
RPM, which is included on the product CD. Because PDB is
platform-independent, you can install it on any of the supported Linux
distributions.
For information on installing PDB, see “Installing the Parallel Debugging Tool
(PDB) RPM manually” on page 30
v IBM High Performance Computing Toolkit (HPC Toolkit)- The IBM HPC
Toolkit is required if you plan to analyze and tune program performance. The
HPC Toolkit is a separately installable component of the PE product. The
installation images, named ppe_hpct_*.rpm, are included on the PE installation
media. The installation and user guide for the IBM HPC Toolkit can be found by
following the link to the documentation section on the HPC Central Wiki home
page (http://www.ibm.com/developerworks/wikis/display/hpccentral/HPC+Central).
Installation images for Linux installation
The product CD includes all of the RPMs that are required to install PE on either
the Red Hat 5 or SLES 10 version of Linux. In addition, there is a license RPM, a
readme file, an installation utility script, and a version file, which is read by the
install script. All of the PE and LAPI RPMs are included on the product CD. For a
complete list of the PE and LAPI RPM file names, refer to Table 7 on page 7.
The general CD installation procedure involves mounting the CD onto a file
system. If you are installing PE on a cluster, you can either copy the content of the
CD into a shared file system or export the CD directly. You must install the PE
license RPM and accept the license agreement before installing the other PE RPMs.
The installation cannot complete, and your applications cannot run, until
the license has been accepted.
For maintenance releases between GAs, PE is also available in compressed (TAR)
file format for download from the Web. Like the CDs, there is a TAR file for the
IBM System x platform and another TAR file for the IBM Power Systems, and IBM
BladeCenter platforms. The content of the TAR files is identical to the CDs. After
you uncompress the TAR file and extract the information from it, the installation
procedure is exactly the same as that described for the product CD. You may
download images from the IBM Parallel Environment Web site
(http://www14.software.ibm.com/webapp/set2/sas/f/penv/home.html).
Disk space requirements for Linux installation
Table 8 lists the amount of disk space you need in the appropriate directories for
each of the separately-installable PE product options.
If you plan to install the PE software on an IBM cluster, each machine in the
cluster must meet the disk space requirements shown in Table 8.
Table 8. RPM and disk space requirements for installation

All platforms                        Preinstall MB   Post-install MB
License RPM                          48 MB           Not applicable
PE and LAPI RPMs                     5 MB            15 MB in /opt, 250 KB in /etc
Parallel Debugging Tool (PDB) RPMs   1.6 MB          3.5 MB
Note: Temp space required for installation is 128MB in /tmp.
PE Limitations
Some PE product options and related software are subject to certain limitations, as
explained in this section.
MPI-IO parallel file I/O
MPI-IO in PE MPI is targeted to the IBM General Parallel File System™
(GPFS™) for production use. File access through MPI-IO normally requires
that a single GPFS file system image be available across all tasks of an MPI
job. PE MPI with MPI-IO can be used for program development on any
other file system that supports a POSIX interface (for AIX, AFS®, DFS™,
JFS, or NFS, and for Linux, a local file system or NFS) as long as all tasks
run on a single node or workstation. This is not expected to be a useful
model for production use of MPI-IO. PE MPI can be used without all
nodes on a single file system image by using the MP_IONODEFILE
environment variable. See IBM Parallel Environment: Operation and Use for
information about MP_IONODEFILE.
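As a sketch (the file name shown is hypothetical), MP_IONODEFILE is set like any other POE environment variable before the job is started:

```shell
# Hypothetical file listing the nodes that mount the shared file system.
export MP_IONODEFILE=io_nodes.list
echo "$MP_IONODEFILE"
```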
Parallel applications and system calls
User-written parallel applications are limited in their use of system calls.
See IBM Parallel Environment: MPI Programming Guide for a discussion of
these limitations.
Information for the system administrator
For system administrators, it is important to understand software compatibility for
PE and how to plan out your node resources. You will also need to determine
which nodes in your cluster will require which filesets.
For additional information about POE system administration tasks, refer to
Chapter 7, “Additional information for the system administrator,” on page 63.
Software compatibility within workstation clusters
For all processors within a workstation cluster, the same release level (including
maintenance levels) of PE software is required. (This ensures that an individual PE
application can run on any workstation in the cluster.)
About upgrading AIX without upgrading compilers
Many of the compilers link to different libraries based on the AIX OSLEVEL
value at the time they are installed. If you migrate only AIX, you will be left
using back-level libraries. Be sure to change the compiler library links or
reinstall the compilers.
LAPI and MPI library compatibility in PE 5.1
With PE for Linux, MPI and LAPI are released as one package. Mixing MPI and
LAPI libraries from different releases is not supported.
MPI and LAPI share a common transport layer, so LAPI must be installed
before MPI programs can be compiled and executed. You install the LAPI
fileset or RPMs, which are included on the PE
product CD, as part of the standard PE installation procedure.
All the nodes can participate in processing parallel jobs. In doing so, all nodes
must have compatible levels of the LAPI and MPI libraries installed, particularly
when nodes are upgraded with new versions/releases of the libraries and when
service is applied that affects the libraries. In all cases, the same version, release,
and service level of the LAPI and MPI libraries must be installed on all nodes that
are to participate in a parallel job.
PE 5.1 requires LAPI 2.4.6 (or later) for AIX 5.3 and LAPI 3.1.2 (or later) for AIX
6.1. Earlier versions of LAPI are not supported.
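On AIX, lslpp -l rsct.lapi.rte reports the installed fileset level. A minimal sketch of checking that level against the PE 5.1 minimum, using a sample value rather than a live query:

```shell
installed='3.1.2.0'   # sample value; on a real node take this from lslpp -l rsct.lapi.rte
required='3.1.2.0'    # minimum for AIX 6.1; use 2.4.6.0 for AIX 5.3

# The level is adequate if the required level sorts first (or equal) numerically.
lowest=$(printf '%s\n%s\n' "$installed" "$required" |
         sort -t. -k1,1n -k2,2n -k3,3n -k4,4n | head -1)
if [ "$lowest" = "$required" ]; then
  echo "LAPI level $installed is sufficient"
else
  echo "LAPI level $installed is too old; PE 5.1 needs $required or later"
fi
```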
For more information on installing PE, LAPI, and AIX on previously installed
systems, refer to “Migrating and upgrading PE for AIX” on page 39. For more
information on installing PE, LAPI, and Linux on previously installed systems,
refer to “Migrating and upgrading PE for Linux” on page 42.
Node resources
How you plan your node resources will vary according to whether you are
installing PE on an IBM Power Systems cluster or a Linux cluster, with or without
LoadLeveler.
On a cluster using LoadLeveler
The system administrator uses LoadLeveler to partition nodes into pools or
features or both, to which he or she assigns names or numbers and other
information. The workstation from which parallel jobs are started is called
the home node and it can be any workstation on the LAN.
On a cluster without LoadLeveler
On an IBM Power Systems cluster without LoadLeveler, you assign nodes
or servers to the following categories:
v Home node (workstation from which parallel jobs are started) for running
the Partition Manager in POE
v Nodes or servers for developing and compiling applications
v Nodes or servers for executing applications in the parallel environment
You must identify the nodes or servers running as execution nodes by
name in a host list file.
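A host list file is simply a text file naming one execution node per line. A minimal sketch (node names hypothetical):

```shell
# Create a minimal host list naming the execution nodes, one per line.
cat > host.list <<'EOF'
node1.example.com
node2.example.com
node3.example.com
node4.example.com
EOF

nodes=$(wc -l < host.list | tr -d ' ')
echo "host.list names $nodes execution nodes"
```

POE is then pointed at the file with the MP_HOSTFILE environment variable or the -hostfile command line flag.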
Deciding which nodes require which PE filesets or RPMs, or
additional software
An important aspect of planning your PE node resources is deciding which nodes
require which PE filesets or RPMs, or additional software. You do not need to
install all of the PE filesets or RPMs on every node.
If you are using PE for AIX, refer to “PE for AIX software requirements” on page 3
for more information on the filesets and their dependencies. This information will
help you decide how to install PE and additional required software on your nodes.
If you are using PE for Linux, refer to “PE for Linux software requirements” on
page 6 and “PE RPMs required for Linux installation” on page 7 for more
information on the RPMs and their dependencies. This information will help you
decide how to install PE and additional required software on your nodes.
Enabling xinetd for Linux installation
In order for the Partition Manager daemon to run and to provide the appropriate
remote user access and authorization, the Linux system needs to enable the xinetd
daemon and to configure it to restart automatically during each system reboot.
If xinetd is not running on your system, start it with the following command:
/etc/init.d/xinetd restart
Note that when a node is rebooted, the xinetd service must be restarted.
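The two commands involved can be summarized as follows; they must be run as root on each node, so the sketch below just prints them as a dry run:

```shell
# Start xinetd now, and re-enable it at every reboot (SLES 10 / RHEL 5 style).
cmds='/etc/init.d/xinetd restart
chkconfig xinetd on'
printf '%s\n' "$cmds"
```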
File systems
With PE for AIX, the PE filesets are installed in the /usr file system. When the
ppe.poe fileset is installed, it adds entries to the /etc/services and /etc/inetd.conf
files.
With PE for Linux, the PE filesets are installed in the /opt/ibmhpc directory. When
the base 32-bit PE RPM is installed, it adds entries to the /etc/services and
/etc/xinetd.d/pmv5 files.
When poe is executed, a copy of the Partition Manager daemon is run on each
remote node, and is identified in these files.
If you are using NIS or another master server for /etc/services, you need to create
updates with the same information that is put into the individual files.
If you do not use a shared file system, you need to copy the user’s executable files
to the other nodes. To copy these files, you can use the PE message-passing-file
copy command, mcp. If you are using PE for AIX, you can also use the CSM
commands, dsh and pcp. For more information about copying the file system and
about mcp, see Parallel Environment: Operation and Use. For more information about
dsh and pcp, see IBM Cluster Systems Management: Command and Technical Reference.
If you are using PE for AIX, you can also manage files as part of Cluster System
Management’s (CSM) Configuration File Manager. With CSM, the Configuration File
Manager provides a file repository for configuration files that are common across
all nodes in a cluster. For more information, see Cluster System Management:
Administration Guide.
User IDs on remote nodes
On each remote node, the system administrator must set up a user ID, other than a
root ID, for each user on each remote node who will be executing serial or parallel
applications or who requires POE access.
See IBM Parallel Environment: Operation and Use for an introduction of home and
remote nodes.
Each user must have an account on all nodes where a job runs. Both the user name
and user ID must be the same on all nodes. Also, the user must be a member of
the same named group on the home node and the remote nodes.
PE for AIX user authorization
There are several options for PE user authorization. You can use the POE security
method, which is based on the Cluster Security Services of IBM RSCT, Cluster
based security, or AIX based security (the default).
POE security method configuration
PE 5.1 uses an enhanced set of security methods, based on Cluster Security
Services in RSCT. POE has a security configuration option for the system
administrator to determine which set of security methods are to be used in the
system. There are two types of security methods supported:
v cluster based security (or CTSec)
v AIX based security (or Compatibility, which is the default)
When POE is installed, the /etc/poe.security file on each node will contain an entry
defining the type of security method to be used on that node. For more
information see the description of /etc/poe.security in Chapter 7, “Additional
information for the system administrator,” on page 63.
The use of the CTSec method will require the installation of the rsct.core.sec fileset,
along with its proper configuration. For more information, see “Cluster based
security configuration.”
The use of the POE security method applies only when POE is used without
LoadLeveler. When LoadLeveler is used (which includes all User Space jobs),
LoadLeveler determines and enforces the security method, and POE will not check
the security method.
Cluster based security configuration
When Cluster Based Security is the security method of choice, the system
administrator will have to ensure that UNIX® Host Based authentication is enabled
and properly configured on all nodes. This entails:
v /usr/sbin/rsct/cfg/unix.map file exists with proper entries
v Host based authentication (HBA) is installed and configured on the nodes
v Proper public/private key set up for all of the nodes
Refer to the RSCT Administration Guide for specific details. From a user’s point of
view, when Cluster Based Security is used, users will be required to have the
proper entries in the /etc/hosts.equiv or .rhosts files, in order to ensure proper
access to each node, as described in “AIX-based security (compatibility).”
AIX-based security (compatibility)
When AIX-based security (compatibility) is the security method of choice, (which is
also the default), POE relies on the use of AIX-based user authorization, as
described below.
If AIX user authorization, or compatibility, (the default) is used as a security
mechanism on the system, each node needs to be set up so that each user ID is
authorized to access that node or remote link from the initiating home node. Use
the /etc/hosts.equiv file and/or the .rhosts file to specify this user ID
authorization, as explained below.
If the combination of the home node machine and user name:
v is authorized in /etc/hosts.equiv on the remote node, the user is authorized to run
parallel tasks there.
v is disallowed in /etc/hosts.equiv on the remote node, the user is not able to run
parallel tasks there.
v does not appear in /etc/hosts.equiv, the combination is checked in the .rhosts file
in the user’s home directory on the remote node. If the user name and the home
node machine combination appears in .rhosts, the user is authorized to run
parallel tasks on the remote node.
For more information on .rhosts and /etc/hosts.equiv, see the chapter on managing
jobs in IBM AIX Files Reference.
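For example, to authorize a hypothetical user paruser whose home node is homenode.example.com (both names are illustrative), the remote node could carry either of the following entries:

```
# In /etc/hosts.equiv on the remote node (trusts users from the home node):
homenode.example.com

# Or in ~paruser/.rhosts on the remote node:
homenode.example.com paruser
```

An /etc/hosts.equiv entry takes effect system-wide, while a .rhosts entry applies only to the user whose home directory contains it.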
If you are using LoadLeveler to submit POE jobs, including all User Space
applications, LoadLeveler is responsible for the security authentication. The
security function in POE is not invoked when POE runs under LoadLeveler.
PE for Linux user authorization
Under Linux, PE supports a limited set of user authorization mechanisms. The
/etc/poe.security file defines the security mechanism which, on Linux, is limited to
the COMPAT entry. Each node needs to be set up so that each user ID is
authorized to access that node or remote link from the initiating home node.
Use the /etc/hosts.equiv file, the .rhosts file, or both, to specify this user ID
authorization, as explained below.
If the combination of the home node machine and user name:
v is authorized in /etc/hosts.equiv on the remote node, the user is authorized to run
parallel tasks there.
v is disallowed in /etc/hosts.equiv on the remote node, the user is not authorized to
run parallel tasks there.
v does not appear in /etc/hosts.equiv, the combination is checked in the .rhosts file
in the user’s home directory on the remote node. If the user name and the home
node machine combination appears in .rhosts, the user is authorized to run
parallel tasks on the remote node.
14 IBM PE for AIX and Linux V5 R1: Installation
Tuning your Linux system for more efficient parallel job performance
The Linux default network and network device settings may not produce optimum
throughput (bandwidth) and latency numbers for large parallel jobs. The
information provided in this topic describes how to tune the Linux network and
certain network devices for better parallel job performance.
This information is aimed at private networks with high-performance network
devices, such as Gigabit Ethernet, and may not produce similar results on 10/100
public Ethernet networks.
Table 9 provides examples of tuning your Linux system for better job
performance. By following these examples, you can improve the performance of a
parallel job running over an IP network.
Table 9. Network tuning

v arp_ignore - With arp_ignore set to 1, a device answers an ARP request only if
the target address matches its own.
Tuning for the current boot session:
echo '1' > /proc/sys/net/ipv4/conf/all/arp_ignore
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.ipv4.conf.all.arp_ignore = 1

v arp_filter - With arp_filter set to 1, the kernel answers an ARP request only if
it matches its own IP address.
Tuning for the current boot session:
echo '1' > /proc/sys/net/ipv4/conf/all/arp_filter
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.ipv4.conf.all.arp_filter = 1

v rmem_default - Defines the default receive window size.
Tuning for the current boot session:
echo '1048576' > /proc/sys/net/core/rmem_default
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.core.rmem_default = 1048576

v rmem_max - Defines the maximum receive window size.
Tuning for the current boot session:
echo '2097152' > /proc/sys/net/core/rmem_max
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.core.rmem_max = 2097152

v wmem_default - Defines the default send window size.
Tuning for the current boot session:
echo '1048576' > /proc/sys/net/core/wmem_default
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.core.wmem_default = 1048576

v wmem_max - Defines the maximum send window size.
Tuning for the current boot session:
echo '2097152' > /proc/sys/net/core/wmem_max
Modifying the system permanently - add this line to the /etc/sysctl.conf file:
net.core.wmem_max = 2097152

v Set device txqueuelen - Sets the transmit queue length for each network device
(for example, eth0, eth1, and so on).
Tuning for the current boot session:
/sbin/ifconfig device_interface_name txqueuelen 4096
Modifying the system permanently: Not applicable

v Turn off device interrupt coalescing - To improve latency.
Tuning for the current boot session: See the sample script, which must be run
after each reboot.
Modifying the system permanently: Not applicable
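Taken together, the permanent settings in Table 9 amount to the following /etc/sysctl.conf fragment (the window sizes are the example values from the table; tune them for your network):

```
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_filter = 1
net.core.rmem_default = 1048576
net.core.rmem_max = 2097152
net.core.wmem_default = 1048576
net.core.wmem_max = 2097152
```

Running /sbin/sysctl -p applies the file without a reboot.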
This sample script unloads the e1000 Gigabit Ethernet device driver and reloads it
with interrupt coalescing disabled:
#!/bin/ksh
# Unload the e1000 Gigabit Ethernet driver, then reload it with
# interrupt coalescing disabled and bring the interface back up.
Interface=eth0
Device=e1000
Kernel_Version=`uname -r`
ifdown ${Interface}
rmmod ${Device}
insmod /lib/modules/${Kernel_Version}/kernel/drivers/net/${Device}/${Device}.ko \
InterruptThrottleRate=0,0,0
ifup ${Interface}
exit $?
MPI jobs use shared memory to handle intranode communication. You may need
to raise the system's maximum shared memory segment size (shmmax) so that a
large MPI job can successfully enable shared memory usage. It is recommended
that you set this limit to 256 MB to support large MPI jobs.
To modify this limit for the current boot session, execute the following command
as root:
echo "268435456" > /proc/sys/kernel/shmmax
To modify this limit permanently, add the following line to the /etc/sysctl.conf file
and reboot the system:
kernel.shmmax = 268435456
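As a quick check, the following sketch (a minimal example, assuming a Linux system where /proc/sys/kernel/shmmax is readable) compares the current limit against the recommended 256 MB:

```shell
#!/bin/sh
# Compare the current shmmax against the recommended 256 MB (268435456 bytes).
recommended=268435456
current=$(cat /proc/sys/kernel/shmmax 2>/dev/null || echo 0)
# awk handles values larger than the shell's integer range.
ok=$(awk -v c="$current" -v r="$recommended" 'BEGIN { if (c + 0 >= r) print "yes"; else print "no" }')
echo "shmmax is $current; recommended minimum is $recommended (sufficient: $ok)"
```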
Running large POE jobs and IP buffer usage (PE for AIX only)
A POE application may require additional IP buffers (mbufs) under any of the
following circumstances:
v PE job uses more than 128 nodes.
v Large amounts of STDIO (stdin, stdout, or stderr) are generated.
v The home node is running many POE jobs simultaneously, or there is significant
additional IP traffic via mounted file system activity (or other sources), or both.
v Many large messages are passed via the UDP/IP implementation of the Message
Passing Library.
The need for additional IP buffers is usually evident when repeated requests for
memory are denied; the netstat -m command can tell you when such a condition
exists. In that case, it may be necessary to use the no command to change the
network option system parameters on the home node. You can also use the no
command to check the current values first.
The number of IP buffers allocated in the kernel is controlled by the thewall
parameter of the no command. Increasing the value of the thewall parameter
increases the number of IP buffers.
v You must have root authority to change options with the no command, and the
setting applies to all processes running on the node on which it is executed.
You can also set the values at system boot time by adding the appropriate call to
the no command in either /etc/rc.net or /etc/rc.tcpip.
For more information on mbufs, see IBM AIX Performance Management Guide.
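For example, a line such as the following, added to /etc/rc.net, would raise thewall at boot (the value, expressed in kilobytes, is illustrative only; running no -o thewall without a value displays the current setting):

```
/usr/sbin/no -o thewall=131072
```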
Chapter 3. Installing the Parallel Environment
You can install the Parallel Environment with either the AIX or Linux operating
system. The procedures for installing PE for AIX and PE for Linux are very
different.
If you are using PE for AIX, refer to “Installing the PE for AIX software.” If you
are using PE for Linux, refer to “Installing the PE for Linux software” on page 28.
Installing the PE for AIX software
To install PE, you first install the desired PE filesets on a single node. When that
installation is complete, you can then replicate the installation image throughout
the remaining nodes, using one of the suggested methods described in this
information.
PE is enabled for AIX electronic licensing capability. The ppe.loc.license fileset
must be present on the same install media or in the same directory as the PE
filesets to be installed in order for the license agreement to be processed during the
installation of that fileset. The installer must also specify the proper option to
confirm that the license has been accepted, in order for the fileset to be properly
installed.
About installing PE for AIX with CSM
To install the desired PE filesets on an IBM Power Systems cluster running
CSM, you install the software on each node individually, using SMIT or installp.
Note that you must first install the PE filesets on at least one node of your system.
CSM cannot be installed on the same node in the cluster. For more information on
CSM, refer to IBM Cluster Systems Management for AIX: Planning and Installation
Guide and IBM Cluster Systems Management for AIX: Administration Guide.
About installing PE for AIX on an IBM Power Systems cluster
Installation on an IBM Power Systems network cluster without CSM will not
provide system management functions. This leaves you with the following two
options:
v Use the PEinstall script.
v Install the software on each system individually using SMIT or installp.
In either case, first install the PE filesets on at least one system in your cluster.
When this is complete, you can replicate the installation image to your other
nodes.
During the course of installing PE filesets on a cluster, you may encounter sysck
warning messages. These messages may indicate that a particular file is also
owned by another fileset. If the file is also owned by one of the older PE filesets,
such as PE Version 2 ppe.poe, this may indicate that an older version is installed.
You can ignore these warning messages and the system will function properly.
However, if you later choose to remove the old fileset after installing PE Version 5,
you need to reinstall the new fileset.
© Copyright IBM Corp. 1993, 2008 17
Migration installation
If you migrate from PE Version 3 or Version 4 to PE Version 5, installing the new
filesets will completely replace some of the earlier release filesets, rendering them
obsolete. The replaced filesets will be marked OBSOLETE by installp in the Object
Data Manager (ODM) and in lslpp output.
However, some directories and installation files will remain. Because these earlier
filesets cannot coexist or run with PE Version 5, you should uninstall your old
filesets before installing the new PE filesets, rather than installing the new filesets
on top of the old. This conserves disk space and reduces the chance of confusion
over old fileset path names and executables.
CAUTION:
If you plan to uninstall the old filesets, do so before installing the new filesets. If
you attempt to uninstall the old filesets after installing PE Version 5, you may
accidentally delete some needed files.
Table 10 lists the old filesets that need to be removed before you install PE Version
5:
Table 10. Filesets to remove before installation
v PE Version 2: ppe.pedocs, ppe.vt, ppe.xpdbx
v PE Version 3: ppe.html, ppe.pdf
v PE Version 4: ppe.pct, ppe.pvt, ppe.dpcl
Determining which earlier filesets are installed
You can use the lslpp command to check if any of the filesets are installed. For
example, lslpp -l poe will tell you if the Version 1 POE fileset is installed.
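To check all of the obsolete filesets from Table 10 in one pass, a small loop can generate the lslpp queries. This sketch only prints each command (echo stands in for running it); on an AIX node, run the lslpp commands directly:

```shell
#!/bin/sh
# Print an lslpp query for each obsolete fileset listed in Table 10.
for fileset in ppe.pedocs ppe.vt ppe.xpdbx ppe.html ppe.pdf ppe.pct ppe.pvt ppe.dpcl
do
    echo "lslpp -l $fileset"
done
```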
Removing earlier filesets
To remove filesets you can use any of the following methods:
v SMIT
Use the Maintain Installed Software dialog found under the Software Installation
and Maintenance dialog.
v installp command; for example:
installp -u poe
v PEdeinstall script
See “Removing a software component” on page 47.
When to install the rsct.lapi.rte fileset
If you are using an IBM Power Systems cluster and plan to run parallel MPI or
LAPI applications, you must install rsct.lapi.rte before or after installing PE
Version 5, in order for parallel applications to run.
rsct.lapi.rte 3.1.2 (for AIX 6.1 users) is included on the PE product CD. If you are
using AIX 5.3, you must obtain rsct.lapi.rte 2.4.6 from the AIX 5.3 operating system
packages.
For information on installing rsct.lapi.rte, see RSCT: LAPI Programming Guide.
When to install the rsct.lapi.bsr fileset
If you are running 64-bit MPI applications on the AIX 6.1 operating system on an
IBM Power (POWER6) server cluster, and you want to use the barrier
synchronization register (BSR), you must install rsct.lapi.bsr before or after
installing PE Version 5.
The rsct.lapi.bsr 3.1.2 file set (for AIX 6.1 users) is included on the PE product CD.
PE does not support the barrier synchronization register with AIX 5.3.
For information on installing rsct.lapi.bsr, see RSCT: LAPI Programming Guide.
When to install the rsct.core.sec fileset
If you plan to use the cluster based security methods based on Cluster Security
Services in RSCT, you must also install the rsct.core.sec fileset, and perform the
appropriate configuration steps.
See “POE security method configuration” on page 13 for more information.
When to install the loadl.so (LoadLeveler) fileset
Install this fileset to submit a POE job which uses LoadLeveler from a node
outside of the LoadLeveler cluster. To install, do the following:
1. Contact the system administrator of your LoadLeveler cluster to determine the
path name to the exported directory containing the loadl.so image.
2. NFS-mount that directory on the submitting node.
3. Install the loadl.so fileset using the following command:
installp -aFXd device loadl.so
4. Obtain the LoadLeveler configuration file as described in Tivoli Workload
Scheduler LoadLeveler: Using and Administering.
View the readme file before installation
Before you actually install any fileset, you may want to look at its readme file. The
readme file may contain some special or additional information about installing the
fileset. The PE filesets are all shipped with a copy of the readme as part of the first
file on the CD. This allows you to view the readme using the installp -i command
and option.
If you decide after reading the readme that you would like to refer to the file later,
once the fileset is installed, you can find the readme file in the
/usr/lpp/fileset/READMES directory. The file will have a name of fileset.README.
PE for AIX installation procedure summary
Table 11 summarizes the basic steps you must follow to install the PE software on
an IBM Power Systems network cluster.
You can install all of the PE filesets at once, or you can install selected filesets one
at a time. To determine which filesets, if any, that you want to install separately,
see “PE fileset requirements for AIX installation” on page 3.
Table 11. Installation procedure summary
v If you are installing ppe.poe, perform these steps:
– “Step 1: Copy the software to a hard disk for installation over a network”
– “Step 2: Perform the initial installation” on page 21
– “Step 3: Install PE for AIX on other nodes” on page 24
– “Step 4: Verify the POE installation” on page 26 (optional step)
v If you are installing ppe.man, perform Steps 1 through 3; Step 4 is not
applicable.
Install the PE for AIX filesets step-by-step
This section provides the step-by-step procedure for installing the PE software on
an IBM Power Systems network cluster. Each step includes one or more tables that
guide you through choices about variables. In some cases, they refer to the use of
nodes with CSM, or without.
Pay close attention to these tables as you proceed through the procedure, because
they may direct you to skip certain steps.
1. Before beginning the installation procedure, be sure to do the following:
a. Log in as root.
b. If you already have an earlier version of PE installed, remove the earlier
version. (See “Removing a software component” on page 47.)
c. Verify that all prerequisite software is installed.
2. The discussion of SMIT options assumes that the fast path to the install
software screen is used. Otherwise, follow the SMIT path to the custom install
screen.
Step 1: Copy the software to a hard disk for installation over a
network
This step consists of copying the installation images off the distribution medium
and exporting the installation directory, thereby making the installation images
available for mounting. You must complete this step if any of the machines in your
cluster do not have the proper installation device to read the distribution medium.
Note: If you already have an earlier version of PE installed, remove the earlier
version before proceeding. (See “Removing a software component” on page
47.)
Substep 1: Copy the software off the distribution medium:
To copy the PE software off the distribution medium, follow these instructions:
INSERT
the distribution medium in the installation device.
ENTER
smit bffcreate
This command invokes SMIT, and takes you to the window for copying
software to a hard disk for future installation over the network.
PRESS
List
A window opens listing the available INPUT devices and directories for
software.
SELECT
the installation device from the list of available INPUT devices.
The window listing the available INPUT devices closes and the original
SMIT window indicates your selection.
PRESS
Do
The SMIT window displays the default parameters used for copying
software to a hard disk.
TYPE IN
all in the SOFTWARE name field.
TYPE IN
/usr/sys/inst.images in the DIRECTORY for storing software field. This is
the installation directory name.
PRESS
Do
The system copies the PE software installation images to the directory.
SELECT
Exit → Exit SMIT
The SMIT window closes.
Substep 2: Export the installation directory:
To export the directory so the machines in your cluster can install the PE
installation images it contains, enter /usr/sbin/mknfsexp -d /usr/sys/inst.images
Step 2: Perform the initial installation
This step consists of initially installing the PE installation image, using either of the
following methods:
v via the installp command
v via the installation menus of the System Management Interface Tool (SMIT)
Either method allows you to specify whether you want to install all of the PE
software filesets or just certain individual filesets.
Keep in mind that some of the PE filesets depend on others to run. Refer to “PE
fileset requirements for AIX installation” on page 3, which details these
dependencies, before you do a partial installation.
Table 12 on page 22 helps you determine the steps you need to perform to initially
install the PE installation image.
Table 12. Step 2 for installing with CSM
v If you are installing with CSM: perform this step on the initial node. You must
log in as root.
v If you are installing on an IBM Power Systems network cluster without CSM:
perform this step on any machine in the cluster. You must log in as root.
Method 1: Use the installp command:
Table 13 shows the appropriate commands to enter to initially install the
installation image.
Table 13. Method 1: Use the installp command
v To install all software filesets, ENTER: installp -a -d devicename ppe*
v To install just the man fileset, ENTER: installp -a -I -X -Y -d devicename ppe.man
v To install just the POE fileset, ENTER: installp -a -I -X -Y -d devicename ppe.poe
v To install just the PDB fileset, ENTER: installp -a -I -X -Y -d devicename ppe.shell
In the commands above:
-I (capital I)
is used to select only the specified fileset.
-a applies the software products.
-X attempts to expand any file systems where there is insufficient space to do
the installation.
-Y accepts the eLicense.
-d devicename
is the name of the installation device or directory.
The system reads and receives the installation image off the distribution medium.
Method 2: Use SMIT:
To initially install the installation image using SMIT, follow these instructions:
INSERT
the distribution medium in the installation device unless you are installing
over a network.
ENTER
smit install_latest
This command invokes SMIT, and takes you directly to its window for
installing software.
PRESS
List
A window opens listing the available INPUT devices and directories for
software.
SELECT
the installation device or directory from the list of available INPUT
devices.
The window listing the available INPUT devices and directories closes and
the original SMIT window indicates your selection.
PRESS
Do
The SMIT window displays the default install parameters.
TYPE The appropriate file name, as shown in Table 14:
Table 14. File names for different types of installations
Type the following in the SOFTWARE to install field:
v All the PE software: ppe*
v Just the man fileset: ppe.man
v Just the POE fileset: ppe.poe
v Just the PDB fileset: ppe.shell
After choosing the appropriate software, you may also want to change
other options on the panel, as needed. For example, the panel also asks
whether or not you want to expand the file systems. When you are
prompted, answer yes to expand the file systems.
TYPE IN
yes in the ACCEPT new license agreements? field. If the eLicense is not
accepted, none of the PE software components will be installed.
PRESS
Do
The system installs the installation image.
For more information on SMIT, see IBM AIX General Programming Concepts: Writing
and Debugging Programs.
If installation fails: If the installation is unsuccessful, a software product cleanup
procedure is automatically called. The cleanup procedure removes any files that
may have been restored from the distribution medium, and backs out of any
post-installation procedure that may have been started.
To help determine the cause of the unsuccessful installation, refer to the installation
status file. This file indicates how far installation had progressed when the errors
occurred. IBM AIX General Programming Concepts: Writing and Debugging Programs
describes the status file in more detail. If you cannot determine the cause of a
failed installation, contact your local IBM representative.
Determine remaining tasks:
You have completed the initial installation of PE. For a description of the
directories, files, and daemon processes created and the links established when the
installation image was received, see Chapter 6, “Understanding how installing PE
alters your system,” on page 53.
To determine the remaining steps you need to perform, refer to Table 15 on page
24.
Table 15. Steps to take to determine steps remaining
v If there are other nodes in your system on which you need to install PE filesets,
proceed to “Step 3: Install PE for AIX on other nodes”.
v If there are no other nodes in your system on which you need to install PE
filesets, skip “Step 3: Install PE for AIX on other nodes” and, if appropriate,
proceed to “Step 4: Verify the POE installation” on page 26.
Step 3: Install PE for AIX on other nodes
This step consists of installing PE on other nodes, using either of the following
methods:
v running one of the installation scripts provided with PE
v manually
Perform this step, as root, from a node with PE installed.
Method 1: Use the PE for AIX installation script:
This method consists of:
v creating a host list file (a list of the remaining nodes on which you want to
install PE)
v running the PEinstall installation script
Substep 1: Create a host list file
To create a host list file, follow these instructions:
1. Open a new file using any AIX text editor.
By default, the installation script looks for a file named host.list in your
current directory. You can, however, name the host list file anything
you want. If you do choose to give your file a different name, you will
have to specify that file name when you run the installation script.
2. In the file, enter one node host name on each line. For example:
hostname1
hostname2
hostname3
hostname4
hostname5
Substep 2: Run the PEinstall installation script with the -copy or -mount option
To run the installation script, enter PEinstall image_name [host_list_file]
[-copy | -mount].
Note:
1. To execute the installp remotely on a mounted image, the
directory containing the image must have world-writable
permissions (as created by the chmod 777 command).
If you do not want to create this directory with world-writable
permissions, do not use the -mount option of PEinstall.
2. To have the image copied or mounted to different directories,
you will need to invoke PEinstall for each different location or
set of locations. The host list file that you specify each time you
invoke PEinstall should reflect only those nodes that you want
to use with -copy or -mount.
Table 16 shows the information you need to provide, depending on
whether you specify the -copy or -mount option.
Table 16. Specify -copy and -mount
If you specify the -copy option, you will be prompted for:
v the installation image source directory. The default is /usr/sys/inst.images.
v the installation image destination directory, which is used for all nodes in the
host list. The default is /usr/sys/inst.images.
If you specify the -mount option, you will be prompted for:
v the installation image source directory. The default is /usr/sys/inst.images.
v the remote node mount point directory, which is used for all nodes in the host
list. The default is /mnt.
v whether you want the script to automatically create the remote mount
directory:
– If your remote mount directory already exists, answer no to this prompt.
Note: Be sure that you have issued the chmod 777 command on this
directory.
– If your remote mount directory does not already exist, answer yes to this
prompt. PEinstall issues a mkdir command for the directory name specified,
followed by a chmod 777.
Substep 3: Specify the fileset(s) to be installed
When you are prompted for the name of the fileset you want to install,
enter the appropriate file name, as shown in Table 17:
Table 17. File names for different data types
Type the following when prompted:
v all the PE software: all
v just the man fileset: ppe.man
v just the POE fileset: ppe.poe
v just the PDB fileset: ppe.shell
For each node in the host list, PEinstall executes the following installp
command:
installp -aYFX -d/image_directory/image_name fileset
This command installs both the usr and root portion of the fileset in the
image specified.
Errors that may occur during installation: The following severe installation errors
will cause the installation process to terminate completely:
v The host list file cannot be found.
v No installation image name was specified.
For other errors, a message may appear describing the error, and then processing
will continue. The same message will be logged in a file named PEnode.log in the
current working directory. If you see error messages, look in this file, as the node
on which the error occurred is always displayed and logged. This helps you
identify any nodes on which the fileset(s) did not get successfully installed. When
you correct the errors, you can then rerun the PEinstall script just for those nodes.
Method 2: Installing PE for AIX manually:
As a system administrator, you may want to have more control over the
installation of PE, and install it manually to other nodes, using SMIT or installp.
During “Step 1: Copy the software to a hard disk for installation over a network”
on page 20, you created an installation image that you can use to replicate the
installation of PE file sets on the other nodes of your system. By making this image
available to the other nodes, either by copying or mounting the image file, you can
use SMIT or installp to install the image.
The installation image of PE file sets does not require any special consideration.
You may use SMIT or installp as described in “Method 1: Use the installp
command” on page 22. You can also set up a host list file, and run installp via rsh,
and install the PE file sets on multiple nodes.
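The host-list approach can be sketched as follows. The image path and fileset name are placeholders, and echo stands in for rsh so the loop can be previewed safely; substitute rsh (and your actual paths) to perform the installation:

```shell
#!/bin/sh
# Preview the remote installp command for each node in a host list file.
image=/usr/sys/inst.images/ppe.image   # hypothetical image path
cat > host.list <<'EOF'
hostname1
hostname2
EOF
while read -r node
do
    echo "rsh $node installp -aYFX -d $image ppe.poe"
done < host.list
```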
Determine remaining tasks:
You have completed installing PE on the other nodes in your system.
To determine which remaining steps you need to perform, refer to Table 18:
Table 18. Steps to take to determine steps remaining
v If you installed POE, proceed to “Step 4: Verify the POE installation”.
v If you did not install POE, skip “Step 4: Verify the POE installation”.
Step 4: Verify the POE installation
This step consists of testing the installation of POE, using the POE Installation
Verification Program (IVP). You can find this program in /usr/lpp/ppe.poe/samples/ivp.
Note: In order to successfully run the IVP, you will need to have rsct.lapi.rte
already installed.
To run the POE IVP, at the control workstation (or other home node):
LOGIN
as a user other than root, and start ksh.
ENTER
export LANG=C
ENTER
cd /usr/lpp/ppe.poe/samples/ivp
ENTER
./ivp.script
This runs an installation verification test that checks if the message-passing
program successfully executed using two tasks on this node. The output
should resemble the following:
Verifying the existence of the Binaries
Partition Manager daemon /etc/pmdv5 is executable
POE files seem to be in order
Compiling the ivp sample program
Output files will be stored in directory /tmp/ivp495786
Creating host.list file for this node
Setting the required environment variables
Executing the parallel program with 2 tasks
Threaded 32bit library built on: Apr 21 2003 12:51:46 level(CS2A_Pre-build).
POE IVP: running as task 0 on node c284f2ih01
POE IVP: there are 2 tasks running
POE IVP: running as task 1 on node c284f2ih01
POE IVP: all messages sent
POE IVP: task 1 received <POE IVP Message Passing Text>
Parallel program ivp.out return code was 0
Executing the parallel program with 2 tasks, threaded library
Threaded 32bit library built on: Apr 21 2003 12:51:46 level(CS2A_Pre-build).
POE IVP_r: running as task 0 on node c284f2ih01
POE IVP_r: there are 2 tasks running
POE IVP_r: all messages sent
POE IVP_r: running as task 1 on node c284f2ih01
POE IVP_r: task 1 received <POE IVP Message Passing Text -
Threaded Library>
Parallel program ivp_r.out return code was 0
If both tests return a return code of 0, POE IVP
is successful. To test system message passing,
run the tests in /usr/lpp/ppe.poe/samples/poetest.bw and poetest.cast
To test threaded message passing,
run the tests in /usr/lpp/ppe.poe/samples/threads
End of IVP test
If errors are encountered, your output contains messages that describe
these errors. You can correct the errors and run the ivp.script again, if
desired.
Additional POE sample applications
POE also has sample applications for doing the following:
v Point-to-point bandwidth measurement tests
v Broadcast from task 0 to all of the rest of the nodes in the partition
v MPI Threads sample programs
See Chapter 10, “Using additional POE sample applications,” on page 83 for more
information.
View the readme file after installation
Once you have installed the PE filesets, refer to the readme file provided with each
fileset for any additional installation or usage information. You can find the
readme file in /usr/lpp/fileset/READMES as fileset.README.
For information about other procedures related to PE installation, see Chapter 5,
“Performing installation-related tasks,” on page 47.
Performing PE for AIX post installation tasks (optional)
After performing the basic PE installation, there are additional tasks that you may
or may not want to perform, depending on your installation.
Enabling the barrier synchronization register (BSR)
This task explains how to enable the MPI library to use the barrier
synchronization register (BSR). The BSR is available only on IBM Power
(POWER6) servers running 64-bit MPI applications on AIX 6.1.
To enable the BSR, do the following:
v Ensure that the rsct.lapi.bsr fileset has been installed. For more information, see
RSCT: LAPI Programming Guide. Note that the same level of rsct.lapi.bsr must be
installed on each of the nodes in the cluster.
v Set the PE MP_SINGLE_THREAD environment variable to yes.
v Enable the BSR through the Hardware Management Console (HMC). For
information on how to do this, see System i™ and System p™: Partitioning for AIX
with an HMC (http://publib.boulder.ibm.com/infocenter/systems/topic/iphbk/iphbkbook.pdf).
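In ksh, the environment variable setting from the second step is:

```shell
#!/bin/sh
# Enable single-thread mode so that the MPI library can use the BSR.
MP_SINGLE_THREAD=yes
export MP_SINGLE_THREAD
echo "MP_SINGLE_THREAD=$MP_SINGLE_THREAD"
```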
Installing the PE for Linux software
There are two methods for installing PE for Linux:
v A manual installation, which is described in “Installing the PE and LAPI RPMs
manually” on page 30.
v An automated installation, which is described in “Installing PE for Linux using
the pe_install.sh script” on page 31.
Both methods require you to install the appropriate RPMs, which are organized
based on platform and Linux distribution. A list of required RPMs is located in
“PE RPMs required for Linux installation” on page 7. Note that you must install a
platform-independent license RPM before installing the Parallel Environment
RPMs.
At a high level, the manual installation process includes the following steps:
1. Installing the LAPI component RPMs.
2. Installing and configuring an appropriate C compiler.
3. Installing the PE product license RPM.
4. Installing the PE component RPMs.
5. Performing post installation tasks. (Optional)
a. Installing additional software on the nodes.
b. Verifying the installation.
1) Running the installation verification program (IVP) script.
2) Executing the POE sample applications.
c. Viewing the readme file.
At a high level, the automated installation process includes the following steps:
1. Installing and configuring an appropriate C compiler.
2. Installing the PE and LAPI component RPMs, including the product license
RPM, using the automated installation script pe_install.sh.
3. Performing post installation tasks. (Optional)
a. Installing additional software on the nodes.
b. Verifying the installation.
1) Running the installation verification program (IVP) script.
2) Executing the POE sample applications.
c. Viewing the readme file.
Installing PE for Linux manually
To install the IBM Parallel Environment manually, first refer to the list of required
RPMs located in “PE RPMs required for Linux installation” on page 7. Then, install
the following components in the order shown:
1. 32-bit LAPI base IP RPM
2. 64-bit LAPI IP RPM, if applicable
3. 32-bit LAPI US RPM, if applicable
4. 64-bit LAPI US RPM, if applicable
5. Appropriate C compiler
6. PE license RPM
7. 32-bit PE base RPM
8. 64-bit PE RPM, if applicable
Installing the PE license RPM manually
You must install and accept the PE license to successfully install PE. You must do
this on each node of a cluster. PE also checks the current PE license during run
time. Note also that you must install the PE license before installing the PE
component base RPM.
The PE license RPM is large because the package includes a Java™ runtime
environment which is required by the license acceptance process. The license
installation and acceptance process uses temporary space in the /tmp directory
(about 128 MB).
You may install the license on a single node by using the rpm command manually.
With this method, you must accept the license using the license acceptance shell
script. For example:
1. Log in as root.
2. Enter: rpm -i IBM_pe_license-5.1.0.0-BuildLevel.x86.rpm
3. You see the following messages:
Installing IBM PE License...
IBM PE License RPM is installed. To accept PE LICENSE please run...
/opt/ibmhpc/install/sbin/accept_ppe_license.sh
4. Invoke the license acceptance shell script by entering /opt/ibmhpc/install/sbin/accept_ppe_license.sh
a. You see the license agreement statement. Enter 1 to accept the license
agreement.
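As a hedged follow-up check, you can confirm that the license RPM registered and that a license was accepted. The package name is taken from the rpm -i example above, and /etc/opt/ibmhpc/license is the directory the acceptance script reports; the query form itself is an illustrative assumption.

```shell
# Verify the PE license RPM is in the RPM database (guarded so this
# is a no-op where rpm is unavailable).
if command -v rpm >/dev/null 2>&1; then
    rpm -q IBM_pe_license || echo "PE license RPM is not installed"
fi

# The acceptance script stores accepted licenses here.
ls /etc/opt/ibmhpc/license 2>/dev/null || echo "no accepted PE license found"
```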
Installing the PE and LAPI RPMs manually
To install the PE and LAPI RPMs, you can use the Linux rpm command. In
general, IBM only supports the -i, -U and -e RPM options. When you have
completed the installation, verify that it was successful. See “Verifying the POE
installation” on page 35 for more information.
The following is an example of installing the PE and LAPI RPMs manually. It
assumes that you are attempting an installation on AMD 64-bit hardware, running
Red Hat Enterprise Linux 5.
Note: The RPM names shown below are only examples of the LAPI and PE
RPM names. The actual RPM names will closely resemble these examples,
but the build level and the release and version numbering may be slightly
different. Refer to the list of RPMs located in “PE RPMs required for Linux
installation” on page 7.
1. Install the PE license RPM and accept the license agreement, if you have not
done so already.
2. Log in as root.
3. Enter rpm -i lapi_x86_32bit_base_IP_rh500-2.4.6.0-BuildLevel.x86.rpm
4. Enter rpm -i lapi_x86_64bit_IP_rh500-2.4.6.0-BuildLevel.x86_64.rpm
5. Enter rpm -i lapi_x86_32bit_US_rh500-2.4.6.0-BuildLevel.x86.rpm
6. Enter rpm -i lapi_x86_64bit_US_rh500-2.4.6.0-BuildLevel.x86_64.rpm
7. Enter rpm -i ppe_x86_base_32bit_rh500-5.1.0.0-BuildLevel.x86.rpm
/opt/ibmhpc/ppe.poe/bin/mpiexec will not be linked to /usr/bin
/usr/bin/mpiexec already existed
Stopping xinetd: [ OK ]
Starting xinetd: [ OK ]
8. Enter rpm -i ppe_x86_64bit_rh500-5.1.0.0-BuildLevel.x86_64.rpm
9. If you are using an IBM BladeCenter server, you must stop and then restart the
xinetd daemon manually on each of the blades. To do this, enter
/etc/init.d/xinetd restart.
In the installation example above, we assumed that the PE license was already
installed. You can see that besides the PE license RPM, the base 32-bit PE RPM is
the only other RPM that produces messages during installation.
Installing the Parallel Debugging Tool (PDB) RPM manually
The Parallel Debugging Tool (PDB) cannot be installed automatically using the PE
for Linux installation script (pe_install.sh). If you plan to debug parallel programs,
you must install PDB manually. To install PDB, follow the steps below.
1. Verify that PE for Linux has been successfully installed.
2. Identify the PDB RPM that you need to install. The RPMs are located on the PE
product media. The PDB RPMs are:
v For IBM Power Systems servers with SLES 10: ppe_pdb_ppc-2.0.0.0-BuildLevel.rpm
v For IBM Power Systems servers with RH 5: ppe_pdb_ppc_rh500-2.0.0.0-BuildLevel.rpm
v For IBM System x 64-bit with SLES 10: ppe_pdb_x86-2.0.0.0-BuildLevel.rpm
v For IBM System x 64-bit with RH 5: ppe_pdb_x86_rh500-2.0.0.0-BuildLevel.rpm
3. Use the rpm -i command to install the RPM you selected. For example:
rpm -i ppe_pdb_ppc_rh500-2.0.0.0-BuildLevel.rpm
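A quick, hedged way to confirm that the PDB RPM registered with the RPM database; the package name pattern is assumed from the RPM file names listed above.

```shell
# Look for an installed PDB package; print a diagnostic either way.
if command -v rpm >/dev/null 2>&1; then
    rpm -qa | grep -i ppe_pdb || echo "PDB RPM not found"
else
    echo "rpm is not available on this system"
fi
```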
Installing PE for Linux using the pe_install.sh script
An installation script, pe_install.sh, is available to help make the install process
easier. This script automatically determines which set of RPMs to use. This is done,
in part, by reading the PE_LAPI_VERSION file, which is included with the PE
software RPMs, to determine the build levels of the associated PE and LAPI RPM
files. A new version of this file is shipped with every maintenance release to allow
you to install software update packages using the pe_install.sh script. For more
information on using the installation script, see Chapter 8, “Syntax of commands
for running installation and deinstallation scripts,” on page 75, or see the online
help by entering: pe_install.sh -h.
Note that the Parallel Debugging Tool (PDB) cannot be installed using the
pe_install.sh script. For information about installing PDB, see “Installing the
Parallel Debugging Tool (PDB) RPM manually” on page 30.
When you use the installation script, you can choose to install only the license
RPM, or to install the license RPM and all the RPMs for the PE and LAPI
components.
The following sections describe how to use the installation script:
v “Installing the PE license RPM using the pe_install.sh script”
v “Installing PE for Linux on a single node using the pe_install.sh script
interactively” on page 32
v “Installing PE for Linux on a single node using the pe_install.sh script in batch
mode” on page 34
v “Installing PE for Linux on multiple nodes using the pe_install.sh script in batch
mode” on page 35
Installing the PE license RPM using the pe_install.sh script
These instructions describe the process for installing the PE license RPM and
accepting the license agreement on a single node using the pe_install.sh script in
interactive mode. Note that although these instructions show you how to perform
these tasks using the installation script in interactive mode, you can also perform
the same tasks using the installation script in batch mode. The difference with
batch mode is that you are not given a chance to review the license agreement.
Therefore, if using the pe_install.sh script to install the PE license RPM on
multiple nodes, you should perform the installation at least once in interactive
mode so you can review the license agreement. Afterwards, you may use the script
in batch mode to automatically accept the license agreement without reviewing it
again. This is especially useful when you use the script with rsh, dsh or ssh to
install RPMs across the network.
You can use this script to install only the PE license RPM, or to install the PE
license RPM along with the PE and LAPI product RPMs.
To install the PE license RPM and accept the license agreement using the
installation script in interactive mode:
1. Log in as root.
2. Call the install script by entering ./pe_install.sh -a
a. At the following prompt, enter n:
Do a full installation of IBM Parallel Environment for Linux?
Enter ’y’ to install all PE and LAPI RPMs (default),
or ’n’ to install just the PE License RPM:
b. At the following prompt, enter y:
Do you want to review the PE License Agreement and manually register your
acceptance of the terms, or just have your acceptance automatically registered
for you without reviewing the license agreement?
Enter ’y’ to review it (default), or ’n’ to automatically accept the terms:
c. If the license RPM is in the current directory press <Enter> when you see
this prompt. Otherwise specify a path:
Please specify the directory where the RPM files are located.
Hit the Enter key to use /install/PE_images,
or enter the full path of the correct directory:
d. Enter y to continue, or n to stop:
Start IBM PE RPM installations now (last chance to abort)?
Hit the <Enter> key or enter ’y’ to continue with the installs, or
enter any other character to exit this script without installing:
e. If you entered y, you see the license agreement.
f. When you see the license agreement enter 1 to accept or 2 to reject the
agreement.
g. If you entered 1 you see:
IBM PE License agreement accepted
Licenses stored in /etc/opt/ibmhpc/license
h. If you entered 2, the installation script prompts you one more time. If you
enter 2 again (to reject the license), the script ends without installing the PE
license.
Installing the PE and LAPI RPMs on a single node using the
pe_install.sh script
There are two ways you can install the PE and LAPI RPMs on a single node using
the script; you can install PE either interactively or in batch mode. If you want to
install PE on multiple nodes using the script, refer to “Installing PE for Linux on
multiple nodes using the pe_install.sh script in batch mode” on page 35.
Installing PE for Linux on a single node using the pe_install.sh script
interactively:
These instructions describe installing PE on a single node with the script running
interactively. This example also assumes that the PE license RPM has not been
installed. If the license RPM has already been installed, the install script displays a
message informing you of this, and skips steps 2a and 2b.
To install PE using the installation script interactively:
1. Log in as root.
2. Invoke the installation shell script by entering ./pe_install.sh -a
a. At the following prompt, enter y to perform a full installation (the PE
license RPM if it is not already installed, as well as the LAPI and the PE
RPMs):
Do a full installation of IBM Parallel Environment for Linux?
Enter ’y’ to install all PE and LAPI RPMs (default),
or ’n’ to install just the PE License RPM:
b. If the PE license has already been installed, you will not see the following
prompt. Otherwise, if you have previously reviewed the license agreement
(during a previous installation), enter n:
Do you want to review the PE license Agreement and manually register your
acceptance of the terms, or just have your acceptance automatically registered
for you without reviewing the license agreement?
Enter ’y’ to review it (default), or ’n’ to automatically accept the terms:
c. At the next prompt enter i (for a new installation):
Do you want to perform new RPM installations, update existing RPMs, or install
a fix to specific RPMs?
Enter ’i’ for new installs (default), ’U’ for update installs, or
’f’ for fix installs:
d. When you see the following prompt, answer y to choose both IP and User
Space support, or n to choose IP only support.
Do you want to install both IP and User Space protocol support?
Enter ’y’ for both IP & US support, or ’n’ (default) for IP support only:
e. If the RPMs are in the current directory just press <Enter> for the following
prompt. Otherwise specify the path:
Please specify the directory where the RPM files are located.
Hit the Enter key to use /install/PE_images,
or enter the full path of the correct directory:
f. At the following prompt, enter y to continue:
Start IBM PE RPM installations now (last chance to abort)?
Hit the <Enter> key or enter ’y’ to continue with the installs, or
enter any other character to exit this script without installing:
You should see output similar to this as the installation progresses:
Start installing IBM Parallel Environment . . . .
Installing IBM PE License. . .
IBM PE License accepted quietly.
IBM PE License agreement accepted
Licenses stored in /etc/opt/ibmhpc/license
Successfully installed IBM LAPI 32bit BASE IP RPM
Successfully installed IBM LAPI 32bit US RPM
/opt/ibmhpc/ppe.poe/bin/mpiexec will not be linked to /usr/bin
/usr/bin/mpiexec already existed
Stopping xinetd: [ OK ]
Starting xinetd: [ OK ]
Successfully installed IBM PE 32bit BASE RPM
Successfully installed IBM LAPI 64bit IP RPM
Successfully installed IBM LAPI 64bit US RPM
Successfully installed IBM PE 64bit RPM
Installation of IBM Parallel Environment completed.
The installation has completed successfully. The output in step 2f represents
a case in which the PE license RPM was not installed previously, and is
being installed quietly as the installation script executes. The installation of
this RPM is indicated by the following messages that were displayed in the
example output listed in step 2f:
Installing IBM PE License...
IBM PE License accepted quietly.
IBM PE License agreement accepted
Licenses stored in /etc/opt/ibmhpc/license
The following message appears in certain situations, but is not an error. It
simply shows that a version of MPI by another vendor has already been
installed on this node.
/opt/ibmhpc/ppe.poe/bin/mpiexec will not be linked to /usr/bin
/usr/bin/mpiexec already existed
3. If you are using an IBM BladeCenter server, you must stop and then restart the
xinetd daemon manually on each of the blades. To do this, enter
/etc/init.d/xinetd restart.
The installation output shown in step 2f includes messages which show that
the xinetd daemon was stopped and then restarted. This occurred during the
installation of the 32-bit base PE RPM because part of that installation process
involved updating the /etc/services file, and defining the new Partition
Manager Daemon. Note that the xinetd daemon was restarted on the server but
not the individual blades because the RPMs only get installed on the server.
Installing PE for Linux on a single node using the pe_install.sh script in batch
mode:
These instructions describe how to install PE on a single node with the
pe_install.sh script running in BATCH mode. When invoked in batch mode, the
installation script attempts to perform a FULL installation (the PE license RPM as
well as the PE and LAPI RPMs). If the script finds that the PE license RPM is
absent on a node, it installs this RPM and accepts the license agreement quietly,
under the assumption that you have reviewed this agreement and accept its terms.
If the script finds that the PE license is already installed, it skips the installation of
this RPM and proceeds to install the other components.
To run the script in batch mode, do not use the -a flag of the pe_install.sh script.
Each of the script’s flags related to batch mode has an associated default
value. The important flags are:
-dir For specifying where the RPM files are located. The default value is the
current directory.
-install_op
Acceptable values are either i for new installations or U for upgrade
installations or installing emergency fixes. To install an emergency fix to a
particular RPM, you must also specify the -fix_level flag with the proper
fix level information. The default value for this flag is i for a new
installation.
-fix_level
This flag is for installing emergency fixes. With this flag, you can update
one or more of the PE or LAPI RPMs with RPMs containing emergency
fixes. A fix is specified by its component name and a fix level. Valid
component names are either pe or lapi. The fix level is specified in the
format ver.rel.mod.fix-BuildLevel. The two specifiers are separated by a
hyphen. For example:
pe-5.1.0.0-0611a
Specifying the -fix_level flag without also specifying the -install_op flag
implies that -install_op had been specified with the U install operation
value.
-ip_only
Acceptable values are either n for installing both IP and US protocol
support, or y for installing IP only. The default is y.
To perform a full installation for just IP support on the local node, with all RPMs
located in the current directory, you can enter ./pe_install.sh, without specifying
any flags, and the default values will be used.
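The flags above can be combined as in the following illustrative sketch. The commands are echoed rather than executed here; /install/PE_images is a placeholder for your RPM directory, and the fix level shown is the example from the -fix_level description above.

```shell
RPM_DIR=/install/PE_images    # assumption: where your RPMs are staged

# New install, IP support only (all defaults except the RPM directory):
echo ./pe_install.sh -dir "$RPM_DIR"

# Update install with both IP and User Space support:
echo ./pe_install.sh -dir "$RPM_DIR" -install_op U -ip_only n

# Emergency fix to the PE RPMs (-install_op U is implied by -fix_level):
echo ./pe_install.sh -dir "$RPM_DIR" -fix_level pe-5.1.0.0-0611a
```

Drop the leading echo from a line to run it for real as root.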
Installing PE for Linux on multiple nodes using the pe_install.sh
script in batch mode
This section describes how to install PE on multiple nodes of a cluster using the
pe_install.sh script in batch mode, in conjunction with rsh, dsh, or ssh. You must
perform this type of installation as root, with a command line call. For example:
dsh './pe_install.sh -dir rpm_directory'
With dsh, you must specify the -dir flag because the command runs in the root
directory, and the script must know where the RPMs are located.
With rsh or ssh, you need to write a simple script and run it as root. The script
should contain a line similar to this for each node:
rsh hostname './pe_install.sh -dir rpm_directory'
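The per-node calls can be put in a loop. This hedged sketch echoes the ssh commands it would run rather than executing them; the node names and RPM directory are placeholders, and root ssh access to each node is assumed.

```shell
NODES="node1 node2"              # assumption: your cluster node names
RPM_DIR=/install/PE_images       # assumption: RPM directory on each node

# Echo the batch-mode install command for every node. Drop the leading
# 'echo' to execute for real; ssh keys for root must already be set up.
for node in $NODES; do
    echo ssh "root@$node" "cd $RPM_DIR && ./pe_install.sh -dir $RPM_DIR"
done
```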
Performing PE for Linux post installation tasks (optional)
After performing the basic PE installation, there are additional tasks that you may
or may not wish to perform, depending on your installation. These tasks include
installing additional software, testing the installation of POE, and checking the
readme file for additional installation information.
For information on installing additional software on the nodes, such as Fortran and
C++ compilers, Tivoli Workload Scheduler LoadLeveler, and the Parallel
Debugging Tool (PDB), see “Additional software requirements for Linux
installation” on page 8.
Verifying the POE installation
This task explains how to test the installation of POE, using the POE Installation
Verification Program (IVP). You can find this program in /opt/ibmhpc/ppe.poe/samples/ivp.
Running the Installation Verification Program (IVP) script:
To run the POE IVP at the control workstation (or other home node):
LOGIN
as a user other than root.
ENTER
cd /opt/ibmhpc/ppe.poe/samples/ivp
ENTER
./ivp.linux.script
This runs an installation verification test that checks to see if the message
passing program successfully executed using two tasks on this node. The
output should resemble the following:
Verifying the existence of the Binaries
Partition Manager daemon /etc/pmdv5 is executable
POE files seem to be in order
Compiling the ivp sample program
Output files will be stored in directory /tmp/ivp14777
Creating host.list file for this node
Setting the required environment variables
Executing the parallel program with 2 tasks
POE IVP: running as task 0 on node c171f6sq10
POE IVP: running as task 1 on node c171f6sq10
POE IVP: there are 2 tasks running
POE IVP: all messages sent
POE IVP: task 1 received <POE IVP Message Passing Text>
Parallel program ivp.out return code was 0
If the test returns a return code of 0, POE IVP
is successful. To test message passing,
run the tests in /opt/ibmhpc/ppe.poe/samples/poetest.bw and poetest.cast
To test threaded message passing,
run the tests in /opt/ibmhpc/ppe.poe/samples/threads
End of IVP test
If the IVP script encounters errors, your output contains messages that
describe these errors. You can correct the errors and run the
ivp.linux.script again, if desired.
POE sample applications: POE also includes sample applications for doing the
following:
v Point-to-point bandwidth measurement tests.
v Broadcast from task 0 to the rest of the nodes in the partition.
v MPI Threads sample programs.
See Chapter 10, “Using additional POE sample applications,” on page 83 for more
information.
Viewing the readme file after installation
After you have installed the PE filesets, refer to the readme file provided with each
fileset for additional installation or usage information. You can find the readme file
in /opt/ibmhpc/ppe.poe/READMES/poe.README.
Resolving installation errors
This task describes the errors that might occur during installation, and how to
resolve them.
An installation error may be caused by the absence of a working C compiler. You
must have a working IBM C/C++ compiler or a GNU C compiler for successful
installation of the 32-bit base PE RPM. Otherwise, some of the parallel utilities, in
particular mcp, mcpgath and mcpscat, may not get installed. If you see an error
message during installation that reports these utilities as missing, do the following:
1. Install and configure a C compiler.
2. Call the mpcc compiler script to complete the link edit step for these three
utilities, as follows:
cd /opt/ibmhpc/ppe.poe/bin
mpcc -o mcp /opt/ibmhpc/ppe.poe/samples/mpi/mcp.o
mpcc -o mcpgath /opt/ibmhpc/ppe.poe/samples/mpi/mcpgath.o
mpcc -o mcpscat /opt/ibmhpc/ppe.poe/samples/mpi/mcpscat.o
3. Change ownership and group of these utilities to bin,bin.
4. Link the utilities from /opt/ibmhpc/ppe.poe/bin to /usr/bin.
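Steps 3 and 4 can be sketched as follows. The commands are echoed here as an illustration; drop the leading echo and run them as root to apply them.

```shell
# Restore ownership (owner bin, group bin) and /usr/bin links for the
# three relinked utilities named above.
for util in mcp mcpgath mcpscat; do
    echo chown bin:bin "/opt/ibmhpc/ppe.poe/bin/$util"
    echo ln -sf "/opt/ibmhpc/ppe.poe/bin/$util" "/usr/bin/$util"
done
```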
For more information on the supported version of the C compilers, refer to
Chapter 2, “Planning to install the PE software,” on page 3.
Chapter 4. Migrating and upgrading PE
The PE migration information explains how to migrate from earlier releases of PE
for AIX and PE for Linux to PE 5.1. You may need to refer to the PE
documentation during migration.
If you are using PE for AIX, refer to “Migrating and upgrading PE for AIX.” If you
are using PE for Linux, refer to “Migrating and upgrading PE for Linux” on page
42.
Migrating and upgrading PE for AIX
These instructions explain how to migrate from earlier releases of PE for AIX to PE
5.1. There are differences between earlier releases that you need to consider before
installing or using PE 5.1.
When we refer to PE Version 5 or PE 5.1, we mean the latest version of PE, which
is PE 5.1.0, unless otherwise specified.
PE 5.1 is the latest available supported level of PE Version 5. To find out which
release of PE you currently have installed, issue the lslpp command.
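For example, the installed PE fileset levels can be listed with lslpp. This is a hedged sketch: the ppe fileset-name prefix is assumed from the fileset names used in this book, and the guard makes the command a no-op on non-AIX systems.

```shell
# List installed PE filesets and their levels (AIX only).
if command -v lslpp >/dev/null 2>&1; then
    lslpp -L 'ppe.*'
else
    echo "lslpp not found: run this on an AIX node"
fi
```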
PE for AIX migration overview
If you have an earlier release of PE already installed, installing the PE Version 5
filesets involves a migration installation on top of the earlier filesets. The earlier
filesets are completely replaced, unnecessary files and directories are removed,
and disk space is conserved.
Because some existing files, for example the compiler utility scripts, may have been
modified, these files are saved before they are replaced. The files are saved in the
/usr/lpp/save.config/usr/lpp/ppe.poe/bin directory.
Several files are saved as part of the migration installation, in case those files were
previously modified. For specific details, refer to “How installing the POE fileset
alters your system” on page 53.
To the Object Data Manager (ODM) and lslpp, however, the earlier filesets will
show as installed but marked OBSOLETE. Also, some older directories and
installation-related files may remain.
Note that if you later attempt to remove an older fileset, files from the newer fileset
may be removed instead. To avoid this potential side effect, completely remove
older releases of the PE filesets before you begin installation. If your installation
currently has ppe.vt or ppe.pedb filesets installed, you should remove them
because PE Version 4 and PE Version 5 do not support them. Also, if your
installation currently has ppe.perf or ppe.pvt installed, you should remove them
because PE Version 5 does not support them. These filesets are not automatically
removed or marked obsolete by newer installations of PE, although they should no
longer be used with current versions of PE. For more details, see “Migration
installation” on page 18.
AIX compatibility
PE Version 5 commands and applications are compatible with AIX 5.3 and AIX 6.1
only. PE Version 5 commands and applications are not compatible with earlier
versions of AIX.
Coexistence
All nodes in a parallel job must be running the same versions of PE and
LoadLeveler, at the same maintenance levels.
When TWS LoadLeveler and PE coexist on a node, they must be one of the
following:
v TWS LoadLeveler 3.5 with PE Version 5.1 or later.
v TWS LoadLeveler 3.4 with PE Version 4.3 or later
v TWS LoadLeveler 3.4 with PE Version 4.2.2
It is recommended that both PE 5.1 and TWS LoadLeveler 3.5 be installed at their
latest support levels to provide the latest functional support. Some important
functions are not available in earlier versions. For example, in order to use the
TWS LoadLeveler scheduling affinity function with the InfiniBand interconnect,
you must have TWS LoadLeveler 3.4.3 (or later) installed.
PE does not support interoperability between nodes running AIX and Linux
versions of PE. Parallel jobs cannot be mixed between AIX and Linux PE nodes.
Beginning with Version 5.1, PE no longer supports PSSP. PE 4.2 was the last release
to support PSSP 3.5.
Migration support
PE does not support node-by-node migration. You must migrate all of the nodes in
a system partition or parallel cluster to a new level of PE at the same time.
In general, the preferred upgrade path for PE is to upgrade the AIX level and then
the PE level. There are a number of migration paths available:
1. AIX 5.2, PSSP 3.5, and PE 4.2 to AIX V5.3 TL 5300-05 and PE 5.1
2. AIX 5.2 (no PSSP) and PE 4.2 to AIX V5.3 TL 5300-05 and PE 5.1
3. AIX 5.3 and PE 4.3 to AIX V5.3 TL 5300-05 and PE 5.1
4. AIX 5.3 and PE 4.3 to AIX V6.1 and PE 5.1
5. AIX 6.1 and PE 4.3 to AIX V6.1 and PE 5.1
AIX support
PE Version 5.1.0 supports:
v AIX Version 5.3 Technology Level 5300-05 (AIX V5.3 TL 5300-05). The IBM High
Performance Switch (HPS) is now supported on AIX V5.3 TL 5300-05.
v AIX Version 5.3 Technology Level 5300-06 (AIX V5.3 TL 5300-06), for use with
the InfiniBand host channel adapter.
v AIX V5.3 TL 5300-05 threaded profiling support. See IBM Parallel Environment:
Operation and Use for more information.
v AIX Version 6.1 (or later), either standalone or connected via an Ethernet LAN
supporting IP for the supported IBM Power Systems servers.
Note that under AIX 5.3, PE Version 5.1.0 requires LAPI (rsct.lapi.rte) Version 2.4.6,
or later. Under AIX 6.1, PE Version 5.1.0 requires LAPI (rsct.lapi.rte) Version 3.1.2,
or later.
MPI library support
PE Version 5 provides support for its threaded version of the MPI library only. A
non-threaded (or signal based) library is also shipped, and its symbols are
exported from the threaded library, libmpi_r.a, for binary compatibility.
Binary compatibility is supported for existing applications that have been
dynamically linked or created with the non-threaded compiler scripts from
previous versions of POE. There is no binary compatibility for statically bound
executables.
Existing applications built as non-threaded applications will execute as single
threaded applications in the PE Version 5 environment. Users and application
developers should understand the implications of their programs running as
threaded applications, as described in the appropriate sections of the MPI
Programming Guide.
Barrier synchronization register (BSR) support
PE Version 5 allows the MPI library to access the barrier synchronization register
(BSR). The barrier synchronization register is a memory register that is located on
IBM Power (POWER6) servers. It performs barrier synchronization, which is a
method of synchronizing the threads in the parallel application.
The BSR is only available for 64-bit MPI applications, running on IBM Power
(POWER6) servers, over the AIX 6.1 operating system. The rsct.lapi.bsr fileset must
be installed, and you must enable the BSR support in order to use it. For
information about installing the rsct.lapi.bsr fileset, see the RSCT: LAPI
Programming Guide. For more information about enabling the BSR support, see
“Enabling the barrier synchronization register (BSR)” on page 28.
LAPI support
Beginning with PE 5.1, LAPI is shipped as a fileset on the PE product CD. As in
the previous release, this fileset is called rsct.lapi and contains three installation
images: rsct.lapi.rte, rsct.lapi.samp, and rsct.lapi.bsr.
Note: AIX 6.1 users can obtain the 3.1.2 version of the LAPI filesets from the PE
product CD. AIX 5.3 users must obtain the LAPI filesets from the AIX
5.3 operating system packages. Also note that PE does not support the
barrier synchronization register on AIX 5.3. As a result, the rsct.lapi.bsr
fileset is only available for AIX 6.1 users.
MPI uses LAPI as a message transport protocol, so LAPI must be installed
before MPI is used. Users and application developers may need to understand
this relationship, as described in the appropriate sections of the Parallel
Environment: MPI Programming Guide, and the RSCT: LAPI Programming Guide.
Fortran 90 compile time type-checking support
Beginning with Version 5.1, PE now provides a Fortran 90 module for
type-checking at compile time (mpi.mod) that can be called by a Fortran 90 MPI
program. A Fortran 90 module is a type of program unit that can provide
instructions for declaring and defining interfaces. By using a module to define the
interface, a programmer can enforce the data types that a function can accept.
When using the module, any MPI function call that does not conform
to the interface definition generates an error when the program is compiled. Using
the module allows errors to be identified and fixed at an earlier stage.
To take advantage of the new type-checking module, users must add the statement
USE MPI to their Fortran 90 application source code. The XL Fortran compiler
Version 12.1 is required. For more details, see the appropriate sections of Operation
and Use and the MPI Programming Guide.
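A minimal sketch of how an application picks up the module is shown below. The Fortran source is a hypothetical example (hello_mpi.f90 is a placeholder name), and mpxlf90_r is assumed here to be PE's threaded Fortran 90 compiler script; the compile command is echoed rather than executed since it requires XL Fortran 12.1.

```shell
# Write a small Fortran 90 program that enables compile-time
# type checking by pulling in the mpi module with USE MPI.
cat > hello_mpi.f90 <<'EOF'
program hello
    use mpi                      ! compile-time type checking of MPI calls
    integer :: ierr, rank
    call MPI_Init(ierr)
    call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
    print *, 'Hello from task', rank
    call MPI_Finalize(ierr)
end program hello
EOF

# Assumed compile command (echoed; drop the echo on a system with
# PE and XL Fortran 12.1 installed):
echo mpxlf90_r -o hello_mpi hello_mpi.f90
```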
Online documentation
To access the most recent Parallel Environment documentation in PDF and HTML
format, refer to the IBM Cluster information center (http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp).
Both the current Parallel Environment books and earlier versions of the library are
also available in PDF format from the IBM Publications Center
(http://www.ibm.com/shop/publications/order/).
PE Version 5 continues to ship man pages, in the ppe.man fileset, which
completely replaces earlier versions of the man pages already installed. For more
information, see “How installing the online documentation alters your system” on
page 57.
Migrating and upgrading PE for Linux
These instructions explain how to migrate from earlier releases of PE for Linux to
PE 5.1. There are differences between earlier releases that you need to consider
before installing or using PE 5.1.
When we refer to PE Version 5 or PE 5.1, we mean the latest available supported
level of PE Version 5, which is PE 5.1, unless otherwise specified.
PE for Linux migration overview
If you have an earlier release of PE already installed, installing the PE Version 5
RPMs involves a migration installation on top of the earlier RPMs. The earlier
RPMs are completely replaced, obsolete files and directories are removed, and
disk space is conserved.
Because some existing files (for example, the compiler utility scripts) may have
been modified, these files are saved in the /opt/ibmhpc/ppe.poe/save_file directory
before they are replaced. For specific details, refer to “How installing the PE and
LAPI RPMs alters your system” on page 59.
The basic PE migration procedure is as follows:
1. Mount the new PE CD or extract all the new LAPI and PE RPMs and files from
the tar file that is provided for the platform you are using.
42 IBM PE for AIX and Linux V5 R1: Installation
2. Replace the current PE license RPM with the PE license RPM for the new level.
The RPMs for PE 5.1 are as follows:
v For IBM Power Systems servers and IBM BladeCenter servers:
rpm -U IBM_pe_license-5.1.0.0-BuildLevel.ppc.rpm
v For IBM System x servers:
rpm -U IBM_pe_license-5.1.0.0-BuildLevel.i386.rpm
3. Replace the remaining LAPI and PE components. You can do this in a single
step by running the installation script with the -install_op U flag. For example:
pe_install.sh -install_op U
Note that you must either run the installation script in the same directory in
which all the RPMs are located, or specify the directory with the dir flag.
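The two replacement steps above can be combined into a small driver script. The sketch below is a dry run: RPM_DIR and BUILD_LEVEL are placeholders (not values from this manual), and the commands are printed rather than executed so they can be reviewed first.

```shell
# Hypothetical dry run of the PE 5.1 migration steps. RPM_DIR and
# BUILD_LEVEL are placeholders; substitute your mount point and the
# build level of the RPMs you received.
RPM_DIR="/mnt/pe51"
BUILD_LEVEL="BuildLevel"

migrate_pe() {
    # Step 2: replace the PE license RPM with the new level.
    echo "rpm -U ${RPM_DIR}/IBM_pe_license-5.1.0.0-${BUILD_LEVEL}.ppc.rpm"
    # Step 3: replace the remaining LAPI and PE components in one pass,
    # running the script from the directory that holds all the RPMs.
    echo "cd ${RPM_DIR} && ./pe_install.sh -install_op U"
}

migrate_pe
```

Remove the echo wrappers (or pipe the output to sh) once the paths and build level have been verified.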
There are two types of upgrades:
v “Installing PTF upgrades”
v “Installing fix upgrades”
Installing an upgrade
The installation upgrade procedure is similar to the new installation procedure
described in “Installing the PE and LAPI RPMs manually” on page 30. There are
two types of upgrade installations: a PTF upgrade and a fix upgrade.
Installing PTF upgrades
You need to install a PTF upgrade after downloading a PTF release. With a PTF
release, you get a full set of Parallel Environment RPMs, excluding the license
RPM. The procedure for installing a PTF is almost identical to installing a new
release. You can use the pe_install.sh installation script in either interactive mode
or in batch mode. You may also invoke the rpm -U command directly.
If you are using the installation script interactively, enter ./pe_install.sh -a.
You see these messages and prompt:
The IBM PE license RPM has already been installed.
Preparing to install/upgrade the IBM Parallel Environment product RPMs.
Do you want to perform new RPM installations, update existing RPMs, or install
a fix to specific RPMs?
Enter ’i’ for new installs (default), ’U’ for update installs, or
’f’ for fix installs:
To install the PTF, enter U. The remainder of the process is exactly the same as
performing a new installation. When the installation is complete, the previous
versions of the RPMs are replaced by the newer versions.
You can also install a PTF upgrade by running the installation script in batch mode
by entering ./pe_install.sh -install_op U. The end result is the same. Using this
script in conjunction with rsh, ssh, or dsh allows you to install the PTF across a
cluster of nodes.
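As an illustration of that cluster-wide pattern, the loop below prints one batch-mode invocation per node. The node names and RPM directory are assumptions; swap ssh for rsh or dsh as appropriate, and remove the echo to execute.

```shell
# Hypothetical sketch: push the PTF update to every node in a small
# cluster. NODES and /install/pe51 are placeholders for your own
# node list and RPM directory.
NODES="node01 node02 node03"

push_ptf() {
    for node in $NODES; do
        # Dry run: print the per-node command instead of executing it.
        echo "ssh $node 'cd /install/pe51 && ./pe_install.sh -install_op U'"
    done
}

push_ptf
```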
Installing fix upgrades
If you have a problem with the Parallel Environment and have received, for
example, a new PE 64-bit RPM to fix the problem, you can install the fix manually
or by using the installation script.
To install a fix manually, use the rpm command. For example:
rpm -U PE_or_LAPI_rpm_name.rpm
For a complete list of RPM names, see “PE RPMs required for Linux installation”
on page 7.
You can also install the fix interactively using the installation script. The advantage
of using the installation script is that if the fix involves more than one RPM from a
component, then all RPMs are installed with a single call.
To install a fix interactively, enter ./pe_install.sh -a.
You see the following messages and prompt:
The IBM PE license RPM has already been installed.
Preparing to install/upgrade the IBM Parallel Environment product RPMs.
Do you want to perform new RPM installations, update existing RPMs, or install
a fix to specific RPMs?
Enter ’i’ for new installs (default), ’U’ for update installs, or
’f’ for fix installs:
After you have entered f to install the fix, you are prompted again for the
appropriate component:
Please specify the component the fix is intended
for, by entering either ’pe’ or ’lapi’ (no default):
Enter pe. In response, you see one more prompt:
Please specify the VRMF-level string from the name
of the file containing the fix (Example: 5.1.0.0-0610a):
Enter the version and build level of the fix. For example:
5.1.0.0-0625b
The rest is the same as described above.
You can also install the fix using the script in batch mode (as opposed to
interactive mode). To do this, enter ./pe_install.sh and specify the version and
build level. For example:
./pe_install.sh -fix_level pe-5.1.0.0-0626b
The disadvantage of using the installation script is that it can only install fixes
from one component at a time. If you have received fixes for both the pe and the
lapi components, you must run the script twice.
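Because the script handles one component per run, fixes for both components mean two invocations. A sketch follows; the VRMF-build strings are illustrative only, so use the levels named in the fix files you actually received.

```shell
# Hypothetical sketch: apply fixes for both components by invoking the
# installation script once per component. The fix levels below are
# examples, not real fix identifiers.
PE_FIX="pe-5.1.0.0-0626b"
LAPI_FIX="lapi-3.1.2.0-0626b"

apply_fixes() {
    for fix in "$PE_FIX" "$LAPI_FIX"; do
        echo "./pe_install.sh -fix_level $fix"   # dry run; drop echo to execute
    done
}

apply_fixes
```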
Coexistence
When TWS LoadLeveler and PE coexist on a node, the combination of levels must
be one of the following:
v TWS LoadLeveler 3.5 with PE Version 5.1 or later
v TWS LoadLeveler 3.4 with PE Version 4.3 or later
v TWS LoadLeveler 3.4 with PE Version 4.2.2
All nodes in a parallel job must be running the same versions of PE and TWS
LoadLeveler, at the same maintenance levels. It is recommended that you install
both PE 5.1 and TWS LoadLeveler 3.5 at their latest support levels to provide the
latest functional support.
PE does not support interoperability between nodes running AIX and Linux
versions of PE. Parallel jobs cannot be mixed between AIX and Linux PE nodes.
Migration support
PE does not support node-by-node migration. You must migrate all of the nodes in
a system partition or parallel cluster to a new level of PE at the same time.
In general, the preferred upgrade path for PE is to upgrade the Linux distribution
level and then the PE level. There are a number of migration paths available:
1. SUSE 9 and PE 4.2 to SUSE 10 and PE 5.1
2. Red Hat 4 and PE 4.2 to Red Hat 5 and PE 5.1
3. SUSE 9 and PE 4.3 to SUSE 9 and PE 5.1
4. SUSE 9 and PE 4.3 to SUSE 10 and PE 5.1
5. Red Hat 4 and PE 4.3 to Red Hat 4 and PE 5.1
6. Red Hat 4 and PE 4.3 to Red Hat 5 and PE 5.1
LAPI support
MPI uses LAPI as a message transport protocol, so LAPI must be installed before
MPI can be used. Users and application developers may need to understand
this relationship, as described in the appropriate sections of the Parallel
Environment: MPI Programming Guide, and the RSCT: LAPI Programming Guide.
Fortran 90 compile time type-checking support
Beginning with Version 5.1, PE provides a Fortran 90 module for
type-checking at compile time (mpi.mod) that can be used by a Fortran 90 MPI
program. A Fortran 90 module is a type of program unit that can provide
instructions for declaring and defining interfaces. By using a module to define the
interface, a programmer can enforce the data types that a function can accept.
When the module is used, any MPI function call that does not conform to the
interface definition generates an error when the program is compiled. Using the
module allows such errors to be identified and fixed at an earlier stage.
To take advantage of the new type-checking module, users must add the statement
USE MPI to their Fortran 90 application source code. The XL Fortran compiler
Version 12.1 is required. For more details, see the appropriate sections of IBM
Parallel Environment: Operation and Use and IBM Parallel Environment: MPI
Programming Guide.
Chapter 5. Performing installation-related tasks
After you have finished installing PE, there are a number of tasks that you may
need to perform from time to time. These tasks vary between PE for AIX and PE
for Linux.
If you are using PE for AIX, refer to “Performing PE for AIX installation-related
tasks.” If you are using PE for Linux, refer to “Performing PE for Linux
installation-related tasks” on page 49.
Performing PE for AIX installation-related tasks
After you have finished installing PE, there are a number of tasks that you may
need to perform from time to time that are related to the original installation
procedure. These tasks include removing a software component and customizing
the message catalog.
The original installation procedure can be found in “Installing the PE for AIX
software” on page 17.
Removing a software component
During the installation process, you may decide to remove a PE software
component from the system. If you have already installed it on a number of nodes,
you can use the PEdeinstall script provided with PE, to do the removals.
For detailed information about this script and instructions describing how to run it,
see “PE for AIX deinstallation script: PEdeinstall” on page 76.
Recovering from a software vital product database error
If you install PE frequently, you may encounter an error such as:
0503-283 : Error in the Software Vital Product Data. The "usr"
part of a product does not have the same requisite file
as the "root" part. The product is: ppe.poe 5.1
This usually means that there is an incompatibility in the Object Data Manager
(ODM). This can result from installing a version of a product whose prerequisites
have changed.
You need to remove the entries for a product from ODM. The following set of
commands removes the entries for POE (the ppe.poe fileset). To remove entries for
a different fileset, replace ppe.poe in the following commands with the appropriate
fileset name.
ODMDIR=/usr/lib/objrepos odmdelete -o product -qlpp_name=ppe.poe
ODMDIR=/usr/lib/objrepos odmdelete -o lpp -qname=ppe.poe
ODMDIR=/etc/objrepos odmdelete -o product -qlpp_name=ppe.poe
ODMDIR=/etc/objrepos odmdelete -o lpp -qname=ppe.poe
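The four commands above can be generated for any fileset by a small helper. This sketch only prints the commands, since odmdelete is destructive; review the output before piping it to sh on an AIX node.

```shell
# Generate the ODM cleanup commands for a given fileset. Printed rather
# than executed; substitute the fileset name you need to remove.
odm_cleanup_cmds() {
    fileset="$1"
    for odmdir in /usr/lib/objrepos /etc/objrepos; do
        echo "ODMDIR=$odmdir odmdelete -o product -qlpp_name=$fileset"
        echo "ODMDIR=$odmdir odmdelete -o lpp -qname=$fileset"
    done
}

odm_cleanup_cmds ppe.poe
```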
© Copyright IBM Corp. 1993, 2008 47
Customizing the message catalog
All PE filesets or RPMs use message cataloging so that messages can appear in
languages other than English. Each fileset or RPM has message catalogs installed
in a directory located by the NLSPATH environment variable. The message
catalogs are installed in three common English language paths and are in the
format of component.cat. The paths are:
For AIX:
v /usr/lib/nls/msg/C
v /usr/lib/nls/msg/En_US
v /usr/lib/nls/msg/en_US
For Linux:
v /usr/share/locale/en_US/pempl.cat
v /usr/share/locale/en_US/pepoe.cat
v /usr/share/locale/en_US/liblapi.cat
v /usr/share/locale/en_US/UTF-8.cat
1. Before verifying the installation for POE, you should set the LANG
environment variable to C.
2. If the message catalogs are installed in a directory other than C, modify
/etc/environment to set the NLSPATH to the appropriate directory. You also
need to set the user’s LANG environment variable.
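On AIX, steps 1 and 2 amount to exporting two variables. The sketch below uses the conventional AIX NLSPATH pattern, in which %L and %N expand to the locale and catalog name at lookup time; adjust the path if your catalogs are installed elsewhere.

```shell
# Sketch of the environment settings described above. LANG=C selects the
# default catalogs for installation verification; NLSPATH tells the
# system where to find the component.cat files.
export LANG=C
export NLSPATH=/usr/lib/nls/msg/%L/%N.cat

echo "LANG=$LANG NLSPATH=$NLSPATH"
```

To make the settings persistent for all users, place the NLSPATH assignment in /etc/environment as described above.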
Installing AFS
These are the instructions for tailoring the parallel operating environment for
execution with the AFS file system. The source files settokens.c and gettokens.c
are intended to be used with Transarc’s Kerberos Authentication program, but
should be usable as a guide for other environments.
The files needed for setting up the AFS execution are in the /usr/lpp/ppe.poe/samples/afs directory. They are:
README.afs
Readme file that contains much of the same information contained here.
gettokens.c
Subroutine to get an AFS token on the node where the user is logged on
(or already authenticated)
settokens.c
Subroutine to put an AFS token on the remote node that is running the
user’s executable
makefile
Makefile for creating object modules from settokens.c and gettokens.c
buildAFS
Sample shell script for replacing the routines settokens and gettokens
distributed with POE by the routines built by the makefile
Setting up POE for AFS execution
Perform the following procedure as root for setting up POE for AFS execution:
ENTER
cd /usr/lpp/ppe.poe/samples/afs to switch to the appropriate directory or
copy the contents of the directory to a convenient location.
ENTER
the make command to create the files settokens.o and gettokens.o from
gettokens.c and settokens.c. If you are not using the Transarc system, you
may need to alter these routines to provide the desired token access. The
calling sequence of the parameters cannot be changed.
VERIFY
that the partition manager daemon, pmdv5, the home node partition
manager, and poe are in /usr/lpp/ppe.poe/bin. If not, modify the buildAFS
script.
Before completing the following step, ensure that you have the following
amounts of available space in the current directory, as shown in Table 19:
Table 19. Space requirements for the partition manager daemon and poe components
Components being built        Total available space required (in megabytes)
pmdv5, poe                    2
ENTER
buildAFS to create new versions of pmdv5 and poe in the current
directory. If the linking step fails, locate the libraries containing the
modules that were not found, and alter the library search list in buildAFS
to include them.
MOVE
pmdv5 and poe to their usual location in /usr/lpp/ppe.poe/bin on each
node. You can rename the old versions in case they need to be restored.
Make sure that they are made executable.
You should not have to modify your program executables. You can now
pass AFS authorization across the partition.
The .rhosts file in the user’s home directory must include the nodes that
are intended for Parallel Operating Environment use. This ensures that the
proper access is permitted.
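Condensed into commands, the procedure above looks roughly like the following dry run. The backup naming (.orig) is an assumption; the manual only says you can rename the old versions in case they need to be restored.

```shell
# Hypothetical condensed form of the AFS setup procedure, printed as a
# dry run so nothing is replaced until each step has been reviewed.
afs_setup_steps() {
    echo "cd /usr/lpp/ppe.poe/samples/afs"
    echo "make                                  # builds settokens.o and gettokens.o"
    echo "./buildAFS                            # links new pmdv5 and poe here"
    echo "mv /usr/lpp/ppe.poe/bin/pmdv5 /usr/lpp/ppe.poe/bin/pmdv5.orig"
    echo "mv /usr/lpp/ppe.poe/bin/poe /usr/lpp/ppe.poe/bin/poe.orig"
    echo "cp pmdv5 poe /usr/lpp/ppe.poe/bin/"
    echo "chmod +x /usr/lpp/ppe.poe/bin/pmdv5 /usr/lpp/ppe.poe/bin/poe"
}

afs_setup_steps
```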
Performing PE for Linux installation-related tasks
After you have finished installing PE, there are a number of tasks that you may
need to perform from time to time that are related to the original installation
procedure. These tasks include removing a software component and customizing
the message catalog.
The original installation procedure can be found in “Installing the PE for Linux
software” on page 28.
Finding installed components
To determine which of the LAPI or PE product RPMs, or which PE license RPM is
installed, you can use the rpm -qa command combined with the grep command.
In the following examples, the 32-bit and 64-bit RPMs, for both LAPI and PE, have
been installed, as well as the IP and US RPMs for LAPI.
To determine which LAPI RPMs have been installed, enter the following:
rpm -qa | grep lapi
You should see output similar to this:
lapi_x86_32bit_base_IP_rh500-3.1.2.0-0611a
lapi_x86_64bit_IP_rh500-3.1.2.0-0611a
lapi_x86_32bit_US_rh500-3.1.2.0-0611a
lapi_x86_64bit_US_rh500-3.1.2.0-0611a
To determine which PE product RPMs have been installed, enter the following:
rpm -qa | grep ppe
You should see output similar to this:
ppe_x86_base_32bit_rh500-5.1.0.0-0611a
ppe_x86_64bit_rh500-5.1.0.0-0611a
To determine which PE license RPM has been installed, enter the following:
rpm -qa | grep IBM_
You should see output similar to this:
IBM_pe_license-5.1.0.0-0611a
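The three queries above can be seen in isolation by running the same grep filters against canned query output. The package names below are the examples from this section, plus one unrelated package to show that the filters exclude it.

```shell
# Self-contained demonstration of the grep filters, using canned
# rpm -qa output instead of a live PE system.
rpm_qa_sample() {
cat <<'EOF'
lapi_x86_32bit_base_IP_rh500-3.1.2.0-0611a
ppe_x86_base_32bit_rh500-5.1.0.0-0611a
IBM_pe_license-5.1.0.0-0611a
glibc-2.5-18
EOF
}

rpm_qa_sample | grep lapi     # LAPI product RPMs
rpm_qa_sample | grep ppe      # PE product RPMs
rpm_qa_sample | grep IBM_     # PE license RPM
```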
The poe command, which is included in the 32-bit PE base RPM, will generate a
list of all installed LAPI and PE product RPMs, when issued with the -v flag. In
the scenario given above, you should see output similar to the following when poe
-v is issued:
lapi_x86_32bit_base_IP_rh500-3.1.2.0-0611a
lapi_x86_64bit_IP_rh500-3.1.2.0-0611a
lapi_x86_32bit_US_rh500-3.1.2.0-0611a
lapi_x86_64bit_US_rh500-3.1.2.0-0611a
ppe_x86_base_32bit_rh500-5.1.0.0-0611a
ppe_x86_64bit_rh500-5.1.0.0-0611a
The perpms command, which is also supplied in the 32-bit PE base RPM, returns a
list of all installed LAPI and PE product RPMs, as well as the installed RPMs for
IBM products upon which LAPI and PE rely for certain functions. For more
information on the perpms command, see IBM Parallel Environment: Operation and
Use.
Removing a software component
You can remove any of the PE or LAPI RPMs manually, one RPM at a time, using
the rpm -e command with the name of the RPM you wish to remove. However,
because many of the PE and LAPI components depend on each other, you cannot
randomly delete any of these RPMs; they must be removed in the reverse order in
which they were installed. The RPMs must be deleted in the following order:
1. 64-bit PE RPM (if applicable)
2. 32-bit PE base RPM
3. PE License RPM
4. 64-bit LAPI US RPM (if applicable)
5. 32-bit LAPI US RPM (if applicable)
6. 64-bit LAPI IP RPM (if applicable)
7. 32-bit LAPI base IP RPM
For a complete list of RPMs, listed according to the associated hardware platform,
see “PE RPMs required for Linux installation” on page 7.
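The required order can be encoded once and replayed, which avoids dependency errors from rpm -e. The sketch below prints the removal commands using the x86 package names from this chapter as examples; adjust for your platform, and drop the echo to execute.

```shell
# Hypothetical sketch: remove the PE and LAPI RPMs in the reverse of
# their installation order. Printed as a dry run.
REMOVE_ORDER="
ppe_x86_64bit_rh500
ppe_x86_base_32bit_rh500
IBM_pe_license
lapi_x86_64bit_US_rh500
lapi_x86_32bit_US_rh500
lapi_x86_64bit_IP_rh500
lapi_x86_32bit_base_IP_rh500
"

removal_cmds() {
    for pkg in $REMOVE_ORDER; do
        echo "rpm -e $pkg"
    done
}

removal_cmds
```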
A shell script, pe_deinstall.sh, has been provided for removing all installed PE and
LAPI RPMs, including the PE license RPM. This script is provided with the PE
license RPM and is located in the /opt/ibmhpc/install/bin directory.
When you remove all PE and LAPI RPMs, various tasks are performed during the
removal process to restore the system to the state it was in before the RPMs were
installed. Among these tasks are:
v Removing PE-related entries from the /etc/services file
v Deleting the Partition Manager Daemon, and restarting the xinetd daemon
v Removing all symbolic links from /usr/bin to PE executables
v Removing all symbolic links from /usr/lib and /usr/lib64 to PE or LAPI libraries,
then running ldconfig to refresh the system library database
v Removing all symbolic links to miscellaneous PE or LAPI files
v Deleting all PE license files
For details on the changes made to your system as a result of installing PE, LAPI,
or PE license RPMs, see Chapter 6, “Understanding how installing PE alters your
system,” on page 53.
Customizing the message catalog
All PE filesets or RPMs use message cataloging so that messages can appear in
languages other than English. Each fileset or RPM has message catalogs installed
in a directory located by the NLSPATH environment variable. The message
catalogs are installed in three common English language paths and are in the
format of component.cat. The paths are:
For AIX:
v /usr/lib/nls/msg/C
v /usr/lib/nls/msg/En_US
v /usr/lib/nls/msg/en_US
For Linux:
v /usr/share/locale/en_US/pempl.cat
v /usr/share/locale/en_US/pepoe.cat
v /usr/share/locale/en_US/liblapi.cat
v /usr/share/locale/en_US/UTF-8.cat
1. Before verifying the installation for POE, you should set the LANG
environment variable to C.
2. If the message catalogs are installed in a directory other than C, modify
/etc/environment to set the NLSPATH to the appropriate directory. You also
need to set the user’s LANG environment variable.
Chapter 6. Understanding how installing PE alters your
system
Your system is altered when you install the various components of PE, and these
changes are different, depending on whether you are using PE for AIX or PE for
Linux.
If you are using PE for AIX, refer to “Understanding how installing PE for AIX
alters your system.” If you are using PE for Linux, refer to “Understanding how
installing PE for Linux alters your system” on page 58.
Understanding how installing PE for AIX alters your system
When you install the various PE software filesets, your system is altered.
Directories and files are created, the daemon processes are created, and links are
established by the installation process.
How installing the POE fileset alters your system
The ppe.poe fileset includes all of the components of the parallel operating
environment (POE), and consists of:
v API subroutine libraries (message passing and collective communication)
v Parallel compilation scripts
v The parallel utility library
v The partition manager
v The pdb debugger
Installing this fileset, as described in “Step 2: Perform the initial installation” on
page 21, does the following:
1. Creates the directories and files shown in Table 20:
Table 20. POE directories and files installed
Directory or file Description
/usr/lib/nls/msg/en_US/pempl.cat
/usr/lib/nls/msg/En_US/pempl.cat
/usr/lib/nls/msg/C/pempl.cat
Message Catalog for Message Passing Library
/usr/lib/nls/msg/en_US/pepoe.cat
/usr/lib/nls/msg/En_US/pepoe.cat
/usr/lib/nls/msg/C/pepoe.cat
Message catalog for POE
/usr/lpp/ppe.poe/bin/mpamddir Shell script for echoing an AMD mountable directory
name
/usr/lpp/ppe.poe/bin/mcp Executable for multiple file copy utility
/usr/lpp/ppe.poe/bin/mcpgath Executable for parallel file copy gather utility
/usr/lpp/ppe.poe/bin/mcpscat Executable for parallel file copy scatter utility
/usr/lpp/ppe.poe/bin/mpcc_r Shell script for compiling threaded parallel C programs
/usr/lpp/ppe.poe/bin/mpCC_r Shell script for compiling threaded parallel C++ programs
/usr/lpp/ppe.poe/bin/mpiexec Portable MPI startup script
/usr/lpp/ppe.poe/bin/mpxlf_r Shell script for compiling threaded parallel Fortran
programs
/usr/lpp/ppe.poe/bin/mpxlf90_r Shell script for compiling threaded parallel Fortran 90
programs
/usr/lpp/ppe.poe/bin/mpxlf95_r Shell script for compiling threaded parallel Fortran 95
programs
/usr/lpp/ppe.poe/bin/PEdeinstall Shell script to remove an installation of PE on IBM Power
Systems nodes
/usr/lpp/ppe.poe/bin/PEinstall Shell script to complete the installation process on IBM
Power Systems nodes
/usr/lpp/ppe.poe/bin/pmadjpri Dispatching priority adjustment coscheduler daemon.
This is shipped as a set-user-identity-on-execution binary
file to allow any system user to use this utility.
/usr/lpp/ppe.poe/bin/pmdv5 A daemon process that runs on each of your processor
nodes.
/usr/lpp/ppe.poe/bin/poe Partition manager executable
/usr/lpp/ppe.poe/bin/poeckpt Executable for checkpointing interactive POE applications
/usr/lpp/ppe.poe/bin/poerestart Executable for restarting POE applications
/usr/lpp/ppe.poe/bin/poekill Shell script for terminating all POE started tasks
/usr/lpp/ppe.poe/bin/pm_set_affinity Executable for task affinity assignment. This is shipped as
a set-user-identity-on-execution binary file to allow any
system user to use this utility.
/usr/lpp/ppe.poe/bin/rset_query Executable for displaying affinity resources
/usr/lpp/ppe.poe/bin/ppe_ke_load Kernel extension load routine for standalone POE task
affinity with OpenMP
/usr/lpp/ppe.poe/include Directory of header files containing declarations used by
other installed files
/usr/lpp/ppe.poe/include/pm_ckpt.h Header for compiling programs with Checkpoint and
Restart capability
/usr/lpp/ppe.poe/include/thread/mpi.mod MPI Fortran module support (USE MPI)
/usr/lpp/ppe.poe/include/thread64/mpi.mod MPI Fortran 64–bit module support (USE MPI)
/usr/lpp/ppe.poe/include/thread64/mpif.h Header for compiling 64–bit threaded MPI Fortran
applications
/usr/lpp/ppe.poe/include/xlfmod32/mpi.mod MPI Fortran 90 32–bit type-checking module
/usr/lpp/ppe.poe/include/xlfmod32/fmodname32.o MPI Fortran 90 32–bit object file
/usr/lpp/ppe.poe/include/xlfmod64/mpi.mod MPI Fortran 90 64–bit type-checking module
/usr/lpp/ppe.poe/include/xlfmod64/fmodname64.o MPI Fortran 90 64–bit object file
/usr/lpp/ppe.poe/lib/libmpi.a Archive library containing subroutines for parallel
message-passing programs
/usr/lpp/ppe.poe/lib/libmpi_r.a Archive library containing subroutines for parallel
message-passing programs in a threads environment
/usr/lpp/ppe.poe/lib/libppe.a
/usr/lpp/ppe.poe/lib/libppe_r.a
Archive library containing subroutines for POE
/usr/lpp/ppe.poe/lib Directory containing shared libraries and objects used by
POE and MPI programming interfaces.
/usr/lpp/ppe.poe/lib/libpoeapi.a Archive library containing subroutines for the POE API
/usr/lpp/ppe.poe/lib/hpc_cpuidmap_ke Kernel extension for standalone POE task affinity with
OpenMP
/usr/lpp/ppe.poe/READMES/poe.README Memo to users relating to this release
/usr/lpp/ppe.poe/samples Directory containing sample programs for the program
marker array and other samples
/usr/lpp/ppe.poe/include/poeapi.h Header file for the POE API
/usr/lpp/ppe.poe/include/thread/mpif.h Header file for compiling threaded MPI Fortran
applications
/usr/lpp/ppe.poe/samples/scripts/poewhere Script for displaying the stack trace for each thread of a
program
/usr/lpp/ppe.poe/samples/swtbl Directory containing sample code for running User Space
POE jobs without LoadLeveler
/usr/lpp/ppe.poe/samples/ntbl Directory containing sample code for running user space
jobs without LoadLeveler, using the network table API
/usr/lpp/ppe.poe/samples/nrt Directory that contains the sample code for running User
Space jobs on InfiniBand interconnects, without
LoadLeveler, using the network resource table API. See
“Configuring InfiniBand for User Space without
LoadLeveler (PE for AIX only)” on page 72 for more
information.
/etc/poe.security Security method configuration file
2. When the installp command successfully restores POE’s files from the
distribution medium, the command looks at the ppe.poe.post_i file for
post–installation steps. As part of these post–installation steps, ppe.poe.post_i
sets up the symbolic links, as shown in Table 21:
Table 21. ppe.poe.post_i symbolic links
This link: To:
/etc/pmdv5 /usr/etc/pmdv5
/usr/bin/mpcc /usr/lpp/ppe.poe/bin/mpcc_r
/usr/bin/mpcc_r /usr/lpp/ppe.poe/bin/mpcc_r
/usr/bin/mpCC /usr/lpp/ppe.poe/bin/mpCC_r
/usr/bin/mpCC_r /usr/lpp/ppe.poe/bin/mpCC_r
/usr/bin/mpamddir /usr/lpp/ppe.poe/bin/mpamddir
/usr/bin/mpxlf /usr/lpp/ppe.poe/bin/mpxlf_r
/usr/bin/mpxlf_r /usr/lpp/ppe.poe/bin/mpxlf_r
/usr/bin/mpxlf90 /usr/lpp/ppe.poe/bin/mpxlf90_r
/usr/bin/mpxlf90_r /usr/lpp/ppe.poe/bin/mpxlf90_r
/usr/bin/mpxlf95 /usr/lpp/ppe.poe/bin/mpxlf95_r
/usr/bin/mpxlf95_r /usr/lpp/ppe.poe/bin/mpxlf95_r
/usr/bin/mcp /usr/lpp/ppe.poe/bin/mcp
/usr/bin/mcpgath /usr/lpp/ppe.poe/bin/mcpgath
/usr/bin/mcpscat /usr/lpp/ppe.poe/bin/mcpscat
/usr/bin/mpiexec /usr/lpp/ppe.poe/bin/mpiexec
/usr/bin/pmdadjpri /usr/lpp/ppe.poe/bin/pmdadjpri
/usr/bin/poe /usr/lpp/ppe.poe/bin/poe
/usr/bin/poeckpt /usr/lpp/ppe.poe/bin/poeckpt
/usr/bin/poekill /usr/lpp/ppe.poe/bin/poekill
/usr/bin/poerestart /usr/lpp/ppe.poe/bin/poerestart
/usr/bin/rset_query /usr/lpp/ppe.poe/bin/rset_query
/usr/etc/pmdv5 /usr/lpp/ppe.poe/bin/pmdv5
/etc/pm_set_affinity /usr/lpp/ppe.poe/bin/pm_set_affinity
/usr/sbin/PEdeinstall /usr/lpp/ppe.poe/bin/PEdeinstall
/usr/sbin/PEinstall /usr/lpp/ppe.poe/bin/PEinstall
/usr/lib/hpc_cpuidmap_ke /usr/lpp/ppe.poe/lib/hpc_cpuidmap_ke
3. If an existing version of ppe.poe is installed, the following files are saved in
the /usr/lpp/save.config directory during installation of the new version:
/etc/poe.security
/usr/lpp/ppe.poe/bin/mpamddir
/usr/lpp/ppe.poe/bin/mpcc
/usr/lpp/ppe.poe/bin/mpCC
/usr/lpp/ppe.poe/bin/mpcc_r
/usr/lpp/ppe.poe/bin/mpCC_r
/usr/lpp/ppe.poe/bin/mpxlf
/usr/lpp/ppe.poe/bin/mpxlf90
/usr/lpp/ppe.poe/bin/mpxlf95
/usr/lpp/ppe.poe/bin/mpxlf_r
/usr/lpp/ppe.poe/bin/mpxlf90_r
/usr/lpp/ppe.poe/bin/mpxlf95_r
/usr/lpp/ppe.poe/lib/poe.cfg
/usr/lpp/ppe.poe/bin/makelibc
If these files were previously modified, the older versions are preserved in the
/usr/lpp/save.config directory and the new versions will need to be updated.
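The symbolic links in Table 21 can be spot-checked with a small helper that confirms each link exists and resolves to its expected target. It is demonstrated here on a throwaway pair under /tmp rather than the real /usr/bin links.

```shell
# Verify that a symbolic link exists and points at the expected target.
check_link() {
    link="$1"; target="$2"
    if [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; then
        echo "OK  $link -> $target"
    else
        echo "BAD $link"
    fi
}

# Demo fixture: a fake mpcc_r binary and an mpcc link, mirroring Table 21.
mkdir -p /tmp/linkdemo
: > /tmp/linkdemo/mpcc_r
ln -sf /tmp/linkdemo/mpcc_r /tmp/linkdemo/mpcc

check_link /tmp/linkdemo/mpcc /tmp/linkdemo/mpcc_r
```

To audit a real node, call check_link once per link/target pair from Table 21.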
POE installation effects
Also, as part of the post-installation steps, the following changes occur.
Note: For systems that use the InfiniBand switch, the Partition Manager daemon
inetd service is called pmv5.
1. The file /etc/services is modified in the following manner:
v If no entry for the pmv5 service is found, an entry is added using port
6128/tcp.
v If an entry exists for the pmv5 service that uses port 6128/tcp, no change is
made to the /etc/services file.
v If one of the following is true:
a. A pmv5 entry exists for a port other than 6128/tcp
b. A 6128/tcp entry exists for a service other than pmv5
you receive a warning and are instructed to correct the problem before
running POE. If you receive this warning, you must manually update the
/etc/services file to ensure that the port number for the pmv5 service is the
same on all machines that could run POE Version 5.1.
2. The /etc/inetd.conf file is modified. An entry for the pmv5 service that spawns
the /etc/pmdv5 daemon is created if no pmv5 entry exists.
3. inetd is refreshed.
4. With PE for AIX, if a symbolic link for /usr/etc/digd to /usr/lpp/ppe.vt/bin/digd
exists, but /usr/lpp/ppe.vt/bin/digd itself does not exist, the link is removed.
5. If /usr/lpp/X11/lib/X11/app-defaults/PMarray exists, it is removed, along with
the subdirectory if it is empty.
6. Executable versions of mcp, mcpgath, and mcpscat are created.
7. If /usr/lib/hpc_cpuidmap_ke does not exist, a symbolic link for it is made from
/usr/lpp/ppe.poe/lib/hpc_cpuidmap_ke.
8. The kernel extension for standalone POE affinity with OpenMP is loaded.
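The /etc/services consistency rules in step 1 can be checked mechanically. A sketch of such a check follows, run here against a throwaway fixture file rather than the real /etc/services; the message texts are assumptions.

```shell
# Hypothetical check for step 1 above: pmv5 must map to 6128/tcp, and no
# other service may claim that port.
check_pmv5() {
    svc="$1"
    if grep -Eq '^pmv5[[:space:]]+6128/tcp' "$svc" &&
       ! grep -v '^pmv5' "$svc" | grep -q '6128/tcp'; then
        echo "pmv5 entry OK"
    else
        echo "pmv5/6128 conflict: fix $svc before running POE"
    fi
}

# Fixture with a correct entry.
printf 'pmv5\t6128/tcp\n' > /tmp/services.test
check_pmv5 /tmp/services.test
```

Running the same check on every machine that could run POE helps ensure the pmv5 port number matches cluster-wide.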
How installing PDB alters your system
The PDB interactive command line debugger is composed of the ppe.shell fileset.
Installing this fileset, as described in “Step 2: Perform the initial installation” on
page 21, creates directories and files that are shown in Table 22.
The ppe.shell fileset includes the commands and executables for launching and
managing distributed processes interactively with the Distributed Interactive
Shell (DISH).
Table 22. PDB directories and files installed
Directory or file Description
/usr/lpp/ppe.shell/bin Directory that contains the commands and executables for
PDB.
/usr/lpp/ppe.shell/msg Directory that contains the message catalogs for PDB.
How installing the online documentation alters your system
The online documentation is composed of the following filesets:
v ppe.man: contains the PE man pages
This fileset completely replaces the contents of the ppe.pedocs fileset that existed
in earlier versions of PE.
Installing this fileset, as described in “Step 2: Perform the initial installation” on
page 21, creates the directories and files detailed in Table 23 on page 58.
When you migrate from earlier versions of the ppe.pedocs fileset, the files
previously installed are removed. The fileset is changed to an OBSOLETE state in
the SWVPD and ODM.
The ppe.man fileset includes files that contain the PE man pages, as described in
Table 23 on page 58. Once you have installed the ppe.man fileset, you can find the
man pages in the appropriate path: /usr/man/cat1 or /usr/man/cat3.
Table 23. Man page directories and files installed
Directory or file Description
/usr/man/cat1 Directory containing man page files for PE commands
/usr/man/cat3 Directory containing man page files for API
message-passing subroutines
/usr/lpp/ppe.man/READMES/ppe.man.README Installation readme file
Online documentation
To access the most recent Parallel Environment documentation in PDF and HTML
format, refer to the IBM Cluster information center (http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/index.jsp).
Both the current Parallel Environment books and earlier versions of the library are
also available in PDF format from the IBM Publications Center
(http://www.ibm.com/shop/publications/order/).
Understanding how installing PE for Linux alters your system
When you install the various PE software RPMs, your system is altered.
Directories and files are created, the daemon processes are created, and links are
established by the installation process.
How installing the PE license RPM alters your system
The PE license RPM contains the license acceptance script. It also contains a set of
electronic license files, one file for each supported language. These license files are
needed for installing the PE RPMs.
Table 24 lists the files and directories that are either created or installed as a result
of installing the PE license RPM:
Table 24. Directories and files associated with the PE license RPM
Directory or file Description
/opt/ibmhpc Directory that is created if the LAPI 32-bit
base IP RPM is not already installed
/opt/ibmhpc/install Directory that contains software installation
and deinstallation scripts, and the license
acceptance script
/opt/ibmhpc/install/bin/pe_install.sh PE installation script
/opt/ibmhpc/install/bin/pe_deinstall.sh PE deinstallation script
/opt/ibmhpc/install/sbin/accept_ppe_license.sh
License acceptance script
/opt/ibmhpc/ppe.poe/IBM_pe_product.SYS2 IBM PE license signature file
Table 25 lists the files and directories that are either created or installed as a result
of accepting the PE license agreement:
Table 25. Directories and files installed as a result of accepting the license agreement
Directory or file Description
/etc/opt/ibmhpc/license Directory that contains electronic licenses.
58 IBM PE for AIX and Linux V5 R1: Installation
PE license RPM installation effects
During the PE license installation process, the following actions occur:
1. The appropriate language license agreement file is installed, using the
Java-based IBM License Acceptance tool set.
2. You are given the opportunity to review and accept the license agreement.
3. The PE installation and deinstallation scripts are installed, and symbolic links
to these scripts are also created.
4. The IBM Java runtime code is cleaned up and removed.
How installing the PE and LAPI RPMs alters your system
The PE RPMs contain the executables, libraries, and scripts for POE, MPI, their
associated man pages, and various other files.
Table 26 lists the files and directories that are either created or installed as a result
of installing the PE RPMs:
Table 26. Directories and files associated with the PE RPMs
Directory or file Description
/opt/ibmhpc/ppe.poe/READMES/poe.README Installation readme file
/opt/ibmhpc/ppe.poe/bin Directory containing PE scripts and executables
/opt/ibmhpc/ppe.poe/include Directory containing header files
/opt/ibmhpc/ppe.poe/include/thread Directory containing threaded header files
/opt/ibmhpc/ppe.poe/include/xlfmod32/mpi.mod MPI Fortran 90 32-bit type-checking module
/opt/ibmhpc/ppe.poe/include/xlfmod32/fmodname32.o MPI Fortran 90 32-bit object file
/opt/ibmhpc/ppe.poe/include/xlfmod64/mpi.mod MPI Fortran 90 64-bit type-checking module
/opt/ibmhpc/ppe.poe/include/xlfmod64/fmodname64.o MPI Fortran 90 64-bit object file
/opt/ibmhpc/ppe.poe/lib/libmpi Directory containing 32-bit shared libraries
/opt/ibmhpc/ppe.poe/lib/libmpi64 Directory containing 64-bit shared libraries
/opt/ibmhpc/ppe.poe/msg/en_US Directory containing PE message catalogs
/opt/ibmhpc/ppe.poe/samples Directory containing sample programs
/opt/ibmhpc/ppe.poe/man/cat1 Directory containing PE command man pages
/opt/ibmhpc/ppe.poe/man/cat3 Directory containing PE subroutine man pages
/etc/poe.security Security configuration file
/etc/xinetd.d/pmv5 Partition Manager daemon inetd service
The LAPI RPMs contain the libraries, scripts, sample programs, messages and
various other files associated with LAPI.
Table 27 lists the files and directories that are either created or installed as a result
of installing the LAPI RPMs:
Table 27. Directories and files associated with the LAPI RPMs
Directory or file Description
/opt/ibmhpc/lapi/include Directory containing 32-bit header files
/opt/ibmhpc/lapi/include64 Directory containing 64-bit header files
/opt/ibmhpc/lapi/lib Directory containing libraries
/opt/ibmhpc/lapi/msg Directory containing LAPI message catalogs
Table 27. Directories and files associated with the LAPI RPMs (continued)
Directory or file Description
/opt/ibmhpc/lapi/samples Directory containing sample programs
Table 28 lists the symbolic links and files that are created when installing PE:
Table 28. Symbolic links created during PE RPM installation
This link: To:
/usr/bin/pe_install.sh /opt/ibmhpc/install/bin/pe_install.sh
/usr/bin/pe_deinstall.sh /opt/ibmhpc/install/bin/pe_deinstall.sh
/usr/bin/mpcc /opt/ibmhpc/ppe.poe/bin/mpcc
/usr/bin/mpCC /opt/ibmhpc/ppe.poe/bin/mpCC
/usr/bin/mpfort /opt/ibmhpc/ppe.poe/bin/mpfort
/usr/bin/perpms /opt/ibmhpc/ppe.poe/bin/perpms
/usr/bin/poe /opt/ibmhpc/ppe.poe/bin/poe
/usr/bin/poekill /opt/ibmhpc/ppe.poe/bin/poekill
/usr/bin/mpiexec /opt/ibmhpc/ppe.poe/bin/mpiexec (if it does not already
exist)
/usr/bin/mcp /opt/ibmhpc/ppe.poe/bin/mcp (created during
installation)
/usr/bin/mcpgath /opt/ibmhpc/ppe.poe/bin/mcpgath (created during
installation)
/usr/bin/mcpscat /opt/ibmhpc/ppe.poe/bin/mcpscat (created during
installation)
/usr/bin/cpuset_query /opt/ibmhpc/ppe.poe/bin/cpuset_query (created during
installation)
/etc/pmdv5 /opt/ibmhpc/ppe.poe/bin/pmdv5
/usr/share/man/cat1/* /opt/ibmhpc/ppe.poe/man/cat1/*
/usr/share/man/cat3/* /opt/ibmhpc/ppe.poe/man/cat3/*
/usr/share/locale/C/pempl.cat /opt/ibmhpc/ppe.poe/msg/en_US/pempl.cat
/usr/share/locale/C/pepoe.cat /opt/ibmhpc/ppe.poe/msg/en_US/pepoe.cat
/usr/share/locale/En_US/pempl.cat /opt/ibmhpc/ppe.poe/msg/en_US/pempl.cat
/usr/share/locale/en_US/pepoe.cat /opt/ibmhpc/ppe.poe/msg/en_US/pepoe.cat
/usr/share/locale/en_US.UTF-8/pempl.cat /opt/ibmhpc/ppe.poe/msg/en_US/pempl.cat
/usr/share/locale/en_US.UTF-8/pepoe.cat /opt/ibmhpc/ppe.poe/msg/en_US/pepoe.cat
/usr/lib/libmpi_ibm.so /opt/ibmhpc/ppe.poe/lib/libmpi/libmpi_ibm.so
/usr/lib/libpoe.so /opt/ibmhpc/ppe.poe/lib/libmpi/libpoe.so
/usr/lib64/libmpi_ibm.so /opt/ibmhpc/ppe.poe/lib/libmpi64/libmpi_ibm.so
/usr/lib64/libpoe.so /opt/ibmhpc/ppe.poe/lib/libmpi64/libpoe.so
Table 29 on page 61 lists the symbolic links and files that are created when
installing LAPI:
Table 29. Symbolic links created during LAPI RPM installation
This link: To:
/usr/lib/liblapi.so /opt/ibmhpc/lapi/lib/lapi32/liblapi.so
/usr/lib/liblapiudp.so /opt/ibmhpc/lapi/lib/lapi32/liblapiudp.so
/usr/lib64/liblapi.so /opt/ibmhpc/lapi/lib/lapi64/liblapi.so
/usr/lib64/liblapiudp.so /opt/ibmhpc/lapi/lib/lapi64/liblapiudp.so
PE 32-bit base RPM installation effects
During the installation process, the following changes occur:
v If /usr/bin/mpiexec does not exist, a symbolic link is made to
/opt/ibmhpc/ppe.poe/bin/mpiexec.
v The PE man pages are linked to /usr/share/man/cat1 and /usr/share/man/cat3.
v The /etc/poe.security file is created.
v The /etc/services file is modified in the following manner:
1. If no entry for the pmv5 service is found, an entry is added using port 6128.
2. If an existing entry for pmv5 is found, a message is issued that the entry
already exists. In this case, the system administrator needs to manually
update the /etc/services file to ensure the port number for the pmv5 service
is the same on all nodes that could run PE Version 5.
v If the /etc/xinetd.d/pmv5 file does not exist, it is created with the following
format:
socket_type = stream
wait = no
user = root
server = /etc/pmdv5
disable = no
v The xinetd daemon is restarted, so that it picks up the changes that were
made to the /etc/services file.
v The executable versions of the parallel file copy utilities (mcp, mcpgath, and
mcpscat) are created.
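The /etc/services handling described above can be sketched as follows. The logic is inferred from the text; here it runs against a scratch copy so that the real /etc/services file is untouched:

```shell
# Sketch of the pmv5 check performed at installation (logic inferred from the
# text above; runs against a scratch copy, not the real /etc/services)
services=$(mktemp)
echo 'ssh 22/tcp' > "$services"
if grep -q '^pmv5[[:space:]]' "$services"; then
    # an existing entry is left alone; the administrator must make sure the
    # port matches on every node that could run PE Version 5
    echo "pmv5 entry already exists"
else
    echo 'pmv5 6128/tcp' >> "$services"   # port 6128, per the text
fi
cat "$services"
```
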
Chapter 7. Additional information for the system administrator
System administrators should familiarize themselves with the formats of the PE
files that they will create and edit (in /etc). These files are used for overriding
default environment variable values and choosing a security method. For PE for
AIX, they are also used for configuring the coscheduler and enabling RDMA.
Using the /etc/poe.limits file
An optional file named poe.limits can be created in the /etc directory, enabling the
system administrator to override the default values for certain POE environment
variables, and to limit the value set by a user. This is useful in cases where the
environment variable default values might cause problems on a particular node.
For example, if a node had only 64M of real memory, the default value of 64M for
MP_BUFFER_MEM would be too high. To correct this problem, the system
administrator would specify a lower value for MP_BUFFER_MEM in the
/etc/poe.limits file on that node. Note that if a value is set for MP_BUFFER_MEM,
it must be set to the same value in the /etc/poe.limits file on every node.
Entries in the /etc/poe.limits file
Entries in the /etc/poe.limits file must be in the form:
supported_object = value
where supported_object is currently limited to the following:
v MP_BUFFER_MEM
v MP_CC_BUF_MEM
v MP_USE_LL
For MP_BUFFER_MEM or MP_CC_BUF_MEM, you can provide a value in one of
two ways:
v Specify a single value to indicate the pool size for memory to be allocated at
MPI initialization time and dedicated to buffering of early arrivals.
v Specify two values. The first value indicates the pool size for memory to be
allocated at MPI initialization time (pre_allocated_size). The second value
indicates an upper bound of memory to be used if the pre-allocated pool is not
sufficient (maximum_size). Note that when you specify two values, you must
delineate them with a comma. Spaces before or after the comma are not allowed.
If you omit the first value (start the value string with a comma), the
pre_allocated_size will be set to the default (64 MB for MP_BUFFER_MEM or 4
MB for MP_CC_BUF_MEM).
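As an illustration, a hypothetical /etc/poe.limits file using both forms follows; the specific sizes and the "M" suffix are illustrative assumptions, and remember that the values must be identical on every node:

```
# hypothetical /etc/poe.limits
MP_BUFFER_MEM = 32M,128M
MP_CC_BUF_MEM = ,8M
MP_USE_LL = yes
```

The second entry starts with a comma, so the pre_allocated_size for MP_CC_BUF_MEM takes its default while the maximum_size is set explicitly.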
Note also:
v If the value of MP_BUFFER_MEM or MP_CC_BUF_MEM is set in
/etc/poe.limits on one node, the same value must be specified as an entry in
/etc/poe.limits on all other nodes. If the nodes are set to different values, some
jobs may fail.
v If the preallocated size of MP_BUFFER_MEM or MP_CC_BUF_MEM is set to
less than the default (64 MB for MP_BUFFER_MEM, 4 MB for
MP_CC_BUF_MEM), this smaller value becomes the default. If the preallocated
value is set to larger than the default, this larger value becomes the limit to
which the MPI library sets the preallocated size (but the default remains 64 MB
or 4 MB).
For more information about specifying values for MP_BUFFER_MEM or
MP_CC_BUF_MEM, see IBM Parallel Environment: Operation and Use.
Note: Any line in the /etc/poe.limits file with the character # or ! in the first
column is treated as a comment.
How the Partition Manager daemon handles the /etc/poe.limits
file
If the /etc/poe.limits file has been set up on a particular node, the Partition
Manager daemon (pmdv5) on that node performs the following:
1. Compares the values specified in the /etc/poe.limits file against the
environment variables received from the home node
2. If necessary, resets the environment variables as follows:
MP_BUFFER_MEM
If the value in the environment exceeds the value specified in
/etc/poe.limits, pmdv5 resets the value to that specified in
/etc/poe.limits.
MP_USE_LL
If the value in the file is set to yes and POE determines that the job is
not being run under LoadLeveler, the job is terminated. Setting the
value to no has no effect.
3. If a supported_object is specified in /etc/poe.limits but is not set in the
environment, sets the value to that specified in /etc/poe.limits
Note: If the /etc/poe.limits file contains entries with either unsupported objects to
the left of the equal sign or with invalid (nonnumeric for
MP_BUFFER_MEM and MP_CC_BUF_MEM) values to the right, the
Partition Manager daemon flags these entries in the pmdlog for that node.
The Partition Manager daemon also uses the pmdlog to indicate when a
supported_object has been set or reset in the environment.
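For MP_BUFFER_MEM, the reconciliation described above amounts to clamping logic, sketched below. This is a simplification for illustration; pmdv5's actual implementation is not shown in this book, and the byte values are made up:

```shell
# Simplified sketch of how pmdv5 reconciles the environment received from the
# home node with the value in /etc/poe.limits (values are illustrative)
MP_BUFFER_MEM=134217728                 # value received from the home node (128 MB)
limit=67108864                          # value from /etc/poe.limits (64 MB)
if [ -z "${MP_BUFFER_MEM:-}" ]; then
    MP_BUFFER_MEM=$limit                # not set: take the value from the file
elif [ "$MP_BUFFER_MEM" -gt "$limit" ]; then
    MP_BUFFER_MEM=$limit                # exceeds the file's value: reset it
fi
echo "$MP_BUFFER_MEM"
```
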
Description of /etc/poe.security
The /etc/poe.security file is used to define the security configuration method
enacted on each node, and consists of a simple ASCII text entry.
For AIX, this text entry can have one of the following values:
1. COMPAT - where the previously defined AIX- or DCE-based security will be
used for compatibility (this is the default).
2. CTSEC - where the cluster-based CTSec security methods are to be used.
For Linux, this text entry can have only one value: COMPAT.
The contents of the /etc/poe.security file are case insensitive; leading and
trailing spaces and blank lines are allowed. POE verifies that the appropriate
method specified in /etc/poe.security is configured on each node. For instance,
when CTSec is enabled, it ensures the CTSec libraries are installed.
For AIX, the /etc/poe.security file is shipped with POE, with COMPAT as the
default.
For Linux, the /etc/poe.security file is created during installation, and there is no
need to modify its contents. POE verifies that the appropriate method is the same
and is configured on each node.
This file is owned and writable only by root, so only system administrators with
root-level access can update it. The method specified must be the same on all of
the nodes in a parallel job; methods cannot be mixed. The lack of an entry in
/etc/poe.security (or the lack of the file altogether) is an error. Only one value
is expected. If multiple values or invalid values are specified, POE terminates
with an error message.
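The file's entire content is thus a single keyword; on a node using the default, it would simply contain:

```
COMPAT
```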
Configuring the Parallel Environment coscheduler
The PE coscheduler works by alternately, and synchronously, raising and lowering
the AIX dispatch priority or the Linux scheduling policy and priority values of the
tasks in an MPI job. The objective is for all the tasks to have the same priority
across all processors, and to force other system activity into periodic and aligned
time slots during which the MPI tasks do not actively compete for CPU resources.
The synchronous operation of the coscheduler is effective only if there is a global
time source. On AIX, an operational High Performance Switch provides this
capability. In other environments, the coscheduler uses the local time on the node
to determine when to change priorities.
The pmadjpri executable is shipped as a set-user-identity-on-execution binary,
which allows any system user to invoke the coscheduler on their behalf.
For AIX, there are two components of the PE coscheduler support: the POE
coscheduling parameters and limits, and the AIX dispatcher tuning parameters.
POE coscheduling parameters and limits
A coscheduler activation is specified by the following:
v The high (favored) priority value.
v The low (unfavored) priority value.
v The percentage of time that the MPI tasks will be set to their favored priority
value.
v The period of alternation.
The values above can be specified on a per-user or class basis, with limits set by
the system administrator in either case. The limits and classes are defined in the
/etc/poe.priority file. It is assumed that this file is the same on each node in the
cluster.
The range of parameters permitted in the adjustment record is purposely set to be
as unrestricted as possible. The user and system administrator (who owns the
configuration file) must evaluate the effect of various parameter settings in their
own operating environment. Carefully read the notes accompanying the file format
description. The following are descriptions of the parameters.
username
name of the user. An asterisk (*) can be used as a wildcard; a wildcard
entry sets the defaults for any user that does not have an explicit entry
defined. When the file contains an entry for a specific user, that entry
constrains the values for that user, regardless of the wildcard values.
classname
name assigned to the class; the MP_PRIORITY environment variable can be
set to this value. Additional constraints can be defined using the
MAXIMUM and MINIMUM class entries, on a first-match basis, where the
first match for that user takes precedence. A MAXIMUM or MINIMUM
can also be defined for a particular user, meaning that user cannot exceed
those values.
hipriority
the dispatching priority assigned to the favored portion of the cycle.
lopriority
the dispatching priority assigned to the rest of the cycle.
percenthi
the percentage of the cycle during which the job runs at hipriority.
period length of adjustment cycle, in seconds.
For AIX users, records can be in the following format:
# user class hipri lopri percentage period
# ---- ------- ----- ----- ---------- ------
pfc special 40 100 90.5 5
* ten50 50 100 90 5
* MAXIMUM 100 100 97 10
* MINIMUM 40 60 0 1
trj ten40 40 100 90 5
For Linux users, records can be in the following format:
# user class hipri lopri percentage period
# ---- ------- ----- ----- ---------- ------
ibm special 3 0 90 5
* MAXIMUM 5 0 99 10
* MINIMUM 3 0 0 1
trj two80 2 0 80 6
Furthermore, the first match policy also applies to the case where there are multiple
entries for a user - the first entry found for that user will take precedence.
For example, consider the following entries:
# user class hipri lopri percentage period
# ---- ------- ----- ----- ---------- ------
trj MAXIMUM 90 100 97 10
* MAXIMUM 100 100 97 10
ibm MAXIMUM 80 100 97 10
v user trj cannot go above 90 for the hipri value
v everyone else (including user ibm) can go up to 100
v user ibm’s entry, because it follows the wildcarded MAXIMUM, is ignored
The MP_PRIORITY environment variable may be specified in one of two forms:
v a class name as the only value, or
v a colon separated list of values specified by the user, for the key parameters, in
the following format:
hipriority:lopriority:percentage:period
The values specified or implied by the MP_PRIORITY variable will be evaluated
against the MAXIMUM and MINIMUM settings in the /etc/poe.priority file, and
they will only take effect under the following conditions:
v when a MAXIMUM setting is specified in the file, and each value in the
environment variable is less than or equal to the corresponding value in the file.
v when a MINIMUM setting is specified in the file, and each value in the
environment variable is greater than or equal to the corresponding value in the
file.
Comments are allowed in the file; everything following a # sign is ignored.
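For example, either form of MP_PRIORITY can be exported before invoking POE. The ten50 class and its values come from the sample AIX records above:

```shell
# Class form: the class must be defined in /etc/poe.priority
export MP_PRIORITY=ten50

# Explicit form: hipriority:lopriority:percentage:period
# (these values match the ten50 class in the sample records)
export MP_PRIORITY=50:100:90:5
echo "$MP_PRIORITY"
```
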
For AIX users, the following notes apply:
v The normal dispatching priority is 60. If both hipriority and lopriority are set to
values less than 60, a compute bound job will prevent other users from being
dispatched.
v The hipriority value must be equal to or greater than 12. If the value is between
12 and 20, the job competes with system processes for cycles, and may disrupt
normal system activity.
v If the hipriority value is less than 30, keystroke capture will be inhibited during
hipriority portion of the dispatch cycle.
v If hipriority is less than 16, the job will not be subject to the scheduler during
the high priority portion of the cycle.
v The lopriority value must be less than or equal to 254.
v If the hipriority value is less than (more favored than) the priority of the IBM
High Performance Switch fault-service daemon, and if the low priority portion
of the adjustment cycle is less than two seconds, then switch fault recovery will
be unsuccessful, and the node will be disconnected from the switch.
v The coscheduling process allows programs using the User Space library to
maximize their effectiveness in interchanging data. The process may also be used
for programs using IP, either over the switch or over another supported device.
However, if the high priority phase of the user’s program is more favored than
the network processes (typically priorities 36-39), the required IP message
passing traffic may be blocked and cause the program to hang.
v Consult the include file /usr/include/sys/pri.h for definitions of the priorities
used for normal AIX functions.
v The parameter file /etc/poe.priority defines the scheduling parameters for tasks
running on that node.
v The MP_PRIORITY_LOG environment variable and -priority_log POE
command line option may be used to log messages and diagnostic information
to the POE priority adjustment coscheduler log file, in /tmp/pmadjpri.log on
each of the remote nodes. The default value is no (disable coscheduler logging).
If you wish to store the priority adjustment log file in a location other than /tmp,
you can specify a different directory with the MP_PRIORITY_LOG_DIR
environment variable. Also, if you wish to give the file a name other than
pmadjpri.log, you can do so with the MP_PRIORITY_LOG_NAME
environment variable. See IBM Parallel Environment: Operation and Use for more
information about these environment variables.
v The MP_PRIORITY_NTP environment variable determines whether the POE
priority adjustment coscheduler will turn NTP off during the priority adjustment
period, or leave it running. The value of no (which is the default) instructs the
POE coscheduler to turn the NTP daemon off (if it was running) and restart
NTP later, after the coscheduler completes. Specify a value of yes to inform the
coscheduler to keep NTP running during the priority adjustment cycles (if NTP
was not running, NTP will not be started). The value of this environment
variable can be overridden using the -priority_ntp flag.
For Linux users, the following notes apply:
v It is important to understand the Linux real time priority scheduling policies
and the possible impacts on a system by assigning a real time priority policy to
a process. It is recommended that you consult the man pages for the following
function calls, as well as any other relevant information:
– sched_setscheduler
– sched_setparam
– sched
– sched_get_priority_min
– sched_get_priority_max.
v The POSIX standard range for scheduling priority values is 0 to 99. Processes with
numerically higher priority values are scheduled before processes with
numerically lower priority values, but this might vary depending on your Linux
distribution. The actual priority values cannot exceed the system-defined values.
v The lopriority value is ignored; however, it still requires a value to be
assigned (specify 0). The low-priority portion of the cycle is fixed to a
scheduling policy of SCHED_OTHER, with priority 0 and a nice value of 19.
This is to allow other system
processes access to the CPUs during the low priority window.
v The hipriority value must not be greater than 29. If this value is greater than 29, the
job competes with system processes for cycles and disrupts normal system
activity.
v If the number of tasks per node exceeds the number of CPUs available for use,
keystroke capture and interactive processes could be inhibited, causing the
system to appear inactive. The system could remain in this state until the job
completes. For more information, refer to your Linux distribution’s real-time
priority scheduling man pages.
v The default scheduling policy used by the Linux coscheduler is SCHED_RR.
v The parameter file /etc/poe.priority defines the scheduling parameters for tasks
running on that node.
v The MP_PRIORITY_LOG environment variable and -priority_log POE
command line option may be used to log messages and diagnostic information
to the POE priority adjustment coscheduler log file, in /tmp/pmadjpri.log on
each of the remote nodes. The default value is no (disable coscheduler logging).
If you wish to store the priority adjustment log file in a location other than /tmp,
you can specify a different directory with the MP_PRIORITY_LOG_DIR
environment variable. Also, if you wish to give the file a name other than
pmadjpri.log, you can do so with the MP_PRIORITY_LOG_NAME
environment variable. See IBM Parallel Environment: Operation and Use for more
information about these environment variables.
AIX dispatcher tuning (PE for AIX only)
The AIX dispatcher is tuned by setting parameters of the schedo command. The
two parameters of particular interest to the coscheduler are:
big_tick_size
Sets the scheduling time slice interval, in units of 10 milliseconds. The
default is 1 (corresponding to the normal AIX 10 millisecond time slice).
The value can be as large as 100, which would make the interval between
dispatcher activations 1000 milliseconds (one second). The value must also
divide evenly into 100.
Between activations, tasks running on a processor are not examined for
replacement unless they do I/O or voluntarily yield the processor. Because
running the dispatcher itself takes some time, increasing the value of the
big_tick_size parameter reduces the overhead for dispatching, but may not
provide CPU cycles to some system activities as often as they would
desire.
force_grq
If enabled, assigns all processes that are not part of a PE/MPI job to the
Global Run Queue. The intention is to allow all non-MPI activity to
compete equally for the block of CPU resource that becomes available
periodically. Without setting this option, non-MPI processes may queue up
for the processor they used previously, even if that processor is busy and
another processor is idle.
Note:
1. These options are only fully effective if the AIX kernel is running with
the real time option, which is enabled by:
v bosdebug -R on
v bosboot -a (assuming that the existing kernel is to be used)
v shutdown -Fr (to reboot the node).
After the reboot is complete, the presence of the real time option may be
verified by displaying the value of the symbol rt_kernel from the kdb
debugger. If it is nonzero, the real time option has been successfully
enabled.
2. Setting the big_tick_size option to a value other than 1, in combination
with the real time option, has the side effect of synchronizing the
dispatcher activations on a node, so that all processor time slices end at
the same time. This is in contrast to the normal operation of the AIX
dispatcher, in which the time slice ends are deliberately offset within the
10 millisecond period to minimize contention for locks on AIX control
structures. Also, the time slice ends are synchronized to the AIX system
clock, so that one of the time slices ends at an even number of seconds.
In other words, the fractional seconds must be zero (HH:MM:SS:00). The
time-of-day synchronization of the time slices only occurs if
big_tick_size is greater than 1.
3. Returning big_tick_size to 1 does not reset this time slice offset, which
persists for the life of the kernel session.
4. Changes to big_tick_size and force_grq can only be made by a root user,
and take effect immediately without a reboot. If force_grq is set to zero,
the normal AIX mechanism of trying to reassign a process to its previous
processor is resumed.
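Collected in one place, the root-user steps above might look like the following command transcript. These are AIX-only commands, shown for illustration; the schedo invocations assume the usual -o parameter=value syntax:

```
bosdebug -R on               # enable the real time kernel option
bosboot -a                   # rebuild the boot image for the existing kernel
shutdown -Fr                 # reboot the node
# after the reboot, as root:
schedo -o big_tick_size=10   # 100 ms time slices; must divide evenly into 100
schedo -o force_grq=1        # send non-MPI processes to the Global Run Queue
```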
Enabling Remote Direct Memory Access (RDMA)
Remote Direct Memory Access (RDMA) is a mechanism that allows large
contiguous messages to be transferred while reducing the message transfer
overhead. It is used with data striping and bulk data transfer.
If you are using the IBM High Performance Switch, RDMA may be used either
implicitly or explicitly. However, if you are using the InfiniBand interconnect, only
implicit RDMA is currently supported.
Enabling RDMA for use with the IBM High Performance Switch
(PE for AIX only)
If you are using the IBM High Performance Switch, RDMA may be used either
implicitly or explicitly. To use implicit RDMA, MP_USE_BULK_XFER must be set
to YES, which causes all MPI or LAPI messages that are larger than some
threshold to use the bulk transfer or implicit RDMA path. If necessary,
MP_USE_BULK_XFER can be overridden with the command line option,
-use_bulk_xfer.
Explicit RDMA is only available to LAPI programs that use the rCxt resources
requested by LoadLeveler. MP_RDMA_COUNT is used to specify the number of
user rCxt blocks. This number represents the total number of rCxt blocks required
by the application program, and is determined by taking the number of remote
handles that the program requires, dividing by 128, and adding 2. MP_RDMA_COUNT
supports the specification of multiple values when multiple protocols are involved.
Note that the MP_RDMA_COUNT/–rdma_count option signifies the number of
rCxt blocks the user has requested for the job, and LoadLeveler determines the
actual number of rCxt blocks that will be allocated for the job. POE will use the
value of MP_RDMA_COUNT to specify the number of rCxt blocks requested on
the LoadLeveler MPI and/or LAPI network information when the job is submitted.
The number of rCxt blocks will be the same for every window of the same
protocol.
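The rCxt calculation described above (remote handles divided by 128, plus 2) can be checked with a quick shell computation; the handle count here is made up:

```shell
# Hypothetical application needing 500 remote handles
remote_handles=500
rcxt_blocks=$(( remote_handles / 128 + 2 ))   # integer division: 3, plus 2
echo "MP_RDMA_COUNT=$rcxt_blocks"
```
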
See the section on using RDMA in Parallel Environment for AIX: Operation & Use for
more detailed information.
Note that the values of MP_RDMA_COUNT and MP_USE_BULK_XFER are only
significant for interactive jobs. For jobs that are submitted directly to LoadLeveler,
the LoadLeveler keywords take precedence.
Before RDMA can be used with POE, the administrator and the end user need to
perform the following tasks:
v Set the SCHEDULE_BY_RESOURCES = RDMA keyword, in the LoadLeveler
configuration file. SCHEDULE_BY_RESOURCES specifies which consumable
resources are considered by the LoadLeveler schedulers. For more information,
see IBM LoadLeveler: Using and Administering.
Note that you can confirm which nodes have been enabled by using the
LoadLeveler command llstatus -R. In the following example output for the
llstatus -R command, the f4rp02 node is not enabled for RDMA:
a [f4rp02]kgoin>llstatus -R
Machine Consumable Resource(Available, Total)
------------------------------ -------------------------------------------------
f3rp01.ppd.pok.ibm.com RDMA(4,4)+<
f3rp02.ppd.pok.ibm.com
f4rp03.ppd.pok.ibm.com suiteshare(16,16) RDMA(4,4)+<
f4rp04.ppd.pok.ibm.com RDMA(4,4)+<
Resources with "+" appended to their names have the Total value reported from
Startd.
Resources with "<" appended to their names were created automatically.
a [f4rp02]kgoin>
In addition to the system administration tasks, the user must also enable bulk
transfer, as follows:
If you are an interactive user, set the MP_USE_BULK_XFER environment variable
to yes:
MP_USE_BULK_XFER=yes
The default setting for MP_USE_BULK_XFER is no. See IBM Parallel Environment
for AIX: Operation and Use for more information about setting
MP_USE_BULK_XFER.
If you are a batch JCF user, specify:
# @ bulkxfer = true
Enabling RDMA for use with the InfiniBand interconnect
PE supports implicit RDMA over the InfiniBand interconnect with either the AIX
or Linux operating system. PE requires the use of Reliable Connected Queue Pairs
(RC QPs) to establish adapter resources, and LoadLeveler manages those resources
on behalf of the application. POE interacts with LoadLeveler to determine the
resources allocated, and then passes that information to MPI and LAPI. For more
information, see IBM Parallel Environment: Operation and Use.
The system administrator must perform the following tasks to enable the use of
RDMA over the InfiniBand interconnect:
v Set the SCHEDULE_BY_RESOURCES = RDMA keyword, in the LoadLeveler
configuration file. SCHEDULE_BY_RESOURCES specifies which consumable
resources are considered by the LoadLeveler schedulers. For more information,
see IBM LoadLeveler: Using and Administering.
Note that you can confirm which nodes have been enabled by using the
LoadLeveler command llstatus -R. In the following example output for the
llstatus -R command, the f4rp02 node is not enabled for RDMA:
a [f4rp02]kgoin>llstatus -R
Machine Consumable Resource(Available, Total)
------------------------------ -------------------------------------------------
f3rp01.ppd.pok.ibm.com RDMA(4,4)+<
f3rp02.ppd.pok.ibm.com
f4rp03.ppd.pok.ibm.com suiteshare(16,16) RDMA(4,4)+<
f4rp04.ppd.pok.ibm.com RDMA(4,4)+<
Resources with "+" appended to their names have the Total value reported from
Startd.
Resources with "<" appended to their names were created automatically.
a [f4rp02]kgoin>
After the system administrator has enabled RDMA, users must perform the
following tasks to use it. For more information, see IBM Parallel Environment:
Operation and Use.
v Verify that MP_DEVTYPE is set to ib.
v Request the use of bulk transfer using either the MP_USE_BULK_XFER
environment variable or the LoadLeveler JCF #@ bulkxfer keyword.
v Set the minimum message length for bulk transfer with the
MP_BULK_MIN_MSG_SIZE environment variable.
v Set the MP_RC_MAX_QP and MP_RC_USE_LMC environment variables, as
appropriate for the installation.
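The user-side settings in the list above can be sketched as a small environment setup. The specific values shown (message size, QP count, LMC setting) are illustrative examples, not recommendations for any particular installation:

```shell
# Illustrative user environment for RDMA over InfiniBand; the values
# below are examples only, not site defaults.
export MP_DEVTYPE=ib                 # select the InfiniBand device type
export MP_USE_BULK_XFER=yes          # request bulk (RDMA) transfer
export MP_BULK_MIN_MSG_SIZE=65536    # example: messages of 64 KB or more use bulk transfer
export MP_RC_MAX_QP=4096             # example cap on RC Queue Pairs
export MP_RC_USE_LMC=no              # example LMC setting
echo "MP_DEVTYPE=$MP_DEVTYPE MP_USE_BULK_XFER=$MP_USE_BULK_XFER"
```

A batch user would instead put # @ bulkxfer = true in the LoadLeveler JCF, as described in the referenced documentation.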
Configuring InfiniBand for User Space without LoadLeveler (PE for AIX
only)
System administrators can use the Network Resource Table (NRT) application
programming interfaces to configure the system to allow User Space jobs to run
without LoadLeveler.
In addition to the NRT API, PE includes a set of sample programs for your use.
The sample programs provide a simple example of how POE-MPI or POE-LAPI
User Space jobs can be started without LoadLeveler. These sample programs are
located in the /usr/lpp/ppe.poe/samples/nrt directory. These sample programs do
not suggest the only way or the best way of using the Network Resource Table
(NRT) Application Programming Interfaces (API); they serve as one way to use the
NRT APIs in an alternative or test environment, with other resource managers.
Note that the NRT API and sample programs are only intended for use with the
InfiniBand interconnect on AIX.
Warning: Be very careful when running the sample code. The system administrator
should carefully monitor the use of these programs, particularly nrt_api, which
may be used to load and unload network tables. It is suggested that you use these
programs on a set of nodes that have been set aside for testing.
Compiling and installing the NRT API samples
Before you can run MPI or LAPI User Space applications, without LoadLeveler, the
system administrator must first compile the NRT API samples and make them
available for use. This step only needs to be done once on each node.
Note that the system administrator may not choose to make the NRT API samples
generally available.
At a high level, compiling and installing the NRT API samples includes the
following steps:
1. Locate the sample programs, which are in the /usr/lpp/ppe.poe/samples/nrt
directory. The set of sample programs includes a C program, makefile, shell
script, and readme file.
2. Use the makefile to build the program called nrt_api.
3. Install the executable in a convenient location, typically one in users' paths.
4. Change the nrt_api executable to a set-user-identity-on-execution binary, with
owner set as root (chmod 4755 nrt_api). This allows any system user to invoke
the utility to load or unload the network resource tables on their behalf. Note
that the system administrator must perform this task.
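Step 4 above relies on the set-user-ID bit. The following sketch shows the effect of chmod 4755 on a scratch file; my_nrt_api is a stand-in name, not the real binary:

```shell
# Show the permission string that chmod 4755 produces; the "s" in the
# owner-execute position marks the set-user-ID bit the step requires.
tmpdir=$(mktemp -d)
touch "$tmpdir/my_nrt_api"           # stand-in for the installed nrt_api binary
chmod 4755 "$tmpdir/my_nrt_api"
mode=$(ls -l "$tmpdir/my_nrt_api" | cut -c1-10)
echo "$mode"                         # -rwsr-xr-x
```

In production the ownership matters as well: the set-user-ID bit only grants root privileges when the file is owned by root.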
The POE NRT API samples comprise just one method for loading network table
resources for User Space jobs in the absence of LoadLeveler. You may choose to
create your own mechanism that uses the externally published NRT API
interfaces for loading network table resources in place of LoadLeveler. The same
general principles apply to programs you write for loading and unloading the
network tables:
v The system administrator should carefully monitor their use.
v They must be defined as set-user-identity-on-execution binary programs, with
owner set as root (chmod 4755 my_nrt_api_pgm). This allows any system user to
invoke the program to load or unload the network resource tables on their
behalf.
v They must be available in the user’s executable path.
v These samples are simple programs intended to illustrate some of the ways
in which you can take advantage of RSCT’s NRT APIs for alternative resource
managers. These programs should not be used in a production environment:
they serve only as a guide for customers to develop and test their own programs
that will utilize the NRT APIs.
v Under no circumstances should these sample programs and scripts be packaged,
combined, or redistributed with third party products outside of IBM or IBM’s
Parallel Environment for AIX product. Customers assume all risks and technical
support for modified versions of these samples, and IBM makes no guarantee
against changes that may affect migration or coexistence with future IBM
hardware or software products.
v These samples are not intended as programming interfaces; therefore, users of
the samples cannot expect continued or ongoing support for the sample
programs. The samples may be changed or discontinued at any time in the
future.
For more information on the Network Table APIs, see IBM Reliable Scalable Cluster
Technology: NRT API Programming Guide.
Chapter 8. Syntax of commands for running installation and
deinstallation scripts
PE provides two scripts for installing and deinstalling Parallel Environment; one
for AIX and one for Linux.
If you are using PE for AIX, you can use PEinstall to install the PE filesets on IBM
Power Systems nodes. You can also use PEdeinstall to automatically remove all of
the PE filesets that were previously installed.
If you are using PE for Linux, you can use pe_install.sh to perform new
installations, upgrade installations, or installation fixes for various RPM packages
provided with PE. You can also use pe_deinstall.sh to automatically remove all
installed PE and LAPI RPMs, including the license RPM.
PE for AIX installation script: PEinstall
You can use the PEinstall script to install the PE filesets on IBM Power Systems
nodes using the Remote Shell (rsh).
To run the PEinstall script, first set up a host list file of all nodes on which you
want to install a particular fileset. You must have /usr resident. The PEinstall script
either mounts or copies the installation image to each node in the list, and then
executes the proper installp command to install the product, including
automatically accepting the product license.
The PEinstall script has one required parameter and two optional parameters. The
syntax is:
PEinstall image_name [host_list_file] [-copy | -mount]
Where:
image_name
is required. It specifies the name of the file that contains the installp image,
of which the PE fileset is a part.
host_list_file
is optional and specifies the name of the file containing the list of nodes on
which you want to install the fileset. The default file name is host.list in
the current working directory. If a host list file cannot be found, the script
exits with an error message.
You can specify either -copy or -mount to tell PEinstall to copy or mount the
installation image to each node. The default is -copy.
Copying the installation image
Using the -copy option (or allowing it as the default) informs PEinstall to copy the
named image to each node using rcp. You are prompted for the following
information when you specify -copy (or allow it by default):
v The installation image source directory. The default is /usr/sys/inst.images.
v The installation image destination directory which is used for all nodes in the
node list. The default is /usr/sys/inst.images.
Note: To have the image copied to different directories, invoke PEinstall for
each different location or set of locations. Your host.list file should reflect
only those nodes that you want to use with -copy.
The image is copied to the destination directory with the name specified as the
image_name parameter. Be sure there is enough space in the destination directory
file system for the image. Each image occupies approximately three megabytes.
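One way to picture the -copy flow described above is as a loop over the host list. This is a simplified dry-run illustration, not the actual PEinstall source; the image name, node names, and installp flags are representative examples, and echo replaces the real commands so the loop only prints what it would do:

```shell
# Dry-run sketch of the -copy flow: for each node in host.list, copy
# the image with rcp, then run installp remotely (with license
# acceptance). Echo replaces the real commands for illustration.
IMAGE=ppe.poe.5.1.0.0                     # example image name
SRC=/usr/sys/inst.images                  # default source directory
DEST=/usr/sys/inst.images                 # default destination directory
printf 'node1\nnode2\n' > host.list       # example two-node host list
out=$(while read -r node; do
  echo rcp "$SRC/$IMAGE" "$node:$DEST/$IMAGE"
  echo rsh "$node" installp -aXY -d "$DEST/$IMAGE" all
done < host.list)
echo "$out"
```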
Mounting the installation image
Specifying the -mount option informs PEinstall to mount the named image to each
node using rsh. You are prompted for the following information when you specify
-mount:
v The installation image source directory. The default is /usr/sys/inst.images.
v The remote node mount point directory. This is used for all nodes in the node
list. The default is /mnt.
To have the image mounted to different directories, invoke PEinstall for each
different location or set of locations. Your host.list file should reflect only those
nodes that you want to use with -mount.
v When mounting the image, PEinstall also asks if you want to create the remote
mount directory. If your remote mount directory already exists, answer no to
this prompt.
PEinstall issues a mkdir command for the directory name specified, followed by a
chmod 777. To execute installp remotely on a mounted image, the directory
containing the image needs to have this permission.
To avoid creating the directory with world-writable permissions, do not use the
-mount option of PEinstall.
PE for AIX deinstallation script: PEdeinstall
When you install a PE fileset, you do so first on a single node (or the control
workstation). Then, you either copy or mount the installation image to the
additional nodes in your system. When you remove a fileset completely from your
system, you do the opposite.
To remove a fileset completely:
v First you remove the fileset from the other nodes in your system, using the
PEdeinstall script.
v Then you remove the fileset from the initial installation node (or control
workstation).
Removing an installation of a fileset removes all files already installed for that
fileset. As a result, the PEdeinstall script will be removed from each node the
installp -u command is run against. For this reason, you may want to consider
copying PEdeinstall from /usr/lpp/ppe.poe/bin to another location before removing
the installation of the fileset. However, if you follow the previously mentioned
sequence of removing a fileset from the other nodes first, and then removing it
from the initial node last, these scripts will remain available until the fileset is
removed from the initial node.
PEdeinstall issues the proper installp command using the Remote Shell (rsh).
The PEdeinstall script has the following syntax:
PEdeinstall image_name [host_list_file]
Where:
image_name
is required, and specifies the file name of the installp image you want
removed.
host_list_file
is optional and specifies the name of the file containing the list of nodes
from which you want the image removed. The default file name is host.list
in the current working directory. If this file cannot be found, the script
exits with an error message.
For each node, PEdeinstall issues the following installp command:
installp -ugX image_name
This command removes both the user and root portions of all the products in the
image specified.
If there is a problem removing an installed product on a node, an error message is
listed and logged in a file named PEnode.log in the current working directory. The
product removals continue for the remaining nodes.
PE for Linux installation script: pe_install.sh
The pe_install.sh script allows you to perform new installations, upgrade
installations, or installation fixes for various RPM packages that comprise the IBM
Parallel Environment for Linux.
The pe_install.sh script can be used in interactive mode or in batch (script) mode.
Interactive mode is meant for local installations, and when doing a new
installation, you have the option of installing all PE and LAPI RPMs, including the
PE license RPM, or just the PE license RPM alone. When used in batch mode, this
script attempts to install all PE and LAPI RPMs, including the PE license RPM, and
allows you to perform remote installations when used in conjunction with other
utilities like dsh, rsh, or ssh. You must be a root level user to run this script.
When used interactively, this script prompts for all installation options. Before
installing IBM PE on a new cluster, you should run this script interactively one
time to review the license agreement. You can do this by entering y at the
following prompt:
Do you want to review the PE License Agreement and manually register your
acceptance of the terms, or just have your acceptance automatically registered
for you without reviewing the license agreement?
If you accept the IBM PE license agreement, you can run this script again in batch
mode in conjunction with another utility (such as dsh) to install PE on the full
cluster across the network.
When used in batch mode, there is a command line option that corresponds to
each of the interactive mode prompts, except:
Do a full installation of IBM Parallel Environment for Linux?
and
Do you want to review the PE License Agreement and manually register your
acceptance of the terms, or just have your acceptance automatically registered
for you without reviewing the license agreement?
This is because in batch mode, all PE and LAPI RPMs are installed, and the script
assumes that you reviewed the terms of the license agreement and accept them.
The command line options are:
[-a] to run the script interactively, when -a is specified without a value. If -a is
specified with a value of n, or if -a is omitted, the script runs in batch mode.
[-h] to print a help message.
The following are additional options that may be specified while running in batch
mode, and are specified when invoking the script. Each of these options has a
default value that is used if the option is not specified.
[-dir path]
directory path where the PE RPMs are mounted. The default path is the
current directory.
[-install_op {i | U}]
specifies an RPM installation operation. The operations are i for new
install, and U for update. The default is i. Each option performs the
requested installation on the full set of PE RPMs. To install a fix package to
a set of RPMs for a particular component, specify this option with a value
of U, along with the -fix_level option.
[-fix_level comp_name-ver.rel.mod.fix-build_level]
specifies the fix package to install by component name and fix level. The
comp_name portion of this string is either pe or lapi. The fix level portion is
specified in the format ver.rel.mod.fix-build_level. For example:
5.1.0.0-0610a
Note that the comp_name portion of the string and the fix level portion are
separated by a hyphen. Specifying this option without also using the
-install_op option implies using -install_op with the U install operation
value.
[-ip_only {y | n}]
specifies IP support or both IP and US support. The default value is n for
installing both IP and US support. Specify y to install IP support only.
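Putting the options together, batch-mode invocations might look like the following. The directory path and fix level are hypothetical examples; omitting -a selects batch mode:

```shell
# New installation of all PE and LAPI RPMs from a mounted directory:
./pe_install.sh -dir /mnt/pe_rpms

# Update installation applying a fix package to the PE component:
./pe_install.sh -install_op U -fix_level pe-5.1.0.0-0610a

# New installation with IP support only:
./pe_install.sh -ip_only y
```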
PE for Linux deinstallation script: pe_deinstall.sh
The pe_deinstall.sh deinstallation script automatically removes all installed PE and
LAPI RPMs, including the PE license RPM. Deinstallation begins immediately and
occurs automatically.
Only a root level user can run this script. The pe_deinstall.sh script may be
invoked in conjunction with other utilities such as dsh, rsh, or ssh to remove PE
from the full cluster across the network.
The following RPMs are removed from the node on which this script is executed:
v 32-bit and 64-bit PE RPMs
v 32-bit and 64-bit IP RPMs, 32-bit and 64-bit User Space RPMs, or both
v IBM PE license RPM
The deinstallation script has one flag:
[-keep]
removes all PE components except the license RPM. Use this flag to remove
old release levels before installing a new service level.
Chapter 9. Installation verification program summary
The POE Installation Verification Program (IVP) is an ideal way to determine if
you have set up your system correctly before running your applications.
With PE for AIX, the IVP is located in the /usr/lpp/ppe.poe/samples/ivp directory,
and is invoked by the ivp.script shell script. With PE for Linux, the IVP is located
in the /opt/ibmhpc/ppe.poe/samples/ivp directory, and is invoked by the
ivp.script.linux shell script.
The IVP checks for the needed files and libraries and makes sure that everything is
in order. It also issues messages when it finds something wrong.
You need the following in order to run the IVP:
v A nonroot userid that is properly authorized in /etc/hosts.equiv or the local
.rhosts file.
v Access to a C compiler.
If the previous conditions are true, the IVP does the following:
1. Verifies that:
v poe, pmdv5, mpcc, and mpcc_r (PE for AIX only) are there, and are
executable.
v The mpcc and mpcc_r scripts are in the path.
v The /etc/services file contains an entry for pmv5 (the Partition Manager
daemon).
v For PE for AIX, the /etc/inetd.conf file contains an entry for pmv5, and that
the daemon to which it points is executable.
v For PE for Linux, the /etc/xinetd.d/pmv5 file exists and contains an entry for
/etc/pmdv5.
2. Creates a working directory in /tmp/ivppid to compile and run sample
programs.
Note that pid is the process id.
v Compiles sample programs.
v Creates a host.list file with local host names listed twice.
v Runs sample programs using Internet Protocol (IP) on two tasks, using both
threaded and non-threaded libraries.
v Removes all files from /tmp, as well as the temporary working directory.
If you are using PE for AIX, refer to “Step 4: Verify the POE installation” on page
26 for specific steps on verifying the installation. If you are using PE for Linux,
refer to “Verifying the POE installation” on page 35 for specific steps on verifying
the installation.
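For reference, a typical IVP run on each platform looks like the following, executed as the non-root user described above:

```shell
# PE for AIX:
cd /usr/lpp/ppe.poe/samples/ivp && ./ivp.script

# PE for Linux:
cd /opt/ibmhpc/ppe.poe/samples/ivp && ./ivp.script.linux
```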
Chapter 10. Using additional POE sample applications
PE provides POE sample applications for measuring the MPI point-to-point
communication bandwidth between two tasks, broadcasting from task 0 to all
of the other nodes in the partition, and for using the MPI message passing library
with user-created threads.
In order to be able to run the POE sample applications with PE for AIX, POE must
be fully installed and rsct.lapi.rte is also required. In order to be able to run the
POE sample applications with PE for Linux, all 32-bit and 64-bit PE and LAPI
RPMs must be installed.
Bandwidth measurement test sample
The purpose of this sample is to measure the MPI point-to-point communication
bandwidth between two tasks.
For PE for AIX, the sample code is in the directory called /usr/lpp/ppe.poe/samples/poetest.bw. For PE for Linux, the sample code is in the directory called
/opt/ibmhpc/ppe.poe/samples/poetest.bw. This directory contains a test application
called bw.f, which does a point-to-point bandwidth measurement test. The code
needs only two nodes to run.
You should have the following files:
README.bw
Readme file containing instructions on running the sample application,
which is the same information presented here.
bw.f Sample application Fortran source file.
bw.run
Script for compiling and executing the sample application.
makefile (PE for AIX only)
Makefile for creating the sample application.
makefile.linux (PE for Linux only)
Makefile for creating the sample application.
The C and Fortran compilers must be available.
Verification steps
Follow these steps to verify your system:
1. Create the bw executable. Log in as a nonroot user and perform the following
steps:
ENTER (PE for AIX only)
cd /usr/lpp/ppe.poe/samples/poetest.bw to switch to the appropriate
directory. If you do not have write access to this directory, copy the
needed files from here to a directory that is writable.
ENTER (PE for Linux only)
cd /opt/ibmhpc/ppe.poe/samples/poetest.bw to switch to the
appropriate directory. If you do not have write access to this directory,
copy the needed files from here to a directory that is writable.
ENTER (PE for AIX only)
make to invoke the makefile, which compiles bw.f and creates the bw
executable.
ENTER (PE for Linux only)
make -f makefile.linux to compile bw.f and create the bw executable.
2. Create a file that lists the names of the nodes to be used for program execution.
CREATE
a file named host.list and edit the file to add two entries, one per line.
The entries should list the two nodes on which the executable is to run.
3. Run the bw executable. The bw.run script compiles bw.f, if not already
compiled, and runs the bw executable from the current working directory.
ENTER
./bw.run [ css_library ]
where:
css_library
is us for User Space message passing or ip for IP message
passing.
4. Check your output.
VERIFY
your output by comparing it to the following output. The output
should finish in about one minute, using the User Space message
passing library. The execution time for IP is five minutes or longer. The
actual response time depends on your LAN traffic.
Input: none
Output to terminal by this program: (Note that the order is
unpredictable.)
Hello from node 0
Hello from node 1
MEASURED BANDWIDTH = ....... MB/sec
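The four steps above, collected into one command sequence for convenience (AIX paths shown; node1 and node2 are placeholders for two real node names):

```shell
cd /usr/lpp/ppe.poe/samples/poetest.bw   # or a writable copy of this directory
make                                     # builds the bw executable from bw.f
printf 'node1\nnode2\n' > host.list      # the two nodes to run on
./bw.run us                              # us = User Space, ip = IP message passing
```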
Broadcast test sample
The purpose of this sample is to perform a broadcast from task 0 to the rest of the
nodes running this program.
For AIX, this sample is in the directory called /usr/lpp/ppe.poe/samples/poetest.cast. For Linux, this sample is in the directory called /opt/ibmhpc/ppe.poe/samples/poetest.cast. This sample test code touches all nodes in the partition.
You should have the following files:
README.cast
Readme file containing instructions on running the sample application,
which is the same information presented here.
bcast.f Sample application Fortran source.
makefile (PE for AIX only)
Makefile for compiling the sample application.
makefile.linux (PE for Linux only)
Makefile for compiling the sample application.
bcast.run
Script for compiling and executing the sample application.
The Fortran compiler must be available.
Verification steps
Follow these steps to verify your system:
1. Create the bcast executable. Log in as a nonroot user and follow these steps:
ENTER
cd /usr/lpp/ppe.poe/samples/poetest.cast (AIX) or cd
/opt/ibmhpc/ppe.poe/samples/poetest.cast (Linux) to switch to the
appropriate directory. If you do not have write access to this directory,
copy the needed files from here to a directory that is writable.
ENTER (PE for AIX only)
make to invoke the makefile, which compiles bcast.f, to create the
executable.
ENTER (PE for Linux only)
make -f makefile.linux to compile bcast.f, to create the bcast
executable.
2. Create a file that lists the names of nodes to be used for program execution.
CREATE
a file named host.list and edit it by adding the names of the nodes on
which to execute this program, with one entry per line.
3. Run the bcast executable. The bcast.run script compiles bcast.f, if not already
compiled, and runs the bcast executable from the current working directory.
ENTER
./bcast.run ntasks [ css_library ]
where the required parameter is the following:
ntasks the number of tasks (nodes) in the partition.
Make sure that there are at least ntasks entries in the host.list
file.
and the optional parameter is:
css_library
us for User Space message passing (default) or ip for IP
message passing.
4. Check your output.
VERIFY
your output by comparing it with the following output. The output
should finish in about one minute if your system does not have more
than 64 nodes. The actual response time depends on your LAN traffic.
Note that the order of these lines is unpredictable.
Input: none
Output to terminal by this program:
Hello from node 0
Hello from node 1
...
Hello from node (p-1)
BROADCAST TEST COMPLETED SUCCESSFULLY
If the test did not succeed, you should see the following message on
the terminal:
BROADCAST TEST FAILED on node x (where x is some integer)
For every node that did not pass the test, a line similar to the previous
line appears.
MPI threads sample program
The purpose of this sample program is to illustrate the use of the MPI message
passing library with user-created threads.
If you are using PE for AIX, you can find the sample program in the
/usr/lpp/ppe.poe/samples/threads directory. If you are using PE for Linux, you can
find the sample program in the /opt/ibmhpc/ppe.poe/samples/threads directory.
You should have the following files:
README.threads
Readme file containing instructions on running the sample program.
threaded_ring.c
Sample program source file for testing threaded MPI library with user
threads.
makefile (PE for AIX only)
Makefile for compiling the threaded sample program.
makefile.linux (PE for Linux only)
Makefile for compiling the threaded sample program.
threads.run
Script for compiling and executing the user threads sample program,
threaded_ring.
The C compiler must be available.
Verification steps
Follow these steps to run the sample threads application on your system:
1. Create the executables by logging in as a nonroot user, and doing the
following:
ENTER (PE for AIX only)
cd /usr/lpp/ppe.poe/samples/threads to switch to the appropriate
directory. If you do not have write access to this directory, copy the
needed files from here to a directory that is writable.
ENTER (PE for Linux only)
cd /opt/ibmhpc/ppe.poe/samples/threads to switch to the appropriate
directory. If you do not have write access to this directory, copy the
needed files from here to a directory that is writable.
ENTER (PE for AIX only)
make to invoke the makefile, which compiles the source program to
create the executable, threaded_ring.
ENTER (PE for Linux only)
make -f makefile.linux to compile the source program to create the
executable, threaded_ring.
2. Create a file that lists the names of nodes to be used for program execution.
CREATE
a file named host.list and edit it by adding the names of the nodes on
which to execute this program, with one entry per line.
3. Run the threaded_ring executable. The threads.run script compiles
threaded_ring.c, if not already compiled, and runs the threaded_ring executable
from the current working directory.
ENTER
threads.run [ css_library ]
where:
css_library
specifies the library to use. Type ip to use the UDP/IP library.
Type us to use the User Space library. These names are
case-sensitive. User Space is the default.
The program should issue only the message "TEST COMPLETE" from task 0.
0:TEST COMPLETE
LAPI sample programs
Several sample programs exist that illustrate the use of the Low-level Application
Programming Interface (LAPI). You can use these files to help you with the LAPI
programs you create to solve more complex problems.
The LAPI sample programs are structured to provide you with a detailed look at
basic LAPI operations. Refer to the RSCT: LAPI Programming Guide for specific
details.
Chapter 11. Parallel Environment port usage
The port information provided, for both AIX and Linux, includes the service name,
port number, protocol, and source port range.
Refer to “PE for AIX port usage” and “PE for Linux port usage” for Parallel
Environment port information.
PE for AIX port usage
The port information provided for PE for AIX users includes the service name,
port number, protocol, and source port range.
Table 30 describes the port usage details for PE for AIX.
Table 30. PE for AIX port usage
Service name Port number Protocol Source port range Required or optional
pmv5 6128 TCP Not applicable Required
dish 8800 TCP Not applicable Required
The service names in Table 30 are defined as follows:
pmv5 Partition Manager daemon inetd service for systems that use the
InfiniBand switch.
dish DISH is an interactive tool that serves as a control center to multiple
distributed copies of a client, and can be configured into a distributed
shell, a parallel debugger, or some other interactive program. Note that
DISH is installed with the Parallel Debugging Tool (PDB).
When POE is installed, an entry is added in /etc/services and in /etc/inetd.conf to
describe the partition manager daemon. The entry that is added to /etc/services
defines a port number used by pmdv5 to communicate with the POE process on
the home node. PE attempts to use port number 6128. However, if this port is
already in use, then PE will try to use port 6129, and so forth. As a result, the port
number selected may not be the same for all nodes of a cluster. In the event that
some of the nodes cannot communicate with other nodes, check the /etc/services
file to make sure that all nodes use the same port number.
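To spot the mismatch described above, you can compare the pmv5 entry across nodes. This small sketch (not part of PE) extracts the port number from a services-format file; it is shown here against a sample file rather than the real /etc/services:

```shell
# Extract the pmv5 port number from a services-format file so entries
# can be compared across nodes (for example, by running this via rsh
# or dsh on each node). The sample file stands in for /etc/services.
printf 'pmv5\t6128/tcp\t# PE partition manager\n' > services.sample
port=$(awk '$1 == "pmv5" { split($2, a, "/"); print a[1] }' services.sample)
echo "pmv5 port: $port"                  # prints: pmv5 port: 6128
```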
PE for Linux port usage
The port information provided for PE for Linux users includes the service name,
port number, protocol, and source port range.
Table 31 describes the port usage details for PE for Linux.
Table 31. PE for Linux port usage
Service name Port number Protocol Source port range Required or optional
pmv5 6128 TCP Not applicable Required
dish 8800 TCP Not applicable Required
The service names in Table 31 on page 89 are defined as follows:
pmv5 Partition Manager daemon inetd service.
dish DISH is an interactive tool that serves as a control center to multiple
distributed copies of a client, and can be configured into a distributed
shell, a parallel debugger, or some other interactive program. Note that
DISH is installed with the Parallel Debugging Tool (PDB).
When PE is installed, it adds an entry to /etc/services and creates
/etc/xinetd.d/pmv5 to describe the PE partition manager (pmdv5) daemon. The
entry that is added to /etc/services defines a port number used by pmdv5 to
communicate with the POE process on the home node. PE attempts to use port
number 6128. However, if this port is already in use, then PE will try to use port
6129, and so forth. As a result, the port number selected may not be the same for
all nodes of a cluster. In the event that some of the nodes cannot communicate
with other nodes, check the /etc/services file to make sure that all nodes use the
same port number.
Appendix. Accessibility features for Parallel Environment
Accessibility features help users who have a disability, such as restricted mobility
or limited vision, to use information technology products successfully.
Accessibility features
The following list includes the major accessibility features in Parallel Environment:
v Keyboard-only operation
v Interfaces that are commonly used by screen readers
v Keys that are discernible by touch but do not activate just by touching them
v Industry-standard devices for ports and connectors
v The attachment of alternative input and output devices
The IBM Cluster information center, and its related publications, are
accessibility-enabled. The accessibility features of the information center are
described at http://publib.boulder.ibm.com/infocenter/clresctr/vxrx/topic/com.ibm.cluster.addinfo.doc/access.html.
IBM and accessibility
See the IBM Human Ability and Accessibility Center for more information about
the commitment that IBM has to accessibility:
http://www.ibm.com/able
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user’s responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
For license inquiries regarding double-byte character set (DBCS) information,
contact the IBM Intellectual Property Department in your country or send
inquiries, in writing, to:
IBM World Trade Asia Corporation
Licensing
2-31 Roppongi 3-chome, Minato-ku
Tokyo 106-0032, Japan
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law:
INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS
PUBLICATION “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER
EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or
implied warranties in certain transactions, therefore, this statement may not apply
to you.
This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
For AIX:
IBM Corporation
Department LRAS, Building 003
11400 Burnet Road
Austin, Texas 78758–3498
U.S.A
For Linux:
IBM Corporation
Department LJEB/P905
2455 South Road
Poughkeepsie, NY 12601-5400
U.S.A
Such information may be available, subject to appropriate terms and conditions,
including in some cases, payment of a fee.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.
Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments may
vary significantly. Some measurements may have been made on development-level
systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurements may have been
estimated through extrapolation. Actual results may vary. Users of this document
should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
All statements regarding IBM’s future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which
illustrate programming techniques on various operating platforms. You may copy,
modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating
platform for which the sample programs are written. These examples have not
been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or
imply reliability, serviceability, or function of these programs.
Each copy or any portion of these sample programs or any derivative work, must
include a copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp.
Sample Programs. © Copyright IBM Corp. _enter the year or years_. All rights
reserved.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of
International Business Machines Corporation in the United States, other countries,
or both. If these and other IBM trademarked terms are marked on their first
occurrence in this information with a trademark symbol (® or ™), these symbols
indicate U.S. registered or common law trademarks owned by IBM at the time this
information was published. Such trademarks may also be registered or common
law trademarks in other countries. A current list of IBM trademarks is available on
the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.
InfiniBand is a trademark and/or service mark of the InfiniBand Trade Association.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the
United States, other countries, or both.
Linux is a trademark of Linus Torvalds in the United States, other countries, or
both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Other company, product, and service names may be trademarks or service marks
of others.
Glossary
This glossary defines technical terms used in the
IBM Parallel Environment documentation. If you
do not find the term you are looking for, refer to
the IBM Terminology site on the World Wide
Web:
http://www.ibm.com/software/globalization/terminology/index.html
A
address. A unique code or identifier for a register,
device, workstation, system, or storage location.
API. application programming interface (API): An
interface that allows an application program that is
written in a high-level language to use specific data or
functions of the operating system or another program.
application. One or more computer programs or
software components that provide a function in direct
support of a specific business process or processes.
argument. A value passed to or returned from a
function or procedure at run time.
authentication. The process of validating the identity
of a user or server.
authorization. The process of obtaining permission to
perform specific actions.
B
bandwidth. A measure of frequency range, typically
measured in hertz. Bandwidth is also commonly used
to refer to data transmission rates as measured in bits
or bytes per second.
blocking operation. An operation that does not
complete until it either succeeds or fails.
For example, a blocking receive will not return until a
message is received or until the channel is closed and
no further messages can be received.
breakpoint. A place in a program, specified by a
command or a condition, where the system halts
execution and gives control to the workstation user or
to a specified program.
broadcast. The simultaneous transmission of data to
more than one destination.
C
C. A programming language designed by Bell Labs in
1972 for use as the systems language for the UNIX
operating system.
C++. An enhancement of the C language that adds
features supporting object-oriented programming.
client. A software program or computer that requests
services from a server.
cluster. A group of processors interconnected through
a high-speed network that can be used for
high-performance computing.
collective communication. A communication
operation that involves more than two processes or
tasks. Broadcasts and reductions are examples of
collective communication operations. All tasks in a
communicator must participate.
communicator. A Message Passing Interface (MPI)
object that describes the communication context and an
associated group of processes.
compile. To translate all or part of a program expressed
in a high-level language into a computer program
expressed in an intermediate language, an assembly
language, or a machine language.
condition. One of a set of specified values that a data
item can assume.
core dump. A process by which the current state of a
program is preserved in a file. Core dumps are usually
associated with programs that have encountered an
unexpected, system-detected fault, such as a
segmentation fault or a severe user error. A
programmer can use the core dump to diagnose and
correct the problem.
core file. A file that preserves the state of a program,
usually just before a program is terminated because of
an unexpected error. See also core dump.
D
data parallelism. A situation in which parallel tasks
perform the same computation on different sets of data.
debugger. A tool used to detect and trace errors in
computer programs.
distributed shell (dsh). A Cluster Systems
Management (CSM) command that lets you issue
commands to a group of hosts in parallel. See IBM
Cluster Systems Management: Command and Technical
Reference for details.
E
environment variable. (1) A variable that defines an
aspect of the operating environment for a process. For
example, environment variables can define the home
directory, the command search path, the terminal in
use, or the current time zone. (2) A variable that is
included in the current software environment and is
therefore available to any called program that requests
it.
Ethernet. A packet-based networking technology for
local area networks (LANs) that supports multiple
access and handles contention by using Carrier Sense
Multiple Access with Collision Detection (CSMA/CD)
as the access method. Ethernet is standardized in the
IEEE 802.3 specification.
executable program. A program that can be run as a
self-contained procedure. It consists of a main program
and, optionally, one or more subprograms.
execution. The process of carrying out an instruction
or instructions of a computer program by a computer.
F
fairness. A policy in which tasks, threads, or processes
must eventually gain access to a resource for which
they are competing. For example, if multiple threads
are simultaneously seeking a lock, no set of
circumstances can cause any thread to wait indefinitely
for access to the lock.
Fiber Distributed Data Interface (FDDI). An
American National Standards Institute (ANSI) standard
for a 100-Mbps LAN using fiber optic cables.
file system. The collection of files and file
management structures on a physical or logical mass
storage device, such as a diskette or minidisk.
fileset. (1) An individually installable option or
update. Options provide specific function, and updates
correct an error in, or enhance, a previously installed
program. (2) One or more separately installable,
logically grouped units in an installation package. See
also licensed program and package.
FORTRAN. A high-level programming language used
primarily for scientific, engineering, and mathematical
applications.
G
GDB. An open-source portable debugger supporting
Ada, C, C++, and FORTRAN. GDB is a useful tool for
determining why a program crashes and where, in the
program, the problem occurs.
global max. The maximum value across all processors
for a given variable. It is global in the sense that it is
global to the available processors.
global variable. A symbol defined in one program
module that is used in other program modules that are
independently compiled.
graphical user interface (GUI). A type of computer
interface that presents a visual metaphor of a
real-world scene, often of a desktop, by combining
high-resolution graphics, pointing devices, menu bars
and other menus, overlapping windows, icons and the
object-action relationship.
GUI. See graphical user interface.
H
high performance switch. A high-performance
message-passing network that connects all processor
nodes.
home node. The node from which an application
developer compiles and runs a program. The home
node can be any workstation on the LAN.
host. A computer that is connected to a network and
provides an access point to that network. The host can
be a client, a server, or both a client and server
simultaneously.
host list file. A file that contains a list of host names,
and possibly other information. The host list file is
defined by the application that reads it.
host name. The name used to uniquely identify any
computer on a network.
I
installation image. A copy of the software, in backup
format, that the user is installing, as well as copies of
other files the system needs to install the software
product.
Internet. The collection of worldwide networks and
gateways that function as a single, cooperative virtual
network.
Internet Protocol (IP). A protocol that routes data
through a network or interconnected networks. This
protocol acts as an intermediary between the higher
protocol layers and the physical network.
IP. Internet Protocol.
K
kernel. The part of an operating system that contains
programs for such tasks as input/output, management
and control of hardware, and the scheduling of user
tasks.
L
latency. The time from the initiation of an operation
until something actually starts happening (for example,
data transmission begins).
licensed program. A separately priced program and
its associated materials that bear a copyright and are
offered to customers under the terms and conditions of
a licensing agreement.
lightweight core files. An alternative to standard AIX
core files. Core files produced in the Standardized
Lightweight Corefile Format provide simple process
stack traces (listings of function calls that led to the
error) and consume fewer system resources than
traditional core files.
LoadLeveler pool. A group of resources with similar
characteristics and attributes.
local variable. A symbol defined in one program
module or procedure that can only be used within that
program module or procedure.
M
management domain. A set of nodes that are
configured for management by Cluster Systems
Management. Such a domain has a management server
that is used to administer a number of managed nodes.
Only management servers have knowledge of the
domain. Managed nodes only know about the servers
managing them.
menu. A displayed list of items from which a user can
make a selection.
message catalog. An indexed table of messages. Two
or more catalogs can contain the same index values.
The index value in each table refers to a different
language version of the same message.
message passing. The process by which parallel tasks
explicitly exchange program data.
Message Passing Interface (MPI). A library
specification for message passing. MPI is a standard
application programming interface (API) that can be
used by parallel applications.
MIMD. multiple instruction stream, multiple data
stream.
multiple instruction stream, multiple data stream
(MIMD). A parallel programming model in which
different processors perform different instructions on
different sets of data.
MPMD. Multiple program, multiple data.
Multiple program, multiple data (MPMD). A parallel
programming model in which different, but related,
programs are run on different sets of data.
N
network. In data communication, a configuration in
which two or more locations are physically connected
for the purpose of exchanging data.
network information services (NIS). A set of network
services (for example, a distributed service for
retrieving information about the users, groups, network
addresses, and gateways in a network) that resolve
naming and addressing differences among computers
in a network.
NIS. See network information services.
node ID. A string of unique characters that identifies
the node on a network.
nonblocking operation. An operation, such as
sending or receiving a message, that returns
immediately whether or not the operation has
completed. For example, a nonblocking receive does
not wait until a message arrives. A nonblocking receive
must be completed by a later test or wait.
O
object code. Machine-executable instructions, usually
generated by a compiler from source code written in a
higher level language. Object code might itself be
executable or it might require linking with other object
code files.
optimization. The process of achieving improved
run-time performance or reduced code size of an
application. Optimization can be performed by a
compiler, by a preprocessor, or through hand tuning of
source code.
option flag. Arguments or any other additional
information that a user specifies with a program name.
Also referred to as parameters or command line
options.
P
package. (1) In AIX, a number of filesets that have
been collected into a single installable image of licensed
programs. See also fileset and licensed program. (2) In
Linux, a collection of files, usually used to install a
piece of software. The equivalent AIX term is fileset.
parallelism. The degree to which parts of a program
may be concurrently executed.
parallelize. To convert a serial program for parallel
execution.
parameter. A value or reference passed to a function,
command, or program that serves as input or controls
actions. The value is supplied by a user or by another
program or process.
peer domain. A set of nodes configured for high
availability. Such a domain has no distinguished or
master node. All nodes are aware of all other nodes,
and administrative commands can be issued from any
node in the domain. All nodes also have a consistent
view of the domain membership. Contrast with
management domain.
point-to-point communication. A communication
operation that involves exactly two processes or tasks.
One process initiates the communication through a
send operation. The partner process issues a receive
operation to accept the data being sent.
procedure. (1) In a programming language, a block, with
or without formal parameters, that is initiated by
means of a procedure call. (2) A set of related control
statements that cause one or more programs to be
performed.
process. A program or command that is actually
running on the computer. A process consists of a loaded
version of the executable file, its data, its stack, and its
kernel data structures that represent the process’s state
within a multitasking environment. The executable file
contains the machine instructions (and any calls to
shared objects) that will be executed by the hardware.
A process can contain multiple threads of execution.
The process is created with a fork() system call and
ends using an exit() system call. Between fork and exit,
the process is known to the system by a unique process
identifier (PID).
Each process has its own virtual memory space and
cannot access another process’s memory directly.
Communication methods across processes include
pipes, sockets, shared memory, and message passing.
profiling. A performance analysis process that is
based on statistics for the resources that are used by a
program or application.
pthread. A shortened name for the i5/OS threads API
set that is based on a subset of the POSIX standard.
R
reduction operation. An operation, usually
mathematical, that reduces a collection of data by one
or more dimensions. For example, an operation that
reduces an array to a scalar value.
remote host. Any host on a network except the host at
which a particular operator is working.
remote shell (rsh). A variant of the remote login
(rlogin) command that invokes a command interpreter
on a remote UNIX machine and passes the
command-line arguments to the command interpreter,
omitting the login step completely.
RSCT peer domain. See peer domain.
S
secure shell (ssh). A UNIX-based command interface
and protocol for securely accessing a remote computer.
segmentation fault. A system-detected error, usually
caused by a reference to a memory address that is not
valid.
shell script. A program, or script, that is interpreted
by the shell of an operating system.
server. A software program or a computer that
provides services to other software programs or other
computers.
single program, multiple data (SPMD). A parallel
programming model in which different processors run
the same program on different sets of data.
source code. A computer program in a format that is
readable by people. Source code is converted into
binary code that can be used by a computer.
source line. A line of source code.
SPMD. single program, multiple data.
standard error (STDERR). The output stream to
which error messages or diagnostic messages are sent.
standard input (STDIN). An input stream from which
data is retrieved. Standard input is normally associated
with the keyboard, but if redirection or piping is used,
the standard input can be a file or the output from a
command.
standard output (STDOUT). The output stream to
which data is directed. Standard output is normally
associated with the console, but if redirection or piping
is used, the standard output can be a file or the input
to a command.
STDERR. standard error.
STDIN. standard input.
STDOUT. standard output.
subroutine. A sequence of instructions within a larger
program that performs a particular task. A subroutine
can be accessed repeatedly, can be used in more than
one program, and can be called at more than one point
in a program.
synchronization. The action of forcing certain points
in the execution sequences of two or more
asynchronous procedures to coincide in time.
system administrator. The person who controls and
manages a computer system.
T
task. One of two or more concurrent units of work in a
parallel job that cooperate through message
passing. Though it is common to allocate one task per
processor, the terms task and processor are not
interchangeable.
thread. A stream of computer instructions. In some
operating systems, a thread is the smallest unit of
operation in a process. Several threads can run
concurrently, performing different jobs.
trace. A record of the processing of a computer
program or transaction. The information collected from
a trace can be used to assess problems and
performance.
U
user. (1) An individual who uses license-enabled
software products. (2) Any individual, organization,
process, device, program, protocol, or system that uses
the services of a computing system.
User Space. A version of the message passing library
that is optimized for direct access to the high
performance switch (PE for AIX) or communication
adapter (PE for Linux). User Space maximizes
performance by not involving the kernel in sending or
receiving a message.
utility program. A computer program in general
support of computer processes; for example, a
diagnostic program, a trace program, a sort program.
utility routine. A routine in general support of the
processes of a computer; for example, an input routine.
V
variable. A representation of a changeable value.
X
X Window System. A software system, developed by
the Massachusetts Institute of Technology, that enables
the user of a display to concurrently use multiple
application programs through different windows of the
display. The application programs can execute on
different computers.
Index
A
abbreviated names xi
accessibility features for this product 91
acronyms for product names xi
administrator, additional information for 63
/etc/poe.limits file 63
entries 63
how the Partition Manager handles 64
/etc/poe.security file 64
configuring parameters and limits 65
configuring coscheduler (PE for AIX only) 65
AIX dispatcher tuning 68
Configuring InfiniBand for User Space without LoadLeveler (PE for AIX only) 72
compiling and installing NRT API samples 72
enabling Remote Direct Memory Access
for IBM High Performance Switch 70
for InfiniBand interconnect 71
enabling Remote Direct Memory Access (PE for AIX only) 70
AFS installation 48
AIX operating system requirements 3
AIX-based security 14
API subroutine libraries, described 1
C
cluster-based security configuration 13
compatibility, LAPI and MPI libraries 11
components, PE 1
D
deinstallation script
PE for AIX 76
PE for Linux 78
disk space requirements
AIX installation 6
Linux installation 10
pedocs product option 6
poe product option 6
distributions supported
Linux 7
documentation, online 58
E
enabling xinetd for Linux installation 12
errors, resolving for installation 36
F
file systems 12
fileset requirements
AIX installation 3
filesets (PE for AIX), installing 20
copy software off distribution
medium 20
copy software to hard disk for
installation over network 20
determine remaining tasks 23, 26
export installation directory 21
if installation fails 23
install PE on other nodes 24, 26
installing manually 26
perform initial installation 21, 22, 23
using SMIT 22
using installation script 24
using installp command 22
verify the installation 26
H
hardware requirements
PE for AIX 3
PE for Linux 6
I
installation errors, resolving 36
installation procedure summary, PE for
AIX 20
installation requirements
PE for AIX 3
PE for Linux 6
installation script
PE for AIX 75
PE for Linux 77
Installation Verification Program
(IVP) 81
PE for AIX installation 26
PE for Linux installation 35
installation-related tasks, performing 47
customizing the message catalog 48,
51
PE for AIX 47
installing AFS 48
recovering from a software vital
product database error 47
removing a software
component 47
setting up POE for AFS
execution 48
PE for Linux 49
finding installed components 49
removing a software
component 50
installing PE 17
enabling the barrier synchronization
register (BSR) 28
installing PE (continued)
installing on multiple nodes using the
pe_install.sh script in batch
mode 35
installing PE and LAPI RPMs using
the pe_install.sh script 32
installing PE license RPM using the
pe_install.sh script 31
PE and LAPI RPMs 59
PE for AIX 17, 28, 32
filesets 20
PE for AIX on an IBM Power Systems
cluster 17
PE for AIX with CSM 17
PE for Linux 28, 31, 35, 36
post installation tasks 28, 35
resolving installation errors 36
understanding how installing PE for
AIX alters your system 53
online documentation 57
PDB fileset 57
POE fileset 53
understanding how installing PE for
Linux alters your system 58, 59
PDB fileset 57
PE license RPM 58
using the pe_install.sh script 31
verifying the installation 35
viewing the readme after
installation 36
IP buffer usage, when running large POE
jobs over AIX 16
L
limitations, PE 10
loadl.so (LoadLeveler) fileset, when to
install 19
M
migrating and upgrading PE 39
PE for AIX 39
AIX compatibility 40
AIX support 40
barrier synchronization register
(BSR) support 41
coexistence 40
Fortran 90 compile time
type-checking support 41
LAPI support 41
migration support 40
MPI library support 41
online documentation 42
PE for Linux 42
coexistence 44
Fortran 90 compile time
type-checking support 45
installing an upgrade 43
installing fix upgrades 43
migrating and upgrading PE (continued)
PE for Linux (continued)
installing PTF upgrades 43
LAPI support 45
migration support 45
overview 39
migration installation, PE for AIX 18
determining which earlier filesets are
installed 18
removing earlier filesets 18
N
node resources 11
deciding which nodes require which
PE filesets or RPMs, or additional
software 12
P
parallel operating environment (POE),
described 1
PDB debugger, described 1
PE documentation, described 1
PE feature
how installation alters system 53, 58
pe_deinstall.sh deinstallation script 78
pe_install.sh installation script 77
PEdeinstall deinstallation script 76
pedocs fileset
how installation alters system 57
pedocs product option
disk space requirements 6
PEinstall installation script 75
copying the image 75
mounting the image 76
performance tuning, Linux system 15
planning to install the PE software 3
POE (parallel operating environment),
described 1
poe product option
disk space requirements 6
POE sample applications, additional 83
bandwidth measurement test
sample 83
verification steps 83
broadcast test sample 84
verification steps 85
LAPI sample programs 87
MPI threads sample program 86
verification steps 86
POE security method configuration 13
port numbers
POE 89
port usage
PE for AIX 89
PE for Linux 89
ppe.poe fileset
how installation alters system 53
ppe.shell fileset
how installation alters system 57
R
readme file 19
requirements
PE for AIX installation 3
PE for Linux installation 6
resources, node 11
deciding which nodes require which
PE filesets or RPMs, or additional
software 12
RPM requirements
Linux installation 7
rsct.core.sec fileset, when to install 19
rsct.lapi.bsr fileset, when to install 19
rsct.lapi.rte fileset, when to install 19
S
safe coding practices 81
security
AIX-based 14
cluster-based 13
POE-based 13
software requirements
additional (PE for AIX) 4
additional (PE for Linux) 8
PE for AIX 3
PE for Linux 6
software, planning to install PE 3
system administrator, information for 10
system partitioning 11
T
trademarks 95
tuning, Linux system 15
typographic conventions and
terminology x
U
upgrading AIX without upgrading
compilers 11
user authorization
PE for AIX 13
PE for Linux 14
user IDs on remote nodes 13
Reader's Comments - We'd like to hear from you
Parallel Environment for AIX and Linux
Installation
Version 5 Release 1
Publication No. SC23-6666-00
We appreciate your comments about this publication. Please comment on specific errors or omissions, accuracy,
organization, subject matter, or completeness of this book. The comments you send should pertain to only the
information in this manual or product and the way in which the information is presented.
For technical questions and information about products and prices, please contact your IBM branch office, your
IBM business partner, or your authorized remarketer.
When you send comments to IBM, you grant IBM a nonexclusive right to use or distribute your comments in any
way it believes appropriate without incurring any obligation to you. IBM or any other organizations will only use
the personal information that you supply to contact you about the issues that you state on this form.
Comments:
Thank you for your support.
Submit your comments using one of these channels:
v Send your comments to the address on the reverse side of this form.
v Send your comments via e-mail to: [email protected]
If you would like a response from IBM, please fill in the following information:
Name
Address
Company or Organization
Phone No. E-mail address
Program Number: 5765-PEA and 5765-PEL
SC23-6666-00