5.3 Installation and troubleshooting overview

Upload: gerard-baker

Post on 17-Dec-2015


Page 1: 5.3 Installation and troubleshooting overview. Unit objectives After completing this unit, you should be able to: Identify the BladeCenter components

5.3

Installation and troubleshooting overview


Unit objectives

After completing this unit, you should be able to:

• Identify the BladeCenter components used to provide PD information

• List the planning elements required for the BladeCenter management network

• Select the functions available to modify firmware settings

• List the blade server indicators and Light Path Components

• Select the steps appropriate in diagnosing blade server hardware failures

• Identify the utility to use in displaying BladeCenter component health


Best practices

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


BladeCenter chassis questions: Requirements

• Given your specific needs, what is the best BladeCenter solution (in terms of components) necessary to meet your requirements?

• Define the networking and SAN requirements for your BladeCenter environment based on your existing infrastructure, including fault tolerance, throughput and interoperability.

• Do you plan on having a separate Management LAN and production LAN? What is the advantage/disadvantage of this environment?

• Are all of the components being installed in the BladeCenter chassis on the ServerProven list?

• Is this BladeCenter chassis to be deployed locally or in a remote location?


Blade server considerations: Questions

• Is the blade server at the latest firmware level? If not, what method of applying the latest firmware updates are you going to implement?

• Besides the BIOS, what other firmware updates are needed for the blade server?

• What operating system are you going to put on the blade server? How do I find out if this OS is supported on the blade server?

• What are the different deployment methods for operating system installations, and which method makes the most sense in my environment?

• What performance requirements are needed out of my blade server? Based upon these requirements, which model best fits my business needs?


BladeCenter chassis questions: Power

• Do you understand the necessary power requirements for a given BladeCenter solution?

• Will your BladeCenter chassis be connected to either a front-end or high-density front-end rack PDU?

• How many blade servers are in the chassis and will that impact oversubscription of the power domains?

• Do you have the correct electrical connectors to power your new BladeCenters and their PDUs?


Cooling questions

• Are the systems on a raised floor?

• How many BTUs am I generating when my installation is complete?

• What are the power requirements for the new systems?

• Are there plans to grow in the future?
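The BTU question above can be answered roughly from the power question: each watt of electrical load is dissipated as about 3.412 BTU per hour of heat. The following is a minimal sketch of that arithmetic; the wattage used is a hypothetical figure for illustration, not a BladeCenter specification.

```python
# Conversion factor: 1 watt of load dissipates about 3.412 BTU per hour.
BTU_PER_HOUR_PER_WATT = 3.412


def heat_load_btu_per_hour(watts: float) -> float:
    """Estimate heat output in BTU/h for a given electrical load in watts."""
    return watts * BTU_PER_HOUR_PER_WATT


# Hypothetical load for illustration only; use the rated or measured
# power of the actual chassis and its installed components.
example_chassis_watts = 2000.0
print(f"{heat_load_btu_per_hour(example_chassis_watts):.0f} BTU/h")
```

Summing this estimate over every planned chassis, including future growth, gives a first figure to compare against the cooling capacity of the room.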


Troubleshooting and problem determination

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


Problem determination: Information gathering

• Due to the variety of hardware and software combinations that can be encountered, use the following information to assist you in problem determination. If possible, have this information available when requesting assistance from Service Support and Engineering functions.
  – Machine type and model
  – Microprocessor or hard disk upgrades
  – Failure symptom
    • Do diagnostics fail?
    • What, when, where, single, or multiple systems?
    • Is the failure repeatable?
    • Has this configuration ever worked?
    • If it has been working, what changes were made prior to it failing?
    • Is this the original reported failure?
  – Diagnostics version: type and version level
  – Hardware configuration
    • Print (print screen) configuration currently in use
    • BIOS level
  – Operating system software: type and version level
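The checklist above can be kept as a small structured record so that nothing is forgotten before a support call. This is only a sketch: the class name, field names, and placeholder values are assumptions made for illustration and are not part of any IBM tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProblemReport:
    """Hypothetical record mirroring the information-gathering checklist."""
    machine_type_model: str = ""
    failure_symptom: str = ""
    upgrades: List[str] = field(default_factory=list)  # microprocessor or disk upgrades
    diagnostics_version: str = ""
    bios_level: str = ""
    operating_system: str = ""
    failure_repeatable: bool = False
    configuration_ever_worked: bool = False
    changes_before_failure: str = ""

    def missing_items(self) -> List[str]:
        """Return the names of required text fields that are still empty."""
        required = {
            "machine_type_model": self.machine_type_model,
            "failure_symptom": self.failure_symptom,
            "diagnostics_version": self.diagnostics_version,
            "bios_level": self.bios_level,
            "operating_system": self.operating_system,
        }
        return [name for name, value in required.items() if not value.strip()]


# Placeholder values only; fill in the details of the real system.
report = ProblemReport(machine_type_model="<type-model>",
                       failure_symptom="<symptom description>")
print(report.missing_items())
```

An empty list from `missing_items()` means the basic items from the checklist have all been recorded.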


Blade servers: Diagnostics tools

• Light Path Diagnostics
• Standalone diagnostics
• Diagnostics by PC Doctor
  – Test results are stored in a test log
  – Management Module event logs contain system status messages from the blade server service processor and can be:
    • Viewed
    • Saved to diskette
    • Printed
    • Attached to e-mail alerts
  – Standard log is a summary of tests
  – Press <Tab> while viewing the test log
• Power On Self Test (POST) beep codes
• Unified Extensible Firmware Interface (UEFI)
  – Elimination of beep codes
  – Advanced logging and firmware control
• Command-line interface (CLI)


IBM Blade Server: Front panel LEDs HS22 example

IBM HS22 Blade Server Front Panel indicators and controls

HS22 Blade Server Front Panel


IBM Blade Server: System board diagnostic indicators HS22 example

IBM Blade Server HS22 System Board Indicators

HS22 System Board Light Path Panel

• IBM HS22 Blade Server system board example
  – Memory, processor, and disk indicators
  – Light Path Panel


LS22 Blade Server Front Panel Controls and Indicators

IBM LS22 Blade Server Front Panel

IBM Blade Server: Front panel LEDs LS22 example


LS22 Blade Server System Board Light Path Panel

IBM LS22 Blade Server System Board

IBM Blade Server: System board diagnostic indicators LS22 example


IBM Blade Server: Diagnostics tools

• Light Path Diagnostics
• Press F2 at POST to invoke standalone diagnostics
• Diagnostics by PC Doctor
  – Test results are stored in a test log
  – Management Module event logs contain system status messages from the blade server service processor and can be:
    • Viewed
    • Saved to diskette
    • Printed
    • Attached to e-mail alerts
  – Standard log is a summary of tests
  – Press <Tab> while viewing the test log
• Power On Self Test (POST) beep codes
• Real time diagnostics
• Command-line interface (CLI)


Blade server: Basic input/output system (BIOS)

• Blade server BIOS
  – Menu-driven setup
  – Settings for configuration and performance
  – Set, change, delete (IRQ, date and time, and passwords)
  – Advanced settings for specific needs (for example, memory, CPU, PCI bus and BMC)
  – BIOS defaults
• Flash diskette: BIOS updates for host and devices
• CD-ROM: BIOS/firmware updates and configuration for host and devices
• BIOS system board jumpers or switches
  – BIOS boot selection
  – Password override
  – Wake on LAN enablement


UEFI: Unified Extensible Firmware Interface (1 of 3)

• The next generation of BIOS
• Allows OSs to take full advantage of the hardware
  – Architecture independent
  – Modular
• 64-bit code architecture
• 16 TB of memory can be addressed
• More functionality
  – Adapter vendors can add more features in their options (for example, IPv6)
  – Design allows faster updates as new features are introduced
  – More adapters can be installed and used simultaneously
  – Fully backwards compatible with legacy BIOS
• Better user interface
  – Replaces Ctrl key sequences with a more intuitive human interface
  – Moves adapter and iSCSI configuration into F1 setup
  – Creates human readable event logs
• Easier management
  – Eliminates “beep” codes; all errors can now be covered by Light Path
  – Reduces the number of error messages and eliminates out-dated errors
  – Can be managed both in-band and out of band


UEFI: Unified Extensible Firmware Interface (2 of 3)

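[Figure: update and configuration tooling, today versus tomorrow.
Today’s update and configuration on systems: BMC, RSA II, Diags, and BIOS; update and configuration tools: xFlash, ASU.
Tomorrow’s update and configuration on systems: IMM, PbDSA, and UEFI; update and configuration tools: xFlash, ASU.]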


UEFI: Unified Extensible Firmware Interface (3 of 3)

UEFI versus BIOS

• UEFI: 64-bit code architecture; 16 TB of memory can be addressed.
  – BIOS: 16-bit code architecture; only 1 MB of memory can be addressed.
• UEFI: Eliminates code space constraints. Adapter option ROMs can be loaded anywhere in memory with no size restrictions.
  – BIOS: Adapter vendors must fit all option code into a shared 128K. Limits the number of adapters that can be effectively installed.
• UEFI: Adapter vendors are free to add function (for example, IPv6).
  – BIOS: Vendors are limited in the function they can provide in the option ROM.
• UEFI: Defines a human interface that is being extended to adapter vendors.
  – BIOS: Cryptic Ctrl key sequences required for configuring adapters.
• UEFI: iSCSI configuration is in F1 Setup and consolidated into ASU.
  – BIOS: iSCSI configuration requires a separate tool.
• UEFI: Elimination of beep codes; all errors covered by Light Path. Reduction in number of error messages.
  – BIOS: Multiple beep codes for fundamental failures.
• UEFI: Adapter configuration can move into F1 Setup. Eliminates Ctrl key sequences for configuring adapters.
  – BIOS: Advanced Settings Utility (ASU) has partial coverage of F1 settings.
• UEFI: In-band and out-of-band UEFI updates. Settings accessed out of band via ASU and the IMM.
  – BIOS: In-band only updates via DOS, wFlash, or lFlash.
• UEFI: UEFI event codes available out of band. Human readable event logs in F1 Setup.
  – BIOS: Numerous legacy POST errors.
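The two memory limits in the comparison can be related by simple arithmetic. The sketch below assumes the slide's "MB" and "TB" are binary units (2^20 and 2^40 bytes); under that assumption the 1 MB limit corresponds to 20 address bits and the 16 TB limit to 44.

```python
def address_bits(addressable_bytes: int) -> int:
    """Number of address bits needed to reach every byte in the range."""
    return (addressable_bytes - 1).bit_length()


MIB = 2**20  # assumed meaning of "MB" on the slide
TIB = 2**40  # assumed meaning of "TB" on the slide

legacy_bios_limit = 1 * MIB
uefi_limit = 16 * TIB

print(address_bits(legacy_bios_limit))  # bits needed for 1 MB
print(address_bits(uefi_limit))         # bits needed for 16 TB
print(uefi_limit // legacy_bios_limit)  # ratio between the two limits
```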


Blade server: Integrated Management Module (IMM)

• Integrated Management Module (IMM)
  – Replacement for BMC
  – LAN over USB
  – OS drivers included in Windows and Linux


Blade server six system states

• State 1: There is no AC
  – Data gathering: Visual
  – Data analysis: PDSG
• State 2: There is AC power but no DC
  – Data gathering: Advanced Management Module (AMM) and IMM; Light Path
  – Data analysis: System event log
• State 3: There is AC and DC power but the system fails to complete POST
  – Data gathering: Checkpoint codes; F1 and F2; beep codes (prior to UEFI); adapter BIOS messages
  – Data analysis: PDSG; Retain tips; IBM Support Web site
• State 4: There is AC and DC power, the system completes POST but the NOS fails to start loading
  – Data gathering: F2 diagnostics
  – Data analysis: PDSG; Retain tips
• State 5: There is AC and DC power, the system completes POST but the NOS fails to complete loading
  – Data gathering: NOS boot messages; 'Blue Screen'; 'Safe' mode
  – Data analysis: NOS vendor messages
• State 6: There is AC and DC power, the system completes POST and the NOS completes loading but stops during operation
  – Data gathering: DSA; NOS event logs
  – Data analysis: DSA

[Figure: state progression from AC, to AC/DC, to POST, to NOS start, complete, and stop.]
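The six states form an ordered checklist: the first stage that fails identifies the state, and the state points to the data-gathering sources. The sketch below illustrates that ordering only; the type and function names are hypothetical, and the grouping of sources for state 2 is one reading of the slide's table.

```python
from typing import Dict, List, NamedTuple


class Observation(NamedTuple):
    """What was observed on the blade server, in boot order."""
    ac_power: bool
    dc_power: bool
    post_completed: bool
    nos_started_loading: bool
    nos_completed_loading: bool
    nos_still_running: bool


def system_state(obs: Observation) -> int:
    """Return the state number (1-6) of the first failed stage, or 0 if none failed."""
    for number, stage_ok in enumerate(obs, start=1):
        if not stage_ok:
            return number
    return 0


# Data-gathering sources per state, as listed in the table above.
DATA_GATHERING: Dict[int, List[str]] = {
    1: ["Visual"],
    2: ["AMM and IMM", "Light Path"],
    3: ["Checkpoint codes", "F1 and F2", "Beep codes (prior to UEFI)",
        "Adapter BIOS messages"],
    4: ["F2 diagnostics"],
    5: ["NOS boot messages", "'Blue Screen'", "'Safe' mode"],
    6: ["DSA", "NOS event logs"],
}

# Example: power is present and POST completes, but the NOS never starts loading.
observed = Observation(True, True, True, False, False, False)
state = system_state(observed)
print(state, DATA_GATHERING[state])
```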


Advanced Management Modules (AMM): Overview

[Figure: Advanced Management Module LEDs and connectors. Callouts: video connector; power-on, activity, and error LEDs; serial console connector (RJ45); 10/100 Ethernet connector (RJ45) with port link LED and port activity LED; USB dual stack; pin-hole reset; release handle; MAC address.]

• The Management Module stores all event and error information for the BladeCenter
• The Management Module configuration data is stored both in itself and on the midplane
  – To reset the IP address back to the default settings, press and hold the IP reset button for 3 seconds or less


Recovering Management Module TCP/IP address

• MM configuration data is stored in the midplane
  – To reset a TCP/IP address only:
    • Remove the cable from the MM Ethernet port
    • Press and hold the IP reset button for 3 seconds or less
  – TCP/IP address will reset to 192.168.70.125/255.255.255.0
  – Simply replacing the MM will cause the replacement MM to adopt the same values as the original MM
    • PERFORM ALL RESET STEPS BEFORE REPLACING THE MM


Management Module full reset: Factory defaults

• MM configuration data is stored in the midplane
  – To force a complete MM reset (including password):
    • Remove the cable from the MM Ethernet port
    • Press and hold the IP reset button for 5 seconds
    • Release the IP reset button for 5 seconds
    • Press and hold the IP reset button for 10 seconds
  – TCP/IP address will be reset to 192.168.70.125/255.255.255.0
  – All IDs and passwords will be deleted (except USERID/PASSW0RD)
  – Simply replacing the MM will cause the replacement MM to adopt the same values as the original MM
    • PERFORM ALL RESET STEPS BEFORE REPLACING THE MM


Advanced management event log


Problem determination: Blade server example

• Example of a memory DIMM problem
  – Display of BladeCenter Front Panel LEDs

Management Module web interface indicating error LEDs


Problem determination: Blade server example

• Example of a memory DIMM problem
  – Display of the Blade server front panel LEDs

Advanced Management Module Blade server LEDs


Problem determination: Blade server example

• Example of a memory DIMM problem
  – Display of the BladeCenter Event Log

Advanced Management Module Event Log


Problem determination: Blade server example

• Using the IBM Problem Determination guide for the IBM BladeCenter HS21
  – Locate the error symptom code in the log (in this example: 289)
  – Match the table entry to the code

Check POST error log for error message 289:
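The two steps above (locate the code in the log, then match it to a table entry) amount to a search followed by a lookup. The sketch below is illustrative only: the log line format, the helper names, and the description stored for code 289 are assumptions; the real text and corrective action come from the Problem Determination guide.

```python
import re
from typing import Dict, List

# Hypothetical excerpt of a symptom table. The description for 289 is a
# placeholder based on this unit's memory DIMM example, not the guide's wording.
SYMPTOM_TABLE: Dict[str, str] = {
    "289": "Memory DIMM problem (see the guide's table entry for the action)",
}


def find_post_error_codes(log_text: str) -> List[str]:
    """Extract numeric codes from lines of the assumed form 'POST error <code>'."""
    return re.findall(r"POST error (\d+)", log_text)


def match_table_entries(codes: List[str]) -> Dict[str, str]:
    """Match each code to its table entry, flagging codes not in the excerpt."""
    return {code: SYMPTOM_TABLE.get(code, "not in table; consult the guide")
            for code in codes}


# Assumed log format, for illustration only.
sample_log = "Blade_03 POST error 289\nBlade_03 powered on"
print(match_table_entries(find_post_error_codes(sample_log)))
```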


Problem determination: Blade server example

• Consult the IBM Installation Guide for the HS21
  – Proper DIMM installation procedure

HS21 DIMM Installation slot and order


Problem determination: Blade server example

• Verifying fix and proper operation

AMM Status Display and Event Log


Problem determination: Blade servers

• What do you do if:

– Blade server powered down for no apparent reason

– Blade server does not power on, the system-error LED on the BladeCenter system-LED panel is lit, the blade error LED on the blade server LED panel is lit, and the system-error log contains the following message: "CPUs Mismatched"

– Some components do not report environmental status (temperature, voltage)

– Switching KVM control between blade servers gives USB device error


Ethernet switch modules: Addressing issues

• What do you do if:
  – You have duplicate IP address reported on the ESM
  – You have duplicate IP address reported on the blade server
  – You have a native VLAN mismatch reported on the ESM
  – There are connection problems to the blade servers
  – The DHCP server uses up all IP addresses and the blade server still cannot get an address


Problem determination: Ethernet switch I/O modules

• Hardware failures
  – Not very common
  – On MM, look under I/O Module Tasks -> Power/Restart to see diagnostic code after reboot. Also look at fault LED on the Ethernet Switch Module
• Software failures
  – Not very common
  – As with all products, software bugs do exist
  – Reference the latest code readme file for a list of resolved bugs with each release of code
• Misconfiguration of Ethernet Switch Module or other component
  – This is the most common issue encountered
  – Often requires close cooperation between different administrative groups to resolve


Ethernet switch modules: Configuration issues

• Most common issue encountered
  – May be with the Ethernet Switch Module, a device upstream or the server within the BladeCenter
  – May also be misconfiguration on the Management Module

• Same tools used to troubleshoot configuration issues can also be used to help isolate broken hardware and software bugs

• Usually requires close cooperation between network administrators and server administrators

• Often helps to have special tools (for example, network sniffer) to understand and resolve problem


Ethernet switch modules: Basic rules

• Do not attach cables to the ESM until both sides of the connection are configured

• Do not put the blade servers on the VLAN that the ESM uses for its management VLAN interface

• Make sure the ESM firmware (IOS) code is upgraded

• Decide the ESM management path (via Management Module or ESM uplinks) and configure for it


BladeCenter management interfaces

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


BladeCenter AMM: System status screen

Navigation menu

Main information window


System Event Log (SEL) screen

• This screen shows event history of the BladeCenter


Hardware Vital Product Data (VPD)

• This screen shows information relating to the hardware in the BladeCenter


Rules for I/O module management

• In-band management
  – Use the AMM path to an I/O module
    • Provides centralized management of all I/O modules
      – All activities and reporting are through a single Ethernet port
      – Makes LAN configuration easier
    • Requires MM and all I/O modules to be on the same IP subnet
• Out-of-band management
  – Requires enablement of external management over all ports
    • May require management VLAN configuration
    • Access will involve many Ethernet ports
    • I/O module need not be on the same IP subnet as the MM
      – If subnets are different, AMM path to I/O module is unavailable
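The same-subnet rule above is easy to check before cabling anything. The sketch below uses Python's standard `ipaddress` module with the default MM address and mask quoted earlier in this unit; the I/O module addresses and the function name are hypothetical examples.

```python
import ipaddress
from typing import Dict, List


def on_mm_subnet(mm_address: str, netmask: str,
                 module_addresses: List[str]) -> Dict[str, bool]:
    """Report, per I/O module address, whether it shares the MM's IP subnet."""
    # strict=False lets a host address plus mask stand in for its network.
    network = ipaddress.ip_network(f"{mm_address}/{netmask}", strict=False)
    return {addr: ipaddress.ip_address(addr) in network
            for addr in module_addresses}


# MM default address and mask from this unit; module addresses are made up.
result = on_mm_subnet("192.168.70.125", "255.255.255.0",
                      ["192.168.70.127", "10.0.0.5"])
print(result)
```

A module reported as `False` would not be reachable over the AMM path and would need to be managed through its external ports instead.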


I/O module tasks: Close up


I/O module tasks: Advanced switch management


Ethernet switch I/O module Web interface


CIGESM Web interface


Nortel ESM Web interface


Fibre Channel switch module Web interface

• SAN Utility (QLogic)
  – Full function GUI
• SAN Browser (QLogic)
  – Limited functionality
• Switch Explorer (Brocade)
  – Limited functionality


Firmware updates and settings

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


UpdateXpress CD-ROM package

• UpdateXpress
  – Bootable CD-ROM
  – Supports maintenance of system firmware and Windows device drivers
    • Automatically detects current device-driver and firmware levels
    • Gives the option of selecting specific upgrades or allowing UpdateXpress to update all of the system levels it detected as needing upgrades
    • Can be installed using local DVD or over network using the AMM


UpdateXpress firmware update scripts

• UpdateXpress Firmware Update Scripts for BladeCenter (UXBC)
  – Process that enables firmware updates to be run in a remote, unattended fashion
    • Requires a management station and supporting software
      – Windows or Linux OS
      – FTP and TFTP servers somewhere on the management LAN
      – UXBC discovery and deployment components
  – For more information, see http://www-03.ibm.com/systems/management/uxs.html


IBM preboot dynamic system analysis

• Provides problem isolation, configuration analysis, error log collection
  – Collects information about:
    • System configuration
    • Network interfaces and settings
    • Installed hardware
    • Light path diagnostics status
    • Service processor status and configuration
    • Vital product data, firmware, and UEFI configuration
    • Hard disk drive health


Advanced settings utility

• Enables the user to modify firmware settings from the command line
  – Supported on multiple operating system platforms
  – Enables remote changes to POST and BIOS settings
    • Does not require F1 access to a console session
  – Supports scripting through a batch processing mode
  – Does not update any of the firmware code
  – For more information, see http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?brandind=5000008&lndocid=MIGR-55021


Information gathering

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


Data gathering

• Read the BladeCenter data collection guide
  – Contains details of what logs and information are needed for escalations
  – Contains a step-by-step guide on how the logs are collected
  – For more information, see http://www-304.ibm.com/systems/support/supportsite.wss/docdisplay?lndocid=SERV-BLADE&brandind=5000008


Gathering information from blade servers

• Blade server logs can be gathered within the operating system
  – Use the following table to determine what utility to use

Type of blade server Operating system Type of gathering utility:

HS Series Windows Dynamic System Analysis

HS Series Linux Dynamic System Analysis

LS Series Windows Dynamic System Analysis

LS Series Linux Dynamic System Analysis

JS Series Linux SNAP

JS Series AIX SNAP

SNAP is built into AIX and SNAP for Linux on Power can be found at: http://techsupport.services.ibm.com/server/lopdiags.
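The table above is a two-key lookup: blade server series plus operating system gives the gathering utility. A minimal sketch of that lookup follows; the dictionary and function names are hypothetical, and only the combinations listed in the table are included.

```python
from typing import Dict, Tuple

# (blade server series, operating system) -> gathering utility, from the table above.
GATHERING_UTILITY: Dict[Tuple[str, str], str] = {
    ("HS", "Windows"): "Dynamic System Analysis",
    ("HS", "Linux"): "Dynamic System Analysis",
    ("LS", "Windows"): "Dynamic System Analysis",
    ("LS", "Linux"): "Dynamic System Analysis",
    ("JS", "Linux"): "SNAP",
    ("JS", "AIX"): "SNAP",
}


def gathering_utility(series: str, operating_system: str) -> str:
    """Return the utility for a series/OS pair, or raise if the table has no entry."""
    try:
        return GATHERING_UTILITY[(series, operating_system)]
    except KeyError:
        raise ValueError(
            f"No table entry for {series} Series on {operating_system}") from None


print(gathering_utility("JS", "AIX"))
```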


Gathering information from I/O switch modules

• Logs from a Brocade, Cisco, BNT or QLogic switch module can be captured within the switch interface
  – Enable capture text/console logging within the telnet application
  – Log in to the switch using telnet
  – Issue the command from the table below

Type of switch: Command:

Brocade showSupport

Cisco show tech-support

Nortel maint/tsdmp

QLogic support show
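For scripted log collection, the table above can be held as a simple mapping from switch type to support command. This sketch only looks the command up; it does not open a telnet session, and the function name is a hypothetical helper. The commands are copied from the table as written.

```python
from typing import Dict

# Switch type -> support-dump command, copied from the table above.
SUPPORT_COMMANDS: Dict[str, str] = {
    "brocade": "showSupport",
    "cisco": "show tech-support",
    "nortel": "maint/tsdmp",
    "qlogic": "support show",
}


def support_command(switch_type: str) -> str:
    """Return the log-capture command for a switch type (case-insensitive)."""
    key = switch_type.strip().lower()
    if key not in SUPPORT_COMMANDS:
        raise ValueError(f"No command listed for switch type: {switch_type!r}")
    return SUPPORT_COMMANDS[key]


print(support_command("Cisco"))
```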


IBM BladeCenter support resources

• Best practices

• Troubleshooting and problem determination

• BladeCenter management interfaces

• Firmware updates and settings

• Information gathering

• IBM BladeCenter support resources


IBM support Web site

• New central Web site for all server products: http://www-304.ibm.com/systems/support/
  – Select BladeCenter from the drop-down menu


Documentation

• Hardware Maintenance Manual
  – Available electronically (Adobe Acrobat .PDF format) from the IBM support Web site
    • Primary support document for diagnostics and troubleshooting
• User’s Guide, Installation Guide
  – System documentation that ships with the BladeCenter and with options such as blade servers and switch modules
    • Useful for confirming shipping group contents (missing parts, and so on) and initial customer setup


IBM Blade Server references

• IBM BladeCenter Products and Technology
  – http://www.redbooks.ibm.com/cgi-bin/searchsite.cgi?query=bladecenter
• IBM ServerProven: Compatibility for BladeCenter Products
  – http://www-03.ibm.com/servers/eserver/serverproven/compat/us/
• System x Reference (xREF)
  – http://www.redbooks.ibm.com/xref/usxref.pdf
• Intel Products
  – http://www.intel.com/products/server/processors/index.htm
• AMD Products
  – http://www.amd.com/us/products/server/Pages/server.aspx


Key words

• Advanced Management Module (AMM)
• Alternating Current (AC)
• Basic Input/Output System (BIOS)
• British thermal unit (BTU)
• Central Processing Unit (CPU)
• Cisco Intelligent Gigabit Ethernet Switch Module (CIGESM)
• Command-line interface (CLI)
• Compact Disc Read-Only Memory (CD-ROM)
• Dynamic Host Configuration Protocol (DHCP)
• Ethernet switch modules (ESM)
• Fibre Channel Switch Module (FCSM)
• File Transfer Protocol (FTP)
• Graphical User Interface (GUI)
• IBM BladeCenter E (Enterprise)
• IBM BladeCenter H (High Performance)
• IBM BladeCenter HT (High Performance Telco)
• IBM BladeCenter S (Simplification)
• IBM BladeCenter T (Telco)
• Integrated Management Module (IMM)
• Input-output (I/O)
• Internet Protocol (IP)
• Interrupt Request (IRQ)
• Jumper (J)
• Keyboard, Video, and Mouse (KVM)
• Local-Area Network (LAN)
• Management Module (MM)
• Non-Maskable Interrupt (NMI)
• Operating System (OS)
• Peripheral Component Interconnect (PCI)
• Power Distribution Unit (PDU)
• Power On Self Test (POST)
• Remote Supervisor Adapter II (RSA II)
• Secure Sockets Layer (SSL)
• Serial over LAN (SoL)
• Service Pack (SP)
• Service Support Representative (SSR)
• Simple Mail Transfer Protocol (SMTP)
• Simple Network Management Protocol (SNMP)
• Storage Area Network (SAN)
• System Event Log (SEL)
• Transmission Control Protocol (TCP)
• Trivial File Transfer Protocol (TFTP)
• Unified Extensible Firmware Interface (UEFI)
• UpdateXpress Firmware Update Scripts for BladeCenter (UXBC)
• Virtual Local Area Network (VLAN)
• Vital Product Data (VPD)
• Volt (V)
• Watt (W)

Checkpoint (1 of 2)

1. The _______________________ stores all major event and error information for the BladeCenter and is the starting point for PD.

a. Ethernet Switch Module (ESM)

b. AMM

c. BIOS

d. Blade Server operating system log

2. True/False: In planning the BladeCenter management network, bandwidth is the primary consideration.

3. The __________ enables the user to modify firmware settings from the command line.

4. True/False: While AMM management can be done through a Web interface, all switch modules must be configured using command line.

Checkpoint solutions (1 of 2)

1. The _______________________ stores all major event and error information for the BladeCenter and is the starting point for PD.

a. Ethernet Switch Module (ESM)

b. AMM

c. BIOS

d. Blade Server operating system log

Answer: b

2. True/False: In planning the BladeCenter management network, bandwidth is the primary consideration.

Answer: False

3. The __________ enables the user to modify firmware settings from the command line.

Answer: Advanced Settings Utility (ASU)

4. True/False: While AMM management can be done through a Web interface, all switch modules must be configured using command line.

Answer: False

Checkpoint (2 of 2)

5. Select the correct statement regarding Blade Server status indicators.

a. Memory and processor LEDs are on the Blade Server front panel

b. All Blade Server status LEDs are on the Light Path diagnostics panel

c. Blade Server status and error LEDs are on the Front Panel, Control Panel and adjacent to components on the system board

d. Light Path status and error indicators require the Blade to be powered on

6. True/False: The UEFI is a functional replacement for legacy BIOS.

7. True/False: To diagnose a Blade Server hardware problem, the first step to take would be to remove the Blade from the chassis and check the system board LEDs.

8. True/False: As a rule, power consumption is directly related to resultant heat output.

9. Which function should be used to view Service Processor configuration and hard disk drive health?

a. AMM Event Log

b. PreBoot DSA

c. AMM Monitor status page

Checkpoint solutions (2 of 2)

5. Select the correct statement regarding Blade Server status indicators.

a. Memory and processor LEDs are on the Blade Server front panel

b. All Blade Server status LEDs are on the Light Path diagnostics panel

c. Blade Server status and error LEDs are on the Front Panel, Control Panel and adjacent to components on the system board

d. Light Path status and error indicators require the Blade to be powered on

Answer: c

6. True/False: The UEFI is a functional replacement for legacy BIOS.

Answer: True

7. True/False: To diagnose a Blade Server hardware problem, the first step to take would be to remove the Blade from the chassis and check the system board LEDs.

Answer: False

8. True/False: As a rule, power consumption is directly related to resultant heat output.

Answer: True

9. Which function should be used to view Service Processor configuration and hard disk drive health?

a. AMM Event Log

b. PreBoot DSA

c. AMM Monitor status page

Answer: b
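
The rule in question 8 can be made concrete with the standard unit conversion between watts and BTU per hour (1 W is approximately 3.412 BTU/hr). The sketch below is illustrative only: the function name and the 2000 W figure are hypothetical and are not taken from any BladeCenter specification.

```python
# 1 watt of continuous power draw dissipates about 3.412 BTU of heat per hour.
BTU_PER_HOUR_PER_WATT = 3.412142


def watts_to_btu_per_hour(watts: float) -> float:
    """Convert electrical power draw (W) to heat output (BTU/hr)."""
    if watts < 0:
        raise ValueError("power draw cannot be negative")
    return watts * BTU_PER_HOUR_PER_WATT


if __name__ == "__main__":
    # Hypothetical chassis power draw, chosen only for illustration.
    example_watts = 2000.0
    example_btu = watts_to_btu_per_hour(example_watts)
    print(f"{example_watts:.0f} W is roughly {example_btu:.0f} BTU/hr")
```

Because the relationship is linear, doubling the power draw doubles the heat that the room cooling must remove.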

Unit summary

Having completed this unit, you should be able to:

• Identify the BladeCenter components used to provide PD information

• List the planning elements required for the BladeCenter management network

• Select the functions available to modify firmware settings

• List the blade server indicators and Light Path Components

• Select the steps appropriate in diagnosing blade server hardware failures

• Identify the utility to use in displaying BladeCenter component health