
Page 1: Availability Manager V3.0-2 Overview

© 2009 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice

Availability Manager V3.0-2 Overview

Barry Kierstein

Hewlett-Packard

Page 2: Availability Manager V3.0-2 Overview

Overview of This Session:

• Availability Manager Overview
• Availability Manager Components
• Availability Manager Installation
• Availability Manager Configuration
• Availability Manager New Features for V3.0-2
• Availability Manager Gotcha Items
• Availability Manager Unsupported Features
• Availability Manager Data Collection Considerations
• Availability Manager Live Demonstration

Page 3: Availability Manager V3.0-2 Overview

Availability Manager Overview – Let’s get started!

Page 4: Availability Manager V3.0-2 Overview

Availability Manager Overview

• Real-time display of system(s) being monitored; similar to MONITOR but with additional capabilities
• Error and Information display: issues warnings when resources are low
• Can be used to “fix” various problems
• Display portion is easy to learn (point and click)
• Default setup is a good place to start finding performance bottlenecks and resource contentions
• With some customization, the Availability Manager helps pinpoint problems specific to the systems being monitored

Page 5: Availability Manager V3.0-2 Overview

Availability Manager Overview

• Collects data on one or more nodes (systems), analyzes the data, displays it, and issues warnings
• For Alpha and VAX, requires only an OpenVMS license to collect data
• For I64, requires the EOE platform or the Avail_Man PAK to collect data
• Separate installation kit:
  −Included on the CD-ROM distribution kit
  −Download is available from the OpenVMS homepage
• Manuals included in hardcopy documentation kit, distribution kit, and on-line documentation kit

Page 6: Availability Manager V3.0-2 Overview

Availability Manager Groups

• Systems can be grouped together for analysis
• All members of an OpenVMS Cluster must have the same group name for correct clusterwide data collections
• Unclustered systems can be put into the same group
• Availability Manager can be configured to display information only from specified groups, reducing the number of systems being monitored
• Also known as AMDS groups
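
The effect of restricting the display to specified groups can be pictured as a simple filter. The following is a minimal sketch in Python, not the Data Analyzer's implementation; the group names PRODCLU and LABGRP, the node names PEAR and PLUM, and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical (node, AMDS group) pairs as announced by Data Collectors.
announcements = [
    ("APPLE", "PRODCLU"),
    ("PEAR", "PRODCLU"),
    ("PLUM", "LABGRP"),
]

def nodes_by_group(pairs, wanted_groups):
    """Group node names by AMDS group, keeping only the wanted groups."""
    wanted = {g.upper() for g in wanted_groups}
    grouped = defaultdict(list)
    for node, group in pairs:
        if group.upper() in wanted:
            grouped[group.upper()].append(node)
    return dict(grouped)

# Only nodes in PRODCLU are kept; PLUM in LABGRP is ignored.
monitored = nodes_by_group(announcements, ["PRODCLU"])
print(monitored)
```

Because every cluster member carries the same group name, filtering by group keeps or drops a cluster as a whole.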

Page 7: Availability Manager V3.0-2 Overview

Availability Manager Components

• Three parts:
  −Data Collector gathers data on system(s) being monitored
  −Data Analyzer collects data from the Data Collectors and displays the data
  −Data Server allows Data Analyzers to collect data over an IP-based wide area network (WAN)

Page 8: Availability Manager V3.0-2 Overview

Availability Manager Components

• Data Analyzer:
  −A Java-based application
  −Runs on OpenVMS Alpha platform from V7.3-2 and later with Motif or X-Windows
  −Runs on OpenVMS I64 platform from V8.2-1 and later with Motif or X-Windows
  −Runs on Intel platforms under Windows 2000 and Windows XP Professional

Page 9: Availability Manager V3.0-2 Overview

Availability Manager Components

System Overview window

Page 10: Availability Manager V3.0-2 Overview

Availability Manager Components

• Data Collector:
  −Consists of an OpenVMS device driver, configuration and startup files
    • Device driver file is RMDRIVER.EXE on VAX, SYS$RMDRIVER.EXE on Alpha and I64
    • Device shows up with $ SHOW DEVICE RMA0: command
  −Runs on Itanium, Alpha and VAX platforms
  −Runs on OpenVMS V6.2 and later
  −Sends out a Hello multicast message to announce an OpenVMS system to the Data Analyzer
  −Only collects data when a Data Analyzer sends a data collection request, uses little CPU time

Page 11: Availability Manager V3.0-2 Overview

Availability Manager Components

• Data Server
  −The Availability Manager uses its own protocol (AMDS protocol) for communication between the Data Analyzer and each Data Collector
    • Connection not dependent on network software to work (IP, DECnet, LAT, etc.)
    • Data collection and fixes often work even when the network on the system is not functioning or the system is hung

Page 12: Availability Manager V3.0-2 Overview

Availability Manager Components

• Data Server
  −Data Server allows a Data Analyzer to collect data over an IP-based network
    • Data Server resides on the same extended LAN as the OpenVMS systems so it can communicate to the Data Collectors using the AMDS protocol
    • Data Analyzer connects to the Data Server using an IP-based secure socket connection over a WAN or LAN
  −A Data Server can accept connections from several Data Analyzers
  −A Data Analyzer can connect to several Data Servers
    • For redundancy, one could have two Data Servers on the same LAN

Page 13: Availability Manager V3.0-2 Overview

Availability Manager Installation

• Availability Manager kits
  −OpenVMS Data Collector kit and manifest for secure delivery
    • Contains files for each OpenVMS version and platform
    • Can use SYSMAN> DO command to install on a cluster
    • If updating the Data Collector, a system reboot is necessary to remove the old Data Collector
  −OpenVMS Data Analyzer/Server kit and manifest
    • Contains files for both the Data Analyzer and Data Server
    • System reboot is not necessary with this kit
  −Windows 2000/XP kit
    • Normal Windows installation, requires a reboot to install a driver
    • Install using Administrator account or equivalent

Page 14: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Collector configuration
  −Data Collector password
    • In file SYS$MANAGER:AMDS$DRIVER_ACCESS.DAT
    • Authentication between Data Analyzer and Data Collector
    • A Data Collector can have several passwords allowing for various access rights and scopes
    • Considerations for passwords
      −Access rights – Read, Write and Control
      −Scope for a particular password
        • OpenVMS – password for all OpenVMS systems
        • AMDS group – common for clusters
        • Individual node

Page 15: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Collector configuration
  −Data Collector settings
    • In file SYS$MANAGER:AMDS$LOGICALS.COM
    • AMDS$GROUP_NAME – set as desired, one per cluster
    • AMDS$DEVICE – Network adapter used for communications using the AMDS protocol
      −Data Analyzer connections to Data Collectors
      −Data Server connections to Data Collectors
      −Note: Data Analyzer to Data Server connections use the IP protocol. The network adapters used are controlled by the IP stack on the particular system.

Page 16: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Collector configuration
  −Data Collector settings
    • Hello multicast message settings
      −AMDS$RM_DEFAULT_INTERVAL – Broadcast interval in seconds for Hello multicast messages when the system is not being monitored
      −AMDS$RM_SECONDARY_INTERVAL – Broadcast interval in seconds for Hello multicast messages when the system is being monitored
      −Determines how quickly the Data Analyzer discovers all the systems. For instance, if the secondary interval is 20 seconds for each system on a LAN, then it will take up to 20 seconds for all the systems on the LAN to be discovered.
      −Each message is one packet of around 200 bytes, contributes little to the network traffic
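
The trade-off between discovery time and Hello traffic can be checked with a little arithmetic. This is a minimal sketch in Python; the system count of 50 is a hypothetical figure, the 200-byte packet size is the approximate value from the slide, and the function names are illustrative only.

```python
def hello_traffic_bytes_per_second(systems, interval_seconds, packet_bytes=200):
    """Average Hello traffic on the LAN if every system uses the same interval."""
    return systems * packet_bytes / interval_seconds

def worst_case_discovery_seconds(interval_seconds):
    """A Data Analyzer that starts just after a Hello waits up to one interval."""
    return interval_seconds

# Hypothetical LAN: 50 systems, 20-second interval, ~200-byte packets.
rate = hello_traffic_bytes_per_second(50, 20)
print(f"{rate:.0f} bytes/s of Hello traffic")
print(f"up to {worst_case_discovery_seconds(20)} s to discover every system")
```

Halving the interval halves the worst-case discovery time and doubles the Hello traffic, which stays small either way at this packet size.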

Page 17: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Collector configuration
  −Data Collector startup
    • SYS$STARTUP:AMDS$STARTUP is used to start and stop the Data Collector, P1 is the function
      −START – Loads the configuration data and passwords, and starts the Data Collector. Put this command in SYS$MANAGER:SYSTARTUP_VMS.COM after the network stacks have been started so the MAC addresses of the network adapters have their final value
      −STOP – Stops the Data Collector
      −RESTART – Stops and restarts the Data Collector. This is useful if you change the configuration data or passwords, and want the changes loaded into the Data Collector
      −STATUS – Current status of the Data Collector
      −HELP – List of possible functions

Page 18: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Server configuration
  −Authentication between the Data Analyzer and the Data Server is by Kerberos public and private keys
  −Create key pair on Data Server system
    • Start Data Analyzer, create keys, export public key
  −Create key pair on Data Analyzer system
    • Start Data Analyzer, create keys, copy to Data Server system
  −Covered in Chapter 2 of the Availability Manager User's Guide

Page 19: Availability Manager V3.0-2 Overview

Availability Manager Configuration

• Data Analyzer configuration
  −Import any Data Server public keys
    • Start Data Analyzer
    • Import keys in Network Connection dialog box
  −Input Data Collector passwords
    • Use the Security tab in the Customization dialog box
    • Passwords can be entered at the appropriate level
      −OpenVMS level – Customize in System Overview
      −AMDS group level – Right-click on group in System Overview
      −Node level – Customize in Node pane or right-click on a node
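
One way to picture the three password levels is a lookup that prefers the most specific entry. The slides do not state the precedence order, so the node > group > OpenVMS ordering below is an assumption, and the table layout, passwords, group names and function name are all hypothetical. A minimal sketch in Python:

```python
# Hypothetical password table keyed by (level, name). The precedence
# node > group > OpenVMS-wide is an assumption for illustration only.
passwords = {
    ("os", "OpenVMS"): "OSWIDEPW",
    ("group", "PRODCLU"): "GROUPPW",
    ("node", "APPLE"): "NODEPW",
}

def password_for(node, group, table):
    """Return the most specific password configured for a node, or None."""
    for key in (("node", node), ("group", group), ("os", "OpenVMS")):
        if key in table:
            return table[key]
    return None

print(password_for("APPLE", "PRODCLU", passwords))  # node-level entry
print(password_for("PEAR", "PRODCLU", passwords))   # falls back to the group
print(password_for("PLUM", "LABGRP", passwords))    # falls back to OpenVMS level
```

Entering a password at the group level is usually enough for a cluster, since all members share one AMDS group name.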

Page 20: Availability Manager V3.0-2 Overview

Availability Manager Startup

• The first window to appear is the System Overview Window
• Event data also goes to the event log file AnalyzerEvents.LOG
• On OpenVMS, you can set the location of the event log file with logical names

Page 21: Availability Manager V3.0-2 Overview

System Overview Window

• Initially the System Overview window is empty. Systems are displayed as the Hello multicast message is received from each Data Collector
• Shows all the systems being monitored in one window
• Information includes the Name, Utilization, O.S. and Hardware versions
• Allows customizations at the application and operating system levels
• Shows the connection used to gather the data (network adapter, connections to Data Servers)

Page 22: Availability Manager V3.0-2 Overview

Availability Manager Network Connection dialog box

Page 23: Availability Manager V3.0-2 Overview

Availability Manager System Overview Window

Page 24: Availability Manager V3.0-2 Overview

Availability Manager Data Server Statistics

Page 25: Availability Manager V3.0-2 Overview

Availability Manager Read from Data Server Statistics

Page 26: Availability Manager V3.0-2 Overview

Availability Manager New Features

• Data Collection over IP
  −Data Server to tunnel AMDS protocol over IP
  −Avail_Man_Ana kit renamed to Avail_Man_Ana_Srvr to show that the Data Analyzer and Data Server reside in the same kit – must remove old kit due to name change
• Java 5.0 JVM used by Availability Manager
  −Increased performance on OpenVMS
  −Requires ODS-5 disk – use /DESTINATION qualifier when installing the Data Analyzer/Server kit to direct the installation to an ODS-5 disk
  −AMDS$AM_DISABLE_OFFSCREEN_PIXMAP_SUPPORT logical can help remote X-window display performance

Page 27: Availability Manager V3.0-2 Overview

Availability Manager New Features

• System Overview window changes
  −New and revised columns
    • PFLTS shows total and hard page fault rates
    • PFW/COM shows number of processes in PFW and COM states
    • DC shows Data Collector capability version and Managed Object registration state
    • CPU Qs revamped to have more consequential states
      −PFW and COM removed, leaving COMO, MWAIT, COLPG & FPW
      −If total is non-zero, show all counts as n/n/n/n (see tooltip)
      −Events have changed to reflect PFW and COM removal
  −Memory tooltip shows memory and alignment fault info
    • Added HIALNR event for high alignment fault rate
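
The n/n/n/n display of the CPU Qs column can be illustrated with a short formatting function. This is a minimal sketch in Python; the function name is hypothetical, and showing a single "0" when every count is zero is an assumption, since the slide only describes the non-zero case.

```python
def format_cpu_queues(como, mwait, colpg, fpw):
    """Format the four queue counts in the order COMO/MWAIT/COLPG/FPW.

    When the total is non-zero, all four counts are shown as n/n/n/n.
    Showing a single "0" for an all-zero total is an assumption.
    """
    counts = (como, mwait, colpg, fpw)
    if sum(counts) == 0:
        return "0"
    return "/".join(str(c) for c in counts)

print(format_cpu_queues(0, 0, 0, 0))
print(format_cpu_queues(2, 0, 1, 0))
```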

Page 28: Availability Manager V3.0-2 Overview

Availability Manager New Features

• Data Collection for Logical Disks (LDcn:) devices
• Event Log enhancements
  −Each data connection has its own log file
  −Status column – shows when a threshold event begins, ends, is cancelled or expires
  −EventKey – unique key for an event on a node
    • For instance, all HICOMQ events for node APPLE
    • Can use $ SEARCH to easily find all occurrences of an event
  −EventID – unique key for a single event
    • Easily find the BEGIN and END/CANCELED/EXPIRED record for an event
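
The difference between EventKey and EventID is easiest to see on a few records. The following Python sketch works on hypothetical, already-parsed records; the actual log file layout is not shown in the slides, and the EventKey spelling "APPLE:HICOMQ", the numeric EventIDs and the function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical, already-parsed log records: (EventID, EventKey, Status).
# The real log file layout is not shown in the slides.
records = [
    (101, "APPLE:HICOMQ", "BEGIN"),
    (102, "PEAR:HICOMQ", "BEGIN"),
    (101, "APPLE:HICOMQ", "END"),
    (103, "APPLE:HICOMQ", "BEGIN"),
    (103, "APPLE:HICOMQ", "EXPIRED"),
]

def lifecycle_by_event_id(rows):
    """Collect the status sequence of each individual event."""
    seen = defaultdict(list)
    for event_id, _key, status in rows:
        seen[event_id].append(status)
    return dict(seen)

def event_ids_for_key(rows, event_key):
    """List the distinct EventIDs that share one EventKey, in first-seen order."""
    ids = []
    for event_id, key, _status in rows:
        if key == event_key and event_id not in ids:
            ids.append(event_id)
    return ids

# EventKey finds every HICOMQ occurrence on APPLE; EventID pairs one
# occurrence's BEGIN record with its closing record.
print(event_ids_for_key(records, "APPLE:HICOMQ"))
print(lifecycle_by_event_id(records)[101])
```

An EventID with only a BEGIN record, such as 102 here, would correspond to an event that is still active.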

Page 29: Availability Manager V3.0-2 Overview

Availability Manager New Features

• Fixes
  −Force a disk volume out of Mount Verify state
  −Force a shadow set member out of a shadow set that is in Mount Verify state
• Data Analyzer supports MAC address changes
  −CFGDON and PTHLST show MAC address used
  −CHGMAC and NEWMAC events show address changes
• SYS$STARTUP:AMDS$STARTUP.COM
  −STATUS parameter shows RMA0: status
  −START and RESTART have LOG qualifier to output configuration data sent to RMA0:

Page 30: Availability Manager V3.0-2 Overview

Availability Manager Gotcha Items

• Make sure the most recent AMNDIS50.SYS file on Windows systems is installed
  −Correct date is Nov 28, 2006
  −Driver from earlier date can crash system when a second Data Analyzer is started
• OpenVMS Data Analyzer/Server V3.0-2 kit requires Data Collector V3.0-2A kit
  −A check for required logical names is done when the Data Analyzer or Data Server is started. The logical names are defined in AMDS$STARTUP.COM, which is in the Data Collector kit.

Page 31: Availability Manager V3.0-2 Overview

Availability Manager Unsupported Features

• Installation on Windows Vista
  −Right-click -> Properties to install under compatibility mode
  −More testing and compatibility knowledge needed to put on the supported list
• Running the Data Analyzer on other OSes
  −Work done to allow the Data Analyzer to connect to Data Servers by using the JVM only, tested on Linux
  −Install JVM on system
  −Copy *.JAR and *.ZIP files into a subdirectory
  −Create script with JAVA command line from AMDS$AM_RUN.COM

Page 32: Availability Manager V3.0-2 Overview

Availability Manager Data Collection Considerations

• Data collections on a local LAN typically finish quickly, in less than a second or two for the largest data collections with many continuations
• Using a Data Server slows down the round trip time for data collections with many continuations
  −Affects systems with many processes, many disks or a large resource hash table
  −DCSLOW events are signaled when the data collection takes longer than the data collection interval
  −DCCOLT events document how long the data collection actually took in seconds
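
Why a Data Server connection matters most for collections with many continuations can be estimated roughly. This Python sketch assumes each continuation costs one full round trip and that continuations run one after another, which the slides do not state explicitly; the continuation count, round-trip times, collection interval and function names are hypothetical.

```python
def collection_seconds(continuations, round_trip_seconds):
    """Estimate collection time, assuming continuations run one after another."""
    return continuations * round_trip_seconds

def is_dcslow(elapsed_seconds, interval_seconds):
    """DCSLOW condition from the slide: collection outlasts its interval."""
    return elapsed_seconds > interval_seconds

# Hypothetical figures: 300 continuations, 10-second collection interval.
lan = collection_seconds(300, 0.001)   # direct on the LAN
wan = collection_seconds(300, 0.050)   # through a Data Server over a WAN
print(lan, is_dcslow(lan, 10))
print(wan, is_dcslow(wan, 10))
```

Under these assumed numbers the same collection stays well inside its interval on the LAN but overruns it across the WAN, which is the situation DCSLOW reports.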

Page 33: Availability Manager V3.0-2 Overview

Availability Manager Data Collection Considerations

• Lock contention data in particular can take many continuations to finish
  −Previously, 1K resource hash table entries were scanned per collection, so large tables resulted in hundreds of continuations
  −Since the data collection time is returned in the AMDS packet, the number of hash table entries scanned is now limited by a 1ms limit. This was the maximum collection time seen in scanning 1K hash table entries on a DEC 3000/400. On larger Alphas, this time limit results in scanning about 3K hash table entries.
  −This limit can be changed if necessary, but take care as the IOLOCK8 spinlock is held during the data collection
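
The effect of scanning about 3K entries per collection instead of 1K can be worked out directly. A minimal sketch in Python, assuming "1K" means 1024 entries and using a hypothetical table size of 262,144 entries; the function name is illustrative only.

```python
import math

def continuations_needed(table_entries, entries_per_collection):
    """Number of request/response rounds needed to scan the whole table."""
    return math.ceil(table_entries / entries_per_collection)

# Hypothetical resource hash table of 262,144 entries.
table = 262_144
fixed_1k = continuations_needed(table, 1024)    # fixed 1K entries per collection
timed_3k = continuations_needed(table, 3072)    # ~3K entries within the 1 ms limit
print(fixed_1k, timed_3k)
```

Scanning roughly three times as many entries per collection cuts the number of continuations to about a third, at the cost of holding IOLOCK8 for up to the full time limit on each pass.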

Page 34: Availability Manager V3.0-2 Overview

Availability Manager Live Demonstration

• Initial key configuration for Data Analyzer and Data Server

• Overview of new features

Page 35: Availability Manager V3.0-2 Overview

Availability Manager Contact Information

• Barry Kierstein
  −[email protected], [email protected]
• Shubhabrata Bose
  −[email protected]
• Karthigeyan Kasthuriregan
  −[email protected]
• Srividhya Subramanian
  −[email protected]
