SPECTRUM ® Condition Correlation User Guide (5175) r9.0


This documentation and any related computer software help programs (hereinafter referred to as the “Documentation”) is for the end user’s informational purposes only and is subject to change or withdrawal by CA at any time.

This Documentation may not be copied, transferred, reproduced, disclosed, modified or duplicated, in whole or in part, without the prior written consent of CA. This Documentation is confidential and proprietary information of CA and protected by the copyright laws of the United States and international treaties.

Notwithstanding the foregoing, licensed users may print a reasonable number of copies of the Documentation for their own internal use, and may make one copy of the related software as reasonably required for back-up and disaster recovery purposes, provided that all CA copyright notices and legends are affixed to each reproduced copy. Only authorized employees, consultants, or agents of the user who are bound by the provisions of the license for the product are permitted to have access to such copies.

The right to print copies of the Documentation and to make a copy of the related software is limited to the period during which the applicable license for the product remains in full force and effect. Should the license terminate for any reason, it shall be the user’s responsibility to certify in writing to CA that all copies and partial copies of the Documentation have been returned to CA or destroyed.

EXCEPT AS OTHERWISE STATED IN THE APPLICABLE LICENSE AGREEMENT, TO THE EXTENT PERMITTED BY APPLICABLE LAW, CA PROVIDES THIS DOCUMENTATION “AS IS” WITHOUT WARRANTY OF ANY KIND, INCLUDING WITHOUT LIMITATION, ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NONINFRINGEMENT. IN NO EVENT WILL CA BE LIABLE TO THE END USER OR ANY THIRD PARTY FOR ANY LOSS OR DAMAGE, DIRECT OR INDIRECT, FROM THE USE OF THIS DOCUMENTATION, INCLUDING WITHOUT LIMITATION, LOST PROFITS, BUSINESS INTERRUPTION, GOODWILL, OR LOST DATA, EVEN IF CA IS EXPRESSLY ADVISED OF SUCH LOSS OR DAMAGE.

The use of any product referenced in the Documentation is governed by the end user’s applicable license agreement.

The manufacturer of this Documentation is CA.

Provided with “Restricted Rights.” Use, duplication or disclosure by the United States Government is subject to the restrictions set forth in FAR Sections 12.212, 52.227-14, and 52.227-19(c)(1) - (2) and DFARS Section 252.227-7014(b)(3), as applicable, or their successors.

All trademarks, trade names, service marks, and logos referenced herein belong to their respective companies.

Copyright © 2008 CA. All rights reserved.

Contents

Preface
    What Is In This Guide
    Text Conventions
    Documentation Feedback
    Online Documentation

Chapter 1: Introduction
    Condition Correlation Components
        Conditions
        Rules
        Policies
        Domains
    Getting Started
        Open Condition Correlation Editor
        Condition Correlation Editor Window
    Create a Condition Correlation Domain

Chapter 2: Creating and Managing Conditions
    Create a Condition
    Edit a Condition
    Delete a Condition

Chapter 3: Creating and Managing Rules
    Create a Rule
    Edit a Rule
    Delete a Rule

Chapter 4: Creating and Managing Policies
    Create a Policy
    Edit a Policy
    Delete a Policy

Chapter 5: Creating and Managing Domains
    Create a Domain
        Create a Domain in the Condition Correlation Editor
        Create a Domain in the OneClick Console
    Edit a Domain
    Delete a Domain

Appendix A: Condition Correlation Examples
    Power Outage Example
    WAN Link Failure Example
    DiskFull Example
        EventDisp Entries
        How to Set Up the Sample Disk Full Condition Correlation
        Create Disk Problem Conditions
        Create Disk Problem Rules
        Create the DiskPolicy Policy
        Create the DiskMonitorDomain Correlation Domain
        How Will This Condition Correlation Work?
        Create a Clear Events Correlation

Appendix B: Special Topics
    Condition Correlation and Fault Isolation
    About Transfer Rules
    Advanced Correlations and Data Type Comparisons

Index

Preface

This guide describes how to use the Condition Correlation Editor to create and manage all Condition Correlation components and to implement a fault-correlation strategy for managing infrastructure alarms. It is intended for network management personnel who have a working knowledge of SPECTRUM event- and alarm-processing concepts, capabilities, and terminology.

What Is In This Guide

This guide is organized as follows:

Chapter 1: Introduction provides an overview of Condition Correlation concepts, describes how to access the Condition Correlation Editor, and provides instructions for defining a correlation domain in SPECTRUM.

Chapter 2: Creating and Managing Conditions describes how to create, edit, and delete correlation conditions.

Chapter 3: Creating and Managing Rules describes how to create, edit, and delete correlation rules.

Chapter 4: Creating and Managing Policies describes how to create, edit, and delete correlation policies.

Chapter 5: Creating and Managing Domains describes how to create, edit, and delete correlation domains.

Appendix A: Condition Correlation Examples provides specific examples of how Condition Correlation can be used.

Appendix B: Special Topics discusses special topics related to Condition Correlation capacities and implementation.

Text Conventions

The following text conventions are used in this document:

Element: Variables (the user supplies a value for the variable)
Convention used: Code format and italic in angle brackets (<>)
Example: Type the following:
DISPLAY=<workstation name>:0.0 export display

Element: The directory where you installed SPECTRUM (the user supplies a value for the variable)
Convention used: <$SPECROOT>
Example: Navigate to:
<$SPECROOT>/app-defaults

Element: Linux, Solaris, and Windows directory paths
Convention used: Unless otherwise noted, directory paths are common to all operating systems, with the exception that slashes (/) should be used in Linux and Solaris paths, and backslashes (\) should be used in Windows paths.
Example: <$SPECROOT>/app-defaults on Linux and Solaris is equivalent to <$SPECROOT>\app-defaults on Windows.

Element: On-screen text
Convention used: Code format
Example: The following line displays:
path="/audit"

Element: User-typed text
Convention used: Code format
Example: Type the following path name:
C:\ABC\lib\db

Documentation Feedback

To send feedback regarding SPECTRUM documentation, access the following web address:

http://supportconnectw.ca.com/public/ca_common_docs/docserver_email.asp

Thank you for helping us improve our documentation.

Online Documentation

SPECTRUM documentation is available online at the following address:

http://support.concord.com/support/secure/products/Spectrum_Doc/

Check this site for the latest updates and additions.


Chapter 1: Introduction

An integral member of the SPECTRUM family of root-cause analysis and service management technologies, Condition Correlation lets you configure SPECTRUM to identify the root-cause alarm among a barrage of seemingly unrelated alarms from a heterogeneous group of managed infrastructure resources (models). You use Condition Correlation to define the criteria that specify the causal problem event that precipitates a specific set of symptomatic problem events. You select the set of resources (models) that you want the correlation to consider, and you define this set as the correlation domain. If the logical criteria you defined for the correlation are met, SPECTRUM generates an alarm for the root-cause event, suppresses the alarms that are symptomatic of the root-cause alarm, and indicates the causal or symptomatic relationship in the alarm panel in OneClick.

Using Condition Correlation to help manage trouble-prone segments of your infrastructure is beneficial in the following ways:

It lets you correctly and efficiently respond to the real problem rather than symptomatic problems.

It lets you confirm and track problem trends and interdependencies.

It lets you respond quickly to changes in the infrastructure by letting you manage multiple Condition Correlation implementations from a single landscape.

Condition Correlation Components

When you work with Condition Correlation, you construct a system of components that define fault indicators, determine how faults are associated, and specify the resources evaluated by the system. This section provides an overview of these components:

Conditions on page 2

Rules on page 3

Policies on page 4

Domains on page 4


As you review the component descriptions, it is recommended that you also examine the predefined, CA-authored components included with the Condition Correlation application to better understand them. See Open Condition Correlation Editor on page 5 for information about accessing Condition Correlation components.

Conditions

A condition is the basic building block of the correlation system. Like a SPECTRUM alarm, a condition is a transitory occurrence on a resource, such as a status change. It exists only as long as the criteria that produce it are met. Like an alarm, a condition is always initiated by a set event and can be cleared by a clear event. When you define a condition, you identify the set and clear event types.

Note: A condition is also cleared automatically if the associated alarm (created by the set event) is destroyed, or if the condition is an implied one (created by a rule through its set event) and no set of conditions fulfills the rule anymore.

You can also define parameters for a condition, used to establish correlation criteria when you create correlation rules. A parameter can be any event variable data or any model attribute of the model associated with the condition. You can create new parameters, or you can create modified versions of existing parameters. See Creating and Managing Conditions on page 9 for more information about working with conditions.

Note: A correlation condition has no relationship with the condition attribute for a SPECTRUM model.
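The set and clear behavior described above can be pictured with a small sketch. This is an illustration only, not SPECTRUM code: the ConditionDef and ConditionTracker classes, the event codes, and the model name are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ConditionDef:
    """Hypothetical stand-in for a condition definition."""
    name: str
    set_event: int                     # event code that creates the condition
    clear_event: Optional[int] = None  # optional event code that clears it


@dataclass
class ConditionTracker:
    """Tracks which (condition name, model) instances currently exist."""
    definitions: list
    active: set = field(default_factory=set)

    def on_event(self, event_code: int, model: str) -> None:
        for definition in self.definitions:
            if event_code == definition.set_event:
                self.active.add((definition.name, model))
            elif event_code == definition.clear_event:
                self.active.discard((definition.name, model))


# The event codes below are made-up placeholders, not real SPECTRUM codes.
link_down = ConditionDef("LinkDown", set_event=0x1001, clear_event=0x1002)
tracker = ConditionTracker([link_down])

tracker.on_event(0x1001, "port-7")   # set event: the condition now exists
exists_after_set = ("LinkDown", "port-7") in tracker.active

tracker.on_event(0x1002, "port-7")   # clear event: the condition is gone
exists_after_clear = ("LinkDown", "port-7") in tracker.active
```

The sketch leaves out the automatic clearing cases (a destroyed alarm, or an implied condition that is no longer supported by a rule).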

Conditions and SPECTRUM Alarms

It is a good idea to define conditions that correspond to SPECTRUM alarms. First, if a condition’s set event is the same as an alarm’s set event, the condition instance instantiated when the alarm is generated and the alarm itself are linked within the correlation system. Second, this link lets Condition Correlation hide symptomatic alarms from the main alarm list in OneClick and relate symptomatic alarms to root-cause alarms. The symptomatic alarms, however, are listed in the Symptoms list of the root-cause alarm under the Impact tab.

Important! Alarms present at startup are not correlated.


Rules

A rule defines the relationship between two or more conditions when specific criteria are met. You can define a rule to stipulate that one condition is a symptom of, or the cause of, another. For example, you could associate a symptomatic SPM test threshold violation condition to a root-cause port LinkDown condition and apply this rule, in a policy, to a set of SPM test and port models, in a domain. Additionally, you can create a rule to indicate that one or more conditions imply that another exists.

Rules may use conditions in the following ways:

Exists: The condition is in the correlation domain.

Not Exists: The condition is not in the correlation domain. This lets you create rules that can only be satisfied if the condition does not exist in the correlation domain.

Counts: The condition is in the correlation domain, and it allows totals/limits/range comparisons using the Advanced Rule Criteria section of the Create Correlation Rule dialog. This lets you create rules that are satisfied only if a certain number of a particular condition exists, reaches a limit, or is in a user-defined range. In this case there is a ‘Condition Count’ parameter available for use in the rule criteria, but the other parameters for that condition may not be used.

See Creating and Managing Rules on page 13 for more information about working with rules.
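As a rough illustration of the three ways a rule can use a condition (Exists, Not Exists, and Counts), the following sketch checks them against the conditions currently present in a domain. The function name, its arguments, and the condition and model names are assumptions made for this example; the real evaluation happens inside SPECTRUM.

```python
from collections import Counter


def evaluate_uses(active, exists=(), not_exists=(), counts=None):
    """Check hypothetical Exists / Not Exists / Counts requirements.

    active:     iterable of (condition_name, model) pairs in the domain
    exists:     condition names that must be present
    not_exists: condition names that must be absent
    counts:     mapping of condition name -> (low, high) inclusive range
    """
    totals = Counter(name for name, _model in active)
    if any(totals[name] == 0 for name in exists):
        return False
    if any(totals[name] > 0 for name in not_exists):
        return False
    for name, (low, high) in (counts or {}).items():
        if not low <= totals[name] <= high:
            return False
    return True


active = [
    ("DiskFull", "server-1"),
    ("DiskFull", "server-2"),
    ("DiskFull", "server-3"),
    ("LinkDown", "port-1"),
]

# Three DiskFull conditions fall inside the 3..10 range.
satisfied = evaluate_uses(
    active,
    exists=["LinkDown"],
    not_exists=["PowerOutage"],
    counts={"DiskFull": (3, 10)},
)

# Raising the lower limit to 4 means the count requirement is not met.
not_satisfied = evaluate_uses(active, counts={"DiskFull": (4, 10)})
```

In this sketch the count is the only value available for a counted condition, which mirrors the restriction that its other parameters cannot be used in the rule criteria.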

Rule Patterns

Rules can be expressed by any of the following patterns:

Caused By on page 3

Implies on page 3

Implied Cause on page 4

Caused By

Condition X or a set of conditions is caused by condition Z.

When all of the symptom conditions exist (and the rule criteria apply) and the root-cause condition Z exists, the correlation is made. If Z is associated with an alarm, all symptomatic alarms are then hidden beneath it. The models keep the condition (color) they had before the correlation. For example, if one model’s yellow alarm hides another model’s red alarm, the second model remains red even though its alarm is no longer shown.

When any of the conditions are cleared, the correlation is not broken.
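A minimal sketch of how the Caused By pattern groups alarms, assuming conditions are identified only by name and that any additional rule criteria have already been checked. The function and the condition names are hypothetical and only illustrate the hiding of symptoms beneath the root cause.

```python
def correlate_caused_by(active, symptoms, root_cause):
    """Return {root_cause: [symptoms]} if the pattern holds, else {}.

    active: set of condition names that currently exist in the domain.
    """
    if root_cause in active and all(s in active for s in symptoms):
        return {root_cause: sorted(symptoms)}
    return {}


active = {"LinkDown", "ThresholdViolation", "FanFailure"}
correlation = correlate_caused_by(active, {"ThresholdViolation"}, "LinkDown")

# Symptoms are hidden from the main list and shown under the root cause.
hidden = {s for symptoms in correlation.values() for s in symptoms}
main_alarm_list = sorted(active - hidden)
```

Here FanFailure is untouched because no rule relates it to LinkDown, so it stays in the main list next to the root-cause alarm.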

Implies

Condition X or a set of conditions implies condition Z.

When all the symptoms exist and the rule criteria apply, the root-cause condition Z is created. Its set event is therefore generated, which might in turn create an alarm. If any of the symptoms are subsequently cleared and no other set of conditions supports the rule, Z is also cleared. Whether an alarm created by Z is also cleared depends on whether the condition has a clear event and whether that clear event clears the alarm. Therefore, depending on your configuration, the alarm may remain.

Implied Cause

Condition X or a set of conditions is the implied cause of condition Z.

This pattern combines both the previous patterns: Caused By and Implies.
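The Implies behavior (create Z when every symptom exists, clear Z when the rule is no longer supported) can be sketched as a small state machine. The ImpliesRule class and the condition names are hypothetical, and the sketch tracks only a single supporting set of symptoms, whereas the product considers every set that could fulfill the rule.

```python
class ImpliesRule:
    """Hypothetical model of an Implies rule with one set of symptoms."""

    def __init__(self, symptoms, implied):
        self.symptoms = frozenset(symptoms)
        self.implied = implied
        self.implied_active = False  # True while the rule supports the implied condition

    def evaluate(self, active):
        """Return 'set', 'clear', or None for the action taken.

        active: set of condition names present in the domain.
        """
        supported = self.symptoms <= set(active)
        if supported and not self.implied_active:
            self.implied_active = True
            return "set"      # the implied condition's set event is generated
        if not supported and self.implied_active:
            self.implied_active = False
            return "clear"    # the implied condition is cleared
        return None


rule = ImpliesRule({"UpsOnBattery", "TempHigh"}, implied="PowerOutage")
actions = [
    rule.evaluate({"UpsOnBattery"}),               # only one symptom: nothing happens
    rule.evaluate({"UpsOnBattery", "TempHigh"}),   # all symptoms: condition is set
    rule.evaluate({"UpsOnBattery", "TempHigh"}),   # still supported: no change
    rule.evaluate({"TempHigh"}),                   # a symptom cleared: condition is cleared
]
```

Whether an alarm raised by the implied condition also goes away is a separate question that depends on the condition's clear event, which this sketch does not model.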

Other Patterns

Condition Correlation also lets you construct more granular rule patterns using additional rule criteria that must be met before a correlation is established between two conditions. The criteria can be specified in terms of comparison between parameters for one condition and parameters of another or in terms of specific values.

For example, an instance of a LinkDown condition on a port model could be caused by an instance of a BoardPulled condition on a board if the slot number of the port is equal to the slot number of the board and both the port and the board are from the same device.
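The LinkDown and BoardPulled example can be written as a predicate over condition parameters. This is a sketch under assumptions: the ConditionInstance class and the parameter names "slot" and "device" are invented here to stand for parameters you would define on the two conditions.

```python
from dataclasses import dataclass


@dataclass
class ConditionInstance:
    """Hypothetical condition instance with its captured parameter values."""
    name: str
    params: dict  # parameter name -> value captured when the condition was set


def board_causes_link_down(port_cond, board_cond):
    """True if the LinkDown on the port can be attributed to the pulled board."""
    return (
        port_cond.name == "LinkDown"
        and board_cond.name == "BoardPulled"
        and port_cond.params["slot"] == board_cond.params["slot"]
        and port_cond.params["device"] == board_cond.params["device"]
    )


board = ConditionInstance("BoardPulled", {"slot": 3, "device": "switch-a"})
same_slot = ConditionInstance("LinkDown", {"slot": 3, "device": "switch-a"})
other_slot = ConditionInstance("LinkDown", {"slot": 4, "device": "switch-a"})
other_device = ConditionInstance("LinkDown", {"slot": 3, "device": "switch-b"})

matches = [
    board_causes_link_down(same_slot, board),
    board_causes_link_down(other_slot, board),
    board_causes_link_down(other_device, board),
]
```

Only the port in the same slot of the same device is correlated; the other two LinkDown conditions would remain independent alarms.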

Policies

A policy is a set of one or more rules. You can group any number of rules in a policy, and you can apply one or more policies to any number of resource groups (in a domain). Using policies simplifies the implementation of rules for multiple domains. When you add, edit, or remove rules from a policy, all implementations of the policy are updated as well.

See Creating and Managing Policies on page 17 for information about working with policies.

Domains

A domain is a group of resources collectively assessed by Condition Correlation based on the rules in the policies applied to it. A domain can include any number of diverse models of different model types, and it can have any number of policies applied to it. A domain is a SPECTRUM container model, and thus when you specify the resources in a domain, you are explicitly defining what you want and what you do not want evaluated by the policy or policies applied to it.


There are a number of ways to create a domain and populate it with resources. You can create a new domain and add resources on a per-resource basis, or you can create a domain from a service or Global Collection model, both of which are entities representing a collection of resources.

See Creating and Managing Domains on page 21 for information about working with domains.
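One way to picture the relationship between policies and domains is as shared references: a domain holds references to policies rather than copies, so a change to a policy reaches every domain that uses it. The classes, rule names, domain names, and resource names below are hypothetical and do not reflect how SPECTRUM stores these components.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    rules: list = field(default_factory=list)


@dataclass
class Domain:
    name: str
    resources: set = field(default_factory=set)
    policies: list = field(default_factory=list)  # references to shared Policy objects

    def effective_rules(self):
        """All rules that apply to this domain through its policies."""
        return [rule for policy in self.policies for rule in policy.rules]


disk_policy = Policy("DiskPolicy", rules=["DiskFullCausedByRaidFailure"])
east = Domain("EastServers", {"server-1", "server-2"}, [disk_policy])
west = Domain("WestServers", {"server-9"}, [disk_policy])

# Adding a rule to the policy changes what both domains evaluate.
disk_policy.rules.append("DiskWarningImpliesDiskProblem")
east_rules = east.effective_rules()
west_rules = west.effective_rules()
```

Each domain still has its own resource set, so the same rules are evaluated against different groups of models.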

Getting Started

This section describes how to access the Condition Correlation Editor, the utility with which you create and manage your correlation systems, and its functional parts. It also provides an overview of the process involved in creating a correlation system and describes how to monitor Correlation domain containers in OneClick.

Open Condition Correlation Editor

The Condition Correlation Editor lets you configure all Condition Correlation component settings. You must have OneClick administrative privileges to access the Condition Correlation Editor.

To open the Condition Correlation Editor

1. Open OneClick.

2. Select Tools, Utilities, Condition Correlation Editor.

The Condition Correlation Editor window opens. By default, it displays the Conditions tab list and any parameters defined for the selected condition.


Condition Correlation Editor Window

The Condition Correlation Editor window lets you create and manage all correlation system components. It also lists and provides access to all predefined (CA-authored) and user-defined components. The following is an example of the Condition Correlation Editor window:

The Condition Correlation Editor window contains the following:

Conditions tab

Invokes a list of CA-authored and user-authored conditions and a list of parameters (Parameters tab) for the selected condition. Not all conditions include parameters.

Rules tab

Invokes a list of CA-authored and user-authored rules and the rule’s correlation criteria (Rule Criteria tab) for the selected rule.

Policies tab

Invokes a list of CA-authored and user-authored policies and a list of rules (Rules tab) included in the selected policy.


Domains tab

Invokes a list of CA-authored and user-authored domains, a list of policies (Policies tab) applied to the selected domain, and a list of resources (Resources tab) included in the domain.

Filter field

Specifies the conditions, rules, policies, or domains list entry you want to display.

Create button

Depending on which tab you are working in, invokes the Create Correlation Condition, Create Correlation Rule, Create Correlation Policy, or Create Correlation Domain dialog.

Edit button

Depending on which tab you are working in, invokes the Edit Correlation Condition, Edit Correlation Rule, Edit Correlation Policy, or Edit Correlation Domain dialog.

Delete button

Deletes the selected condition, rule, policy, or domain.

Create a Condition Correlation Domain

Deploying a correlation system to a particular group of managed infrastructure resources is synonymous with creating a correlation domain that encompasses those resources and that has a correlation policy applied to it. Once the domain has been created on a landscape, the correlation system is in effect.

Note: Condition correlations are implemented in the SpectroSERVER or SpectroSERVERs. Therefore, they are not affected if the OneClick web server is stopped or started.

This section provides instructions for constructing the components required to implement the domain. You are not required to follow the exact order of the instructions. What is important is that you verify that the domain has at least one policy applied to it, that the policy includes at least one rule, and that the rule criteria are logically appropriate for the conditions the rule evaluates for correlative association.
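The structural part of that verification (at least one policy per domain, at least one rule per policy) is easy to express as a check. The sketch below assumes a domain described as a plain dictionary, which is an invention for this example; whether the rule criteria are logically appropriate still has to be judged by you and by testing.

```python
def validate_domain(domain):
    """Return a list of structural problems found in a domain description.

    domain: {"name": str, "policies": [{"name": str, "rules": [...]}, ...]}
    """
    problems = []
    policies = domain.get("policies", [])
    if not policies:
        problems.append(f"domain {domain['name']!r} has no policy applied")
    for policy in policies:
        if not policy.get("rules"):
            problems.append(f"policy {policy['name']!r} contains no rules")
    return problems


good = {
    "name": "DiskMonitorDomain",
    "policies": [{"name": "DiskPolicy", "rules": ["DiskRule"]}],
}
no_policy = {"name": "EmptyDomain", "policies": []}
empty_policy = {
    "name": "HalfBuiltDomain",
    "policies": [{"name": "DraftPolicy", "rules": []}],
}

good_problems = validate_domain(good)
no_policy_problems = validate_domain(no_policy)
empty_policy_problems = validate_domain(empty_policy)
```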

Important! Make every attempt to induce the problem situation you want to manage to test the viability of the correlation system before you deploy the system in a mission-critical environment.

To create a correlation domain

1. Create a domain and add the resources you want included in the domain. In a later step, you will apply one or more correlation policies to the domain.

Once you have created a domain, you can add resources to it and remove resources from it at any time.


2. Create one or more conditions you want evaluated by correlation rule criteria. If, however, you want to use available conditions only, skip this step.

3. Create the rule or rules that establish root-cause condition and symptomatic condition associations if criteria specified by the rules are met. If, however, you want to use available rules only, such as rules that specify predefined conditions, skip this step.

A rule evaluates two or more conditions. If rule criteria are met, Condition Correlation identifies one condition as the root-cause condition and the other conditions as symptomatic of the root-cause condition. Once you have created a rule, you can modify its criteria or the conditions it evaluates at any time.

4. Create the policy or policies that contain the correlation rules you want to associate with the domain. If, however, you want to use available policies only, skip this step.

Note: You can add rules to or remove rules from a policy at any time.

5. Apply one or more policies to the domain.

Note: Existing correlation domains will adjust to policy changes automatically, keeping the correct correlation state as much as possible.

At this point, the Condition Correlation process is in effect for the resources included in the domain. The domain is modeled as a correlation domain container in OneClick. The following image shows an example domain container and the resources included in it:


Chapter 2: Creating and Managing Conditions

You can use predefined conditions provided with Condition Correlation to set up correlation rules, or you can create your own. This chapter describes how to create new conditions, create new versions of other conditions, and edit condition settings.

Create a Condition

You can create a new condition using either of the following methods:

Create a new condition.

Create a modified version of an existing condition.

To create a condition

1. Click the Conditions tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-created conditions.

2. Do one of the following:

Click Create to create a new condition.

The Create Correlation Condition dialog opens but does not include default settings.

Select the condition from which you want to create a new condition, and then click Edit.

The Edit Correlation Condition dialog opens, displaying the property settings for the condition you selected.

3. Provide or edit values for the following condition properties:

Condition Name

Defines the condition name.

Set Event Code

Identifies the SPECTRUM event code associated with the condition.


Clear Event Code

(Optional) Identifies the SPECTRUM clear event code associated with the condition.

4. (Optional, for advanced correlation only) Specify one or more parameters that can be used to determine if a correlation should be made between instances of the new condition. A parameter can be any event variable data or any model attribute of the model associated with the condition. You can create new parameters, or you can create modified versions of existing parameters.

Parameter values are filled in at the time the condition is created, from the event which created it, or from the model the event was created on. These parameters are then available in the Advanced Rule Criteria section.

For count conditions, a count parameter is made available automatically once you choose that condition type in the rule.

Configure parameters using one of the following methods:

To create a new parameter, click Create in the Parameters section.

The Create Correlation Parameter dialog opens.

To modify an existing parameter or to create a new parameter from an existing parameter, select the parameter you want to modify, and then click Edit in the Parameters section.

The Edit Correlation Parameter dialog opens.

5. Provide or edit values for the following parameter properties:

Parameter Name

Defines the Parameter name. Provide a name that indicates the parameter type.

Parameter Type

Specifies the type of parameter. Choose one of the following options:

Model Attribute: Specifies a model attribute parameter type.

Var Bind: Specifies a varbind parameter type.

Predefined: Specifies either a Model, Model Type, or Device Model.

Parameter ID

Defines the ID number for the type of parameter indicated in the Parameter Type field.

If you chose Model Attribute, click Attribute to open the Attribute Selector dialog and then select the appropriate Model Attribute ID.

If you chose Var Bind, enter the varbind variable number associated with the trap for the model associated with the condition.


If you chose Predefined, select one of the following from the adjacent Parameter Type drop-down list:

Model: Enter the Model_Handle associated with the condition (Attribute ID 0x129fa).

Model Type: Enter the Model_Type_Handle of the model associated with the condition (Attribute ID 0x10001).

Device Model: Enter the Device_Mdl_Handle of the model associated with the condition (Attribute ID 0x10069).

Use as discriminator

Select the ‘Use as discriminator’ check box to designate this parameter as a discriminator. Designating a parameter as a discriminator lets clear events clear only those set events whose parameter values match the values in the clear event. You can designate multiple parameters as discriminators. When condition parameters are designated as discriminators, the condition maintains the parameter values from the time the set event produces the condition. A condition can only be cleared if the clear event contains parameter values that match those found in the set event.

Note: While you may want to use different discriminators for special situations, it is often best to use the same condition discriminators used by the associated alarm. If you use the same condition discriminators as the alarm, the conditions will match the alarms and will clear accordingly.

6. Click Create to add the new parameter, or click OK to save modifications to an existing parameter.

7. Click Create to save the new condition.

The Condition Correlation Editor saves the new condition to the Conditions tab list. The Author property identifies you as the condition’s author.
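The effect of the ‘Use as discriminator’ option described in step 5 can be illustrated with a short sketch. It assumes a DiskFull condition with a single discriminator parameter named "volume"; the class, the parameter names, and the values are hypothetical.

```python
class DiscriminatedCondition:
    """Hypothetical condition whose clear events must match discriminator values."""

    def __init__(self, name, discriminators):
        self.name = name
        self.discriminators = tuple(discriminators)
        self.instances = set()  # one key per outstanding set event

    def _key(self, params):
        # Only discriminator parameters take part in matching.
        return tuple(params[d] for d in self.discriminators)

    def on_set(self, params):
        self.instances.add(self._key(params))

    def on_clear(self, params):
        # Clears only the instance whose discriminator values match.
        self.instances.discard(self._key(params))


disk_full = DiscriminatedCondition("DiskFull", discriminators=["volume"])
disk_full.on_set({"volume": "/var", "percent_used": 97})
disk_full.on_set({"volume": "/home", "percent_used": 99})

# The clear event for /var leaves the /home instance in place.
disk_full.on_clear({"volume": "/var", "percent_used": 60})
remaining = disk_full.instances
```

Because percent_used is not a discriminator in this sketch, its value in the clear event does not need to match the value captured by the set event.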


Edit a Condition

You can edit properties for all CA-authored and user-authored conditions.

Important! Any changes you make to existing conditions will force Condition Correlation to drop all current conditions of the same type.

To edit a condition

1. Click the Conditions tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of conditions.

2. Select the condition you want to edit, and then click Edit.

The Edit Correlation Condition dialog opens.

3. Edit condition properties as described in Create a Condition on page 9, and then click OK.

Condition Correlation Editor saves your changes.

Delete a Condition

You can permanently delete user-authored conditions, but you cannot permanently delete CA-authored conditions. If you or another user has edited and assumed ownership of a CA-authored condition, you can delete it only temporarily because Condition Correlation Editor restores the original CA-authored condition with its default settings the next time the OneClick server restarts. You also cannot delete a condition which is still in use by a rule.

To delete a condition

1. Click the Conditions tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-authored conditions.

2. Select the user-authored condition you want to delete, and then click Delete.

Condition Correlation removes the condition from the Conditions tab list.


Chapter 3: Creating and Managing Rules

You can use predefined rules to set up Correlation policies, or you can create your own. This section describes how to create new rules, create new versions of existing rules, and edit rule settings.

Create a Rule

You can create a rule using either of the following methods:

Create a new rule.

Create a modified version of an existing rule.

To create a rule

1. Click the Rules tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-authored rules.

2. Choose one of the following options:

Click Create to create a new rule.

The Create Correlation Rule dialog opens.

Select the rule from which you want to create a new rule, and then click Edit.

The Edit Correlation Rule dialog opens, displaying the property settings for the rule you selected.

3. Enter a name for the rule in the Rule Name field.


4. Click ‘set’ in the Symptom Condition(s) list to specify whether you want the condition to belong to a correlation domain and then choose one of the following options:

Exists: The condition is in the correlation domain.

Not Exists: The condition is not in the correlation domain. This lets you create rules that can only be satisfied if the condition does not exist in the correlation domain.

Counts: The condition is in the correlation domain, and it allows totals/limits/range comparisons using the Advanced Rule Criteria section of the Create Correlation Rule dialog. This lets you create rules that are satisfied only if a certain number of a particular condition exists, reaches a limit, or is in a user-defined range.

When using a condition for counting, a new parameter is automatically created for that condition named “Condition Count.” This count can be used in the Advanced Rule Criteria section as shown in the following example:

TestCondition.Condition Count GREATER_THAN 10.

No other parameter may be used for counted conditions because multiple copies are usually present, and Condition Correlation would not know from which condition the parameter value should be taken.

5. Select one of the following values from the Relationship drop-down list to specify the relationship between symptomatic conditions and the root cause condition.

Caused By: The alarm associated with the root cause condition caused the associated symptomatic conditions. When the rule is met, OneClick suppresses the alarms for symptomatic conditions and lists them as symptoms under the Alarms view Impact tab in OneClick.

Implies: The symptomatic conditions suggest the existence of another condition that may be unknown to the management system. When the rule is satisfied, the set event of the implied condition is processed on the target model. Though this may raise an alarm on the target model, OneClick will not suppress the alarms for symptomatic conditions.

Implied Cause: This rule incorporates the logic of the Caused By and Implies rules. The symptomatic conditions are indicative of another condition. The set event of this implied condition is processed on the target model. If this event raises an alarm on the target model, OneClick will suppress the alarms associated with the symptomatic conditions and will list the suppressed alarms as symptoms of the root cause alarm under the Impact tab in OneClick.

Note: If you choose Implies or Implied Cause, the Root Cause Target selection box appears on the Correlation Rule dialog. It lets you specify whether the alarm should be generated on the correlation domain with which the rule is associated or on one of the symptomatic conditions.

To put the implied alarm (event) on a specific model, add the predefined “Model” parameter to the condition that you know will be created on the target model. Then select this condition and the “Model” parameter as the root cause target from the Root Cause Target section.


If you want to imply the condition (event/alarm) on a model where you do not currently have an alarm, for example, a container, you can include that model in the correlation domain as well, select the “Model Active” condition for that model, and add rule criteria to identify the correct Model Active condition. For example, for a port alarm, add the “Device Model” parameter to the port condition, and then add a criterion to the rule: “Model Active.Model EQUAL TO PortCondition.Device Model.” This helps ensure that the implied condition is created on the desired model. The “Model Active” condition is created once for each model participating in the correlation domain.

6. From the Root Cause Condition dialog, select the condition that caused, or was the implied cause of, the symptomatic conditions.

Note: You can select only one root cause condition for a rule.

7. (Optional) Click Show Advanced to open the Rule Criteria workspace.

You can use advanced rule criteria when you have specified condition parameters and you want to establish correlation criteria based on parameter values.

8. Click Create in the Create Correlation Rule dialog or click OK in the Edit Correlation Rule dialog to save the new rule.

The Condition Correlation Editor saves the new rule to the Rules tab list. The Author property identifies you as the rule’s author.
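The three Relationship values in step 5 differ in two respects: whether the set event of the root cause condition is processed on the target model, and whether the alarms for the symptomatic conditions are suppressed. The following Python sketch only restates those descriptions as a lookup. The Relationship enum, the rule_outcome function, and the dictionary keys are hypothetical names chosen for illustration and are not part of any SPECTRUM API.

```python
from enum import Enum


class Relationship(Enum):
    CAUSED_BY = "Caused By"
    IMPLIES = "Implies"
    IMPLIED_CAUSE = "Implied Cause"


def rule_outcome(relationship: Relationship,
                 implied_event_raises_alarm: bool = False) -> dict:
    """Summarize what happens when a rule with this relationship is satisfied."""
    # Implies and Implied Cause process the root cause condition's set event
    # on the target model; Caused By relies on an existing root cause alarm.
    processes_set_event = relationship in (Relationship.IMPLIES,
                                           Relationship.IMPLIED_CAUSE)
    if relationship is Relationship.CAUSED_BY:
        suppresses = True
    elif relationship is Relationship.IMPLIES:
        suppresses = False
    else:
        # Implied Cause suppresses symptoms only if the implied event raises an alarm.
        suppresses = implied_event_raises_alarm
    return {"process_set_event_on_target": processes_set_event,
            "suppress_symptom_alarms": suppresses}
```

For example, rule_outcome(Relationship.IMPLIES) reports that the set event is processed on the target model while the symptom alarms stay visible.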

Edit a Rule

You can edit properties for all CA-authored and user-authored rules.

To edit a rule

1. Click the Rules tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-authored rules.

2. Select the rule you want to edit, and then click Edit.

The Edit Correlation Rule dialog opens.

Note: You cannot edit a rule name when the rule is specified in a policy.

3. Edit the rule’s properties, as described in Create a Rule on page 13, and then click OK.

Condition Correlation Editor saves your changes with your user name identified in the Author field.


Delete a Rule

You can permanently delete user-authored rules, but you cannot permanently delete CA-authored rules. If you or another user has edited and assumed ownership of a CA-authored rule, you can delete it only temporarily because Condition Correlation Editor restores the original CA-authored rule with its default settings the next time the OneClick server restarts.

To delete a rule

1. Click the Rules tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-authored rules.

2. Select the user-authored rule you want to delete, and then click Delete.

Condition Correlation removes the rule from the Rules tab list.


Chapter 4: Creating and Managing Policies

You can save one or more correlation rules to a correlation policy, which you can then apply to any correlation domain. You can use predefined policies, or you can create your own. This section describes how to create new policies, how to create new versions of existing policies, and how to edit policy settings.

Create a Policy

You can create a new policy using either of the following methods:

Create a new policy.

Create a modified version of an existing policy.

To create a policy

1. Click the Policies tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of CA-authored and user-authored policies.

2. Choose one of the following options:

Click Create to create a new policy.

The Create Correlation Policy dialog opens.

Select the policy from which you want to create a new policy, and then click Edit.

The Edit Correlation Policy dialog opens, displaying the property settings for the policy you selected.


3. Provide values for or edit the following policy properties:

Policy Name

Defines the policy name.

Policy Rules

Includes the rules for the policy. Use the arrow buttons to add rules from the Available Rules box to the Policy Rules box and to remove rules from the Policy Rules box.

4. Click Create to save the new policy.

The Condition Correlation Editor saves the new policy to the Policies tab list. The Author property identifies you as the policy’s author.

Edit a Policy

You can edit properties for all CA-authored and user-authored policies.

To edit a policy

1. Click the Policies tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of policies.

2. Select the policy you want to modify, and then click Edit.

The Edit Correlation Policy dialog opens.

Note: You cannot edit a policy name if the policy is applied to a correlation domain.

3. Edit policy properties as described in To create a policy on page 17, and then click OK.

Condition Correlation Editor saves your changes with your user name identified in the Author field.


Delete a Policy

You can permanently delete user-authored policies, but you cannot permanently delete CA-authored policies. If you or another user has edited and assumed ownership of a CA-authored policy, you can delete it only temporarily because Condition Correlation Editor restores the original CA-authored policy with its default settings at the next OneClick server restart.

To delete a policy

1. Click the Policies tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of policies.

2. Select the user-authored policy you want to delete, and then click Delete.

3. Click Yes in the confirmation dialog that appears.

Condition Correlation Editor removes the policy from the Policies tab list.


Chapter 5: Creating and Managing Domains

You can create correlation domains containing as many different correlation policies for as many different types of managed resources as you require to manage alarm events. This section describes how to create new domains and edit domain settings.

Important! The volume of correlation processing required for very large domains might degrade SPECTRUM performance.

Create a Domain

You can create a new domain using either of the following methods in the Condition Correlation Editor:

Create a new domain.

Create a modified version of an existing domain.

You can also create a new domain from the context of a device, service, or Global Collection model you want to add to the domain using the OneClick ‘Add To’ feature.

Note: If you plan to add resources to the domain from multiple landscapes, you must create the domain on the Main Location Server.


Create a Domain in the Condition Correlation Editor

You can create a new domain or create a new version of an existing domain in the Condition Correlation Editor.

To create a domain in the Condition Correlation Editor

1. Click the Domains tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of any domains that users have created. Condition Correlation does not include default domains.

2. Choose one of the following options:

Click Create to create a new domain.

The Create Correlation Domain dialog opens.

Select the domain from which you want to create a new domain, and then click Edit. If no domains are listed, none have been created yet and you must create a domain.

The Edit Correlation Domain dialog opens.

3. Provide or edit values for the following domain properties:

Domain Name

Defines the domain name.

Landscape

Defines the landscape for the domain.

4. Add one or more policies from the Available Policies box to the Domain Policies box. If you are creating another version of an existing domain, remove policies as required from the Domain Policies box.

5. Add or remove resources from the domain. If you are creating a new domain, add resources to it. If you are creating a new domain from an existing one, you can add and remove resources as required.

Note: Keep in mind that if you want to add a device model and a number of port models for the device, you must add each particular model to the domain. Adding the device model does not add its component models to the domain.

To add resources:

a. Click the Resources tab, and then click Add.

The Locate Resources dialog opens.

b. Search for the resources you want to add to the domain in the ‘Search using’ column.

c. Select the resources you want to add to the domain from the returned search list, click ‘Add Selected to Correlation Domain,’ and then click Close.

The resources you added appear under the Resources tab in the Create Correlation Domain or Edit Correlation Domain dialog.


To remove resources from a domain from which you are creating a new domain:

a. Select the resources you want to remove from the Resources tab list in the Edit Correlation Domain dialog.

b. Click Remove.

The resources are removed from the Resources tab list.

6. Click Create in the Create Correlation Domain dialog or the Edit Correlation Domain dialog.

The Condition Correlation Editor saves the new domain to the Domains tab list.

Create a Domain in the OneClick Console

You can use the OneClick ‘Add To’ feature to create a new domain from the context of a device, service, or Global Collection model.

To create a domain from a device, service, or Global Collection model

1. Select the model in OneClick that you want to use to create a domain.

2. Right-click the model, and select Add To, Correlation Domain.

The ‘Add to Correlation Domain’ dialog opens.

3. Do one of the following:

To create a new domain, enter a name for the domain and specify the landscape where you want to create the new domain in the ‘Create a new correlation domain’ section.

To include the device, service, or Global Collection model in an existing domain, choose the existing domain from the list in the ‘Select an existing correlation domain’ section.

4. Click OK.

If you created a new domain, see Create a Domain in the Condition Correlation Editor on page 22 for information about how to add policies to the domain.


Edit a Domain

You can edit all domain properties.

To edit a domain

1. Click the Domains tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of domains.

2. Select the domain you want to edit, and then click Edit.

The Edit Correlation Domain dialog opens.

3. Edit Domain properties as described in To create a domain in the Condition Correlation Editor on page 22, and then click OK.

Condition Correlation Editor saves your changes.

Delete a Domain

You can delete any domain created by any user.

To delete a domain

1. Click the Domains tab in the Condition Correlation Editor window.

The Condition Correlation Editor window displays a list of domains.

2. Select the domain you want to delete, and then click Delete.

3. Click Yes in the confirmation dialog that appears.

Condition Correlation Editor removes the domain from the Domains tab list.


Appendix A: Condition Correlation Examples

This appendix provides the following examples to help you see how you can implement Condition Correlation in your own network environment:

Power Outage Example

WAN Link Failure Example

DiskFull Example

Note: All the fictitious instances of alarms and Condition Correlation components referenced in the following examples are enclosed in double quotation marks (“ ”). References to actual events and alarms defined in SPECTRUM are not enclosed in quotation marks.

Power Outage Example

This example describes one way that Condition Correlation can be used to pinpoint the root-cause alarm and symptomatic alarms among a barrage of alarms SPECTRUM generates for different resources as a result of an infrastructure power outage.

Scenario

When a power outage occurs, managed UPS systems generate traps indicating that they have switched to backup battery power. When backup battery power wanes, the systems generate traps indicating low battery power. When the batteries die, managed devices connected to the UPS systems go down. This precipitates a flood of events and alarms from the affected area, making it difficult for network management personnel to pinpoint and address the root-cause problem.


Correlation Strategy

Determining the root cause of the problem lets troubleshooters restore the infrastructure quickly. This can be accomplished by setting up a correlation system that produces a single alarm for the power outage with an impact that includes all the lost contact devices and UPS alarms listed as suppressed alarm symptoms.

A correlation policy that stipulates the following can be applied to the domain that includes the resources that would be compromised by a power outage:

When all UPS systems within a specific area, or domain, fail over to battery backup power, a widespread, critical power outage in that area is likely.

When this condition exists, a power outage is assumed to be the root cause of all low battery alarms from the UPS systems and all lost contact alarms and other associated alarms for the devices connected to the power systems.

Configurations

The correlation system can be configured as follows:

In SPECTRUM, a new “Power Outage” alarm is created. A set event and a clear event for the new alarm are required.

Note: For more information about creating alarms and editing event configuration files, see the Event Configuration User Guide (5188).

Two conditions are created:

– A “Power_Outage” condition that uses the set event code and the clear event code associated with the “Power Outage” alarm.

– A “Battery_On” condition for a UPS on battery power using the set event code and the clear event code associated with the UPS trap.

Three rules are created:

– A “Battery On -> Power Outage” rule that stipulates when, for example, five or more power systems go on battery power, the “Power Outage” condition is the implied cause.

• Symptom Condition(s): Battery On, Set: Counts

• Root Cause Condition: Power Outage

• Relationship: Implied Cause

• Root Cause Target: Correlation Domain

• Advanced Rule Criteria: Configured as shown in the following illustration:


[Figure: Configuring the Example Battery On -> Power Outage Rule]

– A “ContactLost_Red -> Power Outage” rule which stipulates that if the ContactLost_Red condition (CA-authored) is caused by the “Power Outage” condition, the critical (red) Contact Lost alarm is suppressed as a symptom of the “Power Outage” alarm.

– A “ContactLost_Gray -> Power Outage” rule which stipulates that if the ContactLost_Gray condition (CA-authored) is caused by the “Power Outage” condition, the suppressed (gray) Contact Lost alarm is suppressed as a symptom of the “Power Outage” alarm.

A “Power_Outage” policy is created that includes the “Battery On -> Power Outage,” “ContactLost_Red -> Power Outage,” and “ContactLost_Gray -> Power Outage” rules.

A “Backup_Power” domain is created that includes the UPS models and the device models that connect to the power supplies, and the “Power_Outage” policy is applied to the domain.
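The combined effect of the three rules in this configuration can be sketched as a small Python function. This is an illustration only, under stated assumptions: the function name, the argument lists of alarm labels, and the returned dictionary are hypothetical, and the threshold of five comes from the "for example, five or more" wording of the first rule.

```python
def correlate_power_outage(battery_on, contact_lost, threshold=5):
    """Apply the example "Power_Outage" policy to one snapshot of the domain.

    battery_on and contact_lost are lists of hypothetical labels, one per
    active Battery On condition or Contact Lost alarm in the domain.
    """
    # "Battery On -> Power Outage" (Counts + Implied Cause): satisfied once the
    # number of Battery On conditions reaches the threshold.
    if len(battery_on) < threshold:
        return {"root_cause": None,
                "symptoms": [],
                "visible": battery_on + contact_lost}
    # The implied "Power Outage" alarm hides the Battery On symptoms, and the two
    # "ContactLost_* -> Power Outage" (Caused By) rules hide the Contact Lost alarms.
    return {"root_cause": "Power Outage",
            "symptoms": battery_on + contact_lost,
            "visible": ["Power Outage"]}


below = correlate_power_outage(["ups1", "ups2", "ups3", "ups4"], ["router1"])
outage = correlate_power_outage(["ups1", "ups2", "ups3", "ups4", "ups5"],
                                ["router1", "switch1"])
```

With four UPS systems on battery, every alarm stays visible. With five, only the single "Power Outage" alarm is visible and the other seven entries are listed as its symptoms.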


WAN Link Failure Example

This example describes one way that Condition Correlation can be used to pinpoint the root-cause alarm and symptomatic alarms among a barrage of alarms generated for different resources as a result of a WAN link failure.

Scenario

In many WANs, primary connections have a backup. The backup connection typically provides less bandwidth than the primary connection. In this example, a 384K Frame Relay link is backed by a 128K ISDN link. Also, a SPECTRUM Service Performance Manager (SPM) test is measuring latency across the WAN link.

When the Frame Relay link goes down, the ISDN link takes over and the SPM test may exceed the latency threshold due to the reduced bandwidth. SPECTRUM generates two alarms (Critical Alarm - Frame Relay Link Down occurs on the Frame Relay link model; Minor Alarm - SPM Test Exceeded Threshold occurs on the SPM Test model) and one event (ISDN Backup Active occurs on the device).

Correlation Strategy

It may not be apparent to network management personnel that these three conditions are related, and they are likely to focus their effort on the critical alarm even though there may be other alarms in the infrastructure that are more important.

A correlation policy that stipulates the following can be applied to the domain including the resources that would be compromised by a primary WAN link failure:

The two alarms and the ISDN event can be correlated to generate a new primary link down, reduced bandwidth condition that produces a major alarm because the WAN is still working, but with decreased performance.

The failed Frame Relay link can be correlated with the backup’s Dialup Link Active event and imply that the primary WAN link is down with reduced bandwidth if the backup link bandwidth is less than the primary link bandwidth.

The SPM test can be correlated with the primary WAN link down, reduced bandwidth condition with a rule that stipulates SPM Test Threshold Exceeded is caused by primary WAN link down, reduced bandwidth condition.

This correlation system will produce the following alarm and event information: the single major alarm for the WAN link being down. There will also be an SPM test threshold exceeded alarm and the ISDN backup active event, but these will be hidden under the single major alarm. This lets troubleshooters focus their efforts on the most important alarms. A second rule could be created to produce a minor alarm if the active backup link has the same bandwidth as the failed primary link.


Configurations

The correlation system can be configured as follows:

In SPECTRUM, a new “Primary WAN Link Down, Reduced Bandwidth” alarm is created. A set event and a clear event for the new alarm are required.

Note: For more information about creating alarms and editing event configuration files, see the Event Configuration User Guide (5188).

Two conditions are created:

– A “Primary_WAN_Link_Down_Reduced_Bandwidth” condition using the set and clear event codes from the “Primary WAN Link Down, Reduced Bandwidth” alarm.

– A “Dialup_Link_Active” condition using set event 0x022ffff6, “Dialup link has been activated,” and clear event 0x022ffffc, “Dialup link is inactive.” This condition is not linked to a SPECTRUM alarm. However, it implies that the backup, or secondary, link is up and running.

Two rules are created:

– A “PrimaryFrameRelay_Red -> LinkDown” rule that states that if the condition for the Frame Relay Link Down alarm and the “Dialup_Link_Active” condition both occur, then the implied cause is the “Primary_WAN_Link_Down_Reduced_Bandwidth” condition, and the critical (red) Frame Relay Link Down alarm is suppressed by the “Primary WAN Link Down, Reduced Bandwidth” alarm (orange).

– An “SPMLatencyThreshold_yellow -> Violated” rule that states that the SPM latency threshold violation is caused by the “Primary WAN Link Down, Reduced Bandwidth” condition, and yellow SPM latency threshold violation alarms are suppressed.

A “WAN_Link_Failure” policy is created that includes the “PrimaryFrameRelay_Red -> LinkDown” and “SPMLatencyThreshold_yellow -> Violated” rules.

A “WAN_Primary_Backup_Links” domain is created that includes the primary WAN link interfaces, the backup link, and any SPM tests that could be impacted by the backup’s lower bandwidth. The “WAN_Link_Failure” policy is applied to the domain.
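The decision logic behind this configuration can be sketched in Python. The sketch is illustrative and makes several assumptions: the function and argument names are hypothetical, the bandwidth comparison is modeled as a plain numeric check rather than as rule criteria on condition parameters, and only the reduced-bandwidth case is covered (the optional second rule for equal bandwidth is not modeled).

```python
IMPLIED_CONDITION = "Primary_WAN_Link_Down_Reduced_Bandwidth"


def correlate_wan(frame_relay_down, dialup_active, primary_kbps, backup_kbps,
                  spm_threshold_exceeded):
    """Return the implied condition, if any, and the alarms it would suppress."""
    result = {"implied_condition": None, "suppressed_alarms": []}
    # First rule: primary link down, backup active, and backup slower than primary.
    if frame_relay_down and dialup_active and backup_kbps < primary_kbps:
        result["implied_condition"] = IMPLIED_CONDITION
        result["suppressed_alarms"].append("Frame Relay Link Down")
        # Second rule: the SPM latency alarm is caused by the implied condition.
        if spm_threshold_exceeded:
            result["suppressed_alarms"].append("SPM Test Exceeded Threshold")
    return result


# Scenario values from this example: 384K Frame Relay backed by 128K ISDN.
scenario = correlate_wan(True, True, 384, 128, True)
```

In the scenario call, both the Frame Relay Link Down alarm and the SPM alarm end up under the single implied condition. If the backup link had the same bandwidth as the primary link, the sketch would imply nothing.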


DiskFull Example

The following more complex example is meant to highlight the things you can do with Condition Correlation in your own network environment. Because it tries to encompass various aspects of condition correlation, it is not necessarily a reflection of the typical way you would use Condition Correlation. Rather, it is intended for illustrative purposes.

Scenario

You have a disk monitor alarm that appears on many models, sometimes even multiple times on the same model. However, you are interested only in the total number of these alarms rather than in every instance of each alarm. For instance, you feel that fewer than 5 of these disk monitor alarms is acceptable, but once there are at least 5 of them you want to see a minor alarm. Similarly, if there are at least 10 of them you want to see a major alarm. And if there are at least 15 of them, you feel that would warrant a critical alarm.

EventDisp Entries

This scenario uses the following EventDisp entries for setting up Condition Correlation. These alarms will use variable binding 4 as a discriminator so that multiple alarms can exist on the same device.

# test alarm (disk full)

0xffff0000 E 50 A 1,0xffff0000,4

0xffff0001 E 50 C 0xffff0000,4

# 5 to 9 test alarms, minor problem with disks

0xffff0010 E 50 A 1,0xffff0010

0xffff0011 E 50 C 0xffff0010

# 10 to 14 test alarms, major problem with disks

0xffff0020 E 50 A 2,0xffff0020

0xffff0021 E 50 C 0xffff0020

# 15 or more test alarms, critical problem with disks

0xffff0030 E 50 A 3,0xffff0030

0xffff0031 E 50 C 0xffff0030

Note: For information about creating event format and alarm probable cause files, see Event Configuration User Guide (5188).
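To make the structure of these entries easier to see, the following Python sketch splits one entry into named fields. The field interpretation is an assumption inferred from this example, not a full EventDisp grammar: "A" is read as raising an alarm with a severity (1, 2, and 3 matching the minor, major, and critical comments), "C" as clearing an alarm, the hexadecimal value after them as the alarm code, and a trailing number as the discriminating variable binding. The "E 50" fields are read but not interpreted, and parse_entry is a hypothetical helper.

```python
def parse_entry(line: str) -> dict:
    """Parse one entry shaped like the examples above; other forms are not handled."""
    event_code, _e_flag, _e_value, action, args = line.split()
    fields = args.split(",")
    entry = {"event": int(event_code, 16),
             "action": "set" if action == "A" else "clear"}
    if action == "A":
        # Assumed: the first field after A is the alarm severity.
        entry["alarm_severity"] = int(fields.pop(0))
    # Assumed: the next field is the alarm code shared by the set and clear entries.
    entry["alarm_code"] = int(fields.pop(0), 16)
    # Assumed: an optional last field names the discriminating variable binding.
    entry["discriminator"] = int(fields[0]) if fields else None
    return entry


disk_full_set = parse_entry("0xffff0000 E 50 A 1,0xffff0000,4")
disk_full_clear = parse_entry("0xffff0001 E 50 C 0xffff0000,4")
```

Read this way, the set and clear entries for the disk full alarm share alarm code 0xffff0000 and both discriminate on variable binding 4, while the three summary alarms carry no discriminator.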


How to Set Up the Sample Disk Full Condition Correlation

As with any Condition Correlation setup, complete the following tasks in order:

Create Disk Problem Conditions

Create Disk Problem Rules

Create the DiskPolicy Policy

Create the DiskMonitorDomain Correlation Domain

Create Disk Problem Conditions

Create the following disk conditions:

Disk Full Alarm (including Disk parameter):

– Condition Name: DiskFull

– Set Event Code: 0xffff0000

– Clear Event Code: 0xffff0001

Because there could be multiple occurrences of these on a model, you must also set up a parameter. Since the alarms are discriminated on variable binding 4, it is best to use 4 for this parameter as well.

– Parameter Name: Disk

– Parameter Type: Var Bind

– Parameter ID: 4

– Use as discriminator: Yes

Minor Disk Problem:

– Condition Name: MinorDiskProblem

– Set Event Code: 0xffff0010

– Clear Event Code: 0xffff0011

Major Disk Problem:

– Condition Name: MajorDiskProblem

– Set Event Code: 0xffff0020

– Clear Event Code: 0xffff0021

Critical Disk Problem:

– Condition Name: CriticalDiskProblem

– Set Event Code: 0xffff0030

– Clear Event Code: 0xffff0031


Create Disk Problem Rules

For this scenario, you must create the following rules, as described in this section:

Minor Disk Problem Rule

Major Disk Problem Rule

Critical Disk Problem Rule

Minor Disk Problem Rule

To create a rule for the Minor Disk Problem

1. Create a rule using the following properties:

Name: MinorDiskProblemRule

Symptom Condition(s):

– Name: DiskFull

– Type: Counts

Relationship: Implied Cause

Note: This causes the “MinorDiskProblem” alarm to be generated when the rule criteria are satisfied, and causes the rule to hide the DiskFull alarms.

Root Cause Condition: “MinorDiskProblem”

Root Cause Target: Select the ‘Correlation Domain’ option.

2. Click Show Advanced to open the Rule Criteria panel.

3. Create the following rule criteria:

‘DiskFull.count GREATER THAN OR EQUAL TO 5’:

– Condition: DiskFull

– Parameter: Condition Count

– Operator: GREATER THAN OR EQUAL TO

– By Value: Yes

– Value: 5

– Type: Integer

– Click Insert Criterion.


‘DiskFull.count LESS THAN 10’:

– Condition: DiskFull

– Parameter: Condition Count

– Operator: LESS THAN

– By Value: Yes

– Value: 10

– Type: Integer

– Click Insert Criterion.

4. Click Create.

The new rule is added to the Condition Correlation Editor Rules tab.

Major Disk Problem Rule

To create a rule for the Major Disk Problem

1. Create a rule using the following properties:

Name: MajorDiskProblemRule

Symptom Condition(s):

– Name: DiskFull

– Type: Counts

Relationship: Implied Cause

Root Cause Condition: “MajorDiskProblem”

Root Cause Target: Select the ‘Correlation Domain’ option.

2. Click Show Advanced to open the Rule Criteria panel.

3. Create the following rule criteria:

‘DiskFull.count GREATER THAN OR EQUAL TO 10’:

– Condition: DiskFull

– Parameter: Condition Count

– Operator: GREATER THAN OR EQUAL TO

– By Value: Yes

– Value: 10

– Type: Integer

– Click Insert Criterion.


‘DiskFull.count LESS THAN 15’:

– Condition: DiskFull

– Parameter: Condition Count

– Operator: LESS THAN

– By Value: Yes

– Value: 15

– Type: Integer

– Click Insert Criterion.

4. Click Create.

The new rule is added to the Condition Correlation Editor Rules tab.

Critical Disk Problem Rule

To create a rule for the Critical Disk Problem

1. Create a rule using the following properties:

Name: CriticalDiskProblemRule

Symptom Condition(s):

– Name: DiskFull

– Type: Counts

Relationship: Implied Cause

Root Cause Condition: “CriticalDiskProblem”

Root Cause Target: Select the ‘Correlation Domain’ option.

2. Click Show Advanced to open the Rule Criteria panel.

3. Create the following rule criteria:

‘DiskFull.count GREATER THAN OR EQUAL TO 15’:

– Condition: DiskFull

– Parameter: Condition Count

– Operator: GREATER THAN OR EQUAL TO

– By Value: Yes

– Value: 15

– Type: Integer

– Click Insert Criterion.

4. Click Create.

The new rule is added to the list of rules in the Condition Correlation Editor.
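
Taken together, the count criteria of the three rules split the number of DiskFull alarms into adjacent ranges. The following Python sketch only illustrates that arithmetic. The names RULE_RANGES and matching_rules are hypothetical and are not part of SPECTRUM.

```python
# Illustrative only: models the count criteria of the three example rules.
# An upper bound of None means the rule has no LESS THAN criterion.
RULE_RANGES = {
    "MinorDiskProblemRule": (5, 10),        # count >= 5 and count < 10
    "MajorDiskProblemRule": (10, 15),       # count >= 10 and count < 15
    "CriticalDiskProblemRule": (15, None),  # count >= 15
}


def matching_rules(count):
    """Return the names of the rules whose criteria hold for this count."""
    matches = []
    for name, (low, high) in RULE_RANGES.items():
        if count >= low and (high is None or count < high):
            matches.append(name)
    return matches


for count in (4, 5, 9, 10, 14, 15):
    print(count, matching_rules(count))
```

Because the ranges do not overlap, at most one of the three rules can match for any given count.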

Create the DiskPolicy Policy

You must now create a disk policy.

To create the DiskPolicy policy

1. Create a new policy using the name DiskPolicy.

2. Add the following rules to the Policy Rules list:

MinorDiskProblemRule

MajorDiskProblemRule

CriticalDiskProblemRule

3. Click Create.

The new policy appears in the list of policies in the Condition Correlation Editor.

Create the DiskMonitorDomain Correlation Domain

Finally, you must create a new correlation domain to accommodate these components.

To create the DiskMonitorDomain

1. Click the Domain tab in the Condition Correlation Editor.

2. Click Create, and then type DiskMonitorDomain in the Domain Name text box.

3. Click the Policies tab, select DiskPolicy from the Available Policies list and move it into the Domain Policies list.

4. Click the Resources tab and then select any number of host devices as resources.

5. Click Create.

The DiskMonitorDomain is added to the list of domains in the Condition Correlation Editor.
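
The relationship between the components created so far (rules grouped into a policy, and the policy plus a set of resources assigned to a domain) can be pictured as a small data structure. This is a rough sketch, not how SPECTRUM stores its configuration. The class names and the host names host-a and host-b are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    name: str
    rules: list = field(default_factory=list)


@dataclass
class Domain:
    name: str
    policies: list = field(default_factory=list)
    resources: list = field(default_factory=list)

    def active_rules(self):
        """Collect the rules of every policy assigned to this domain."""
        return [rule for policy in self.policies for rule in policy.rules]


disk_policy = Policy(
    "DiskPolicy",
    ["MinorDiskProblemRule", "MajorDiskProblemRule", "CriticalDiskProblemRule"],
)
domain = Domain(
    "DiskMonitorDomain",
    policies=[disk_policy],
    resources=["host-a", "host-b"],  # hypothetical host device models
)
print(domain.active_rules())
```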

How Will This Condition Correlation Work?

This section describes how this Disk Full condition correlation will work:

When DiskFull events are generated on the host devices with different values for the disk (variable binding 4), the resulting alarms are simply displayed on-screen as long as there are no more than four of them. Once a fifth such alarm is generated on a model in the correlation domain, the first rule, “MinorDiskProblemRule,” is instantiated. It creates the “MinorDiskProblem” alarm on the correlation domain and hides the five DiskFull alarms as symptoms under that alarm.

If one or more of the five DiskFull alarms clear, the “MinorDiskProblem” alarm clears as well, and the remaining four or fewer DiskFull alarms that were hidden are displayed again. If, on the other hand, more DiskFull alarms are generated, the “MajorDiskProblem” alarm is generated once their number reaches 10. The minor alarm, which covers 5 to 9 alarms, disappears, and all DiskFull alarms become symptoms of the major alarm.

If the number of DiskFull alarms falls below 10, you again see the “MinorDiskProblem” alarm. Similarly, if the number goes above 14, you see the “CriticalDiskProblem” alarm. Each time, the other disk problem alarms are cleared and all existing DiskFull alarms become symptoms of the current disk problem alarm.
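
The behavior described above can be traced with a short simulation. This Python sketch is an approximation of what an operator would see, based only on the thresholds in this example. The function name operator_view and the dictionary keys are hypothetical.

```python
# Thresholds from the three example rules, highest first.
THRESHOLDS = [
    (15, "CriticalDiskProblem"),
    (10, "MajorDiskProblem"),
    (5, "MinorDiskProblem"),
]


def operator_view(disk_full_count):
    """Approximate what is displayed for a given number of DiskFull alarms."""
    for minimum, alarm in THRESHOLDS:
        if disk_full_count >= minimum:
            # All DiskFull alarms are hidden as symptoms of the problem alarm.
            return {"root_cause": alarm,
                    "symptoms": disk_full_count,
                    "visible_disk_full": 0}
    # Below the first threshold the DiskFull alarms are shown individually.
    return {"root_cause": None,
            "symptoms": 0,
            "visible_disk_full": disk_full_count}


count = 0
for change in (+4, +1, +5, +5, -6, -5):  # alarms raised (+) or cleared (-)
    count += change
    print(count, operator_view(count))
```

The trace passes through 4, 5, 10, 15, 9, and 4 alarms, so each of the three problem alarms appears once on the way up and the minor alarm reappears on the way down.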

Create a Clear Events Correlation

This section describes some further functionality that you can implement on this sample Disk Full condition correlation.

Suppose that, as an operator, you see one of the disk problem alarms (minor, major, or critical) and you know that the situation has already been handled (for example, space has been regained), so you want to remove these alarms. You can clear the disk problem alarm itself from OneClick. However, when you clear the disk problem alarm, all previously hidden disk full alarms reappear because the alarm they were correlated to has been destroyed. This section describes how you can clear all of these alarms.

The problem is that multiple alarms on multiple models may exist, and the only thing you have is the clear event for one of the disk problem events (minor, major, or critical). You must create the clearing events on the correct model. To do so, perform the following tasks:

Create an Additional Parameter for the DiskFull Condition

Create an Event Rule That Indicates When a Disk Problem Alarm Was Cleared by a User

(Optional) Log the Event

Add an Event to Clear the Disk Full Alarms

Create the Conditions Required for the Clear Correlation

Create a Rule to Clear DiskFull Alarms

Create an Additional Parameter for the DiskFull Condition

First, you need one additional parameter for the DiskFull condition.

To add a parameter to the DiskFull condition

1. On the Conditions tab of the Condition Correlation Editor, select the DiskFull condition and then click Edit.

The Edit Correlation Condition dialog appears.

2. Click Create in the Parameters section.

The Create Correlation Parameter dialog appears. You need to add the model where the condition (alarm) exists as shown in the following step.

3. From the Parameter Type field, select Predefined.

The Parameter ID field shows the applicable model handle attribute: 0x129fa.

4. Click Create to add this parameter to the condition.

This condition parameter can now be used in the clearing rule you will create to assert the clear event on the correct model.

Create an Event Rule That Indicates When a Disk Problem Alarm Was Cleared by a User

Next, you need to create an event which tells you when a disk problem alarm has been cleared by the user. This lets you distinguish between instances when the correlation cleared the alarm (which it will do automatically when the number of alarms reaches any of the thresholds) and when a user cleared it from the UI, which indicates that the user knew about the problem and decided it had been resolved.

Since you are going to clear them from the UI, no direct event code (for example, 0xffff0021) is used to clear the alarm. Instead, you have to use one of the alarm status events, in this case ‘0x10706: user has cleared an alarm.’ Inside this event is the Probable Cause Code of the cleared alarm, in varbind 3. You can use this to generate a new event, which can then, as a condition, be used to start the clear correlation.

You need to create an event rule that generates a “disk problem alarm has been user-cleared” event. The 0x10706 event is already mapped (by default it is only logged) in the following file:

<$SPECROOT>/SS/CsVendor/Cabletron/EventDisp

Add a new event action to it as follows:

0x00010706 E 50 R CA.EventCondition, \

" { v 3 } == { H 0xffff0010 } ", 0xffff0100, \

" { v 3 } == { H 0xffff0020 } ", 0xffff0100, \

" { v 3 } == { H 0xffff0030 } ", 0xffff0100

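The logic expressed by this event rule can be restated in Python. This is a hedged sketch of the intent only: it assumes that the three codes compared against varbind 3 are the probable cause codes of the disk problem alarms, and it does not reproduce how SPECTRUM evaluates EventDisp entries. The function name events_to_generate is hypothetical.

```python
USER_CLEARED_ALARM = 0x00010706         # "user has cleared an alarm"
DISK_PROBLEM_USER_CLEARED = 0xFFFF0100  # new event raised by the rule

# Codes compared against varbind 3 in the event rule (assumed to be the
# probable cause codes of the three disk problem alarms).
DISK_PROBLEM_CAUSES = {0xFFFF0010, 0xFFFF0020, 0xFFFF0030}


def events_to_generate(event_code, varbinds):
    """Return the event codes the rule would raise for one incoming event."""
    if event_code != USER_CLEARED_ALARM:
        return []
    if varbinds.get(3) in DISK_PROBLEM_CAUSES:
        return [DISK_PROBLEM_USER_CLEARED]
    return []


print(events_to_generate(0x10706, {3: 0xFFFF0020}))  # a disk problem alarm
print(events_to_generate(0x10706, {3: 0x00012345}))  # some unrelated alarm
```
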
(Optional) Log the Event

Optionally, you can log that event in the custom EventDisp file using the following syntax:

0xffff0100 E 50

Add an Event to Clear the Disk Full Alarms

Next, you must add an event to clear the disk full alarms, regardless of their discriminator value, using the clear all (‘A’) alarm clear flag:

0xffff0002 E 50 C 0xffff0000, A

This lets you clear all disk full alarms on a model, even if you do not know the values for their discriminator attributes.
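
The difference between clearing by discriminator and clearing with the clear all flag can be illustrated as follows. This Python sketch is a simplified model of the idea rather than SPECTRUM's alarm handling. The function clear_alarms, the dictionary layout, and the disk names /var and /home are hypothetical.

```python
def clear_alarms(alarms, cause_code, discriminator=None, clear_all=False):
    """Return the alarms that remain on a model after a clear is processed."""
    remaining = []
    for alarm in alarms:
        same_cause = alarm["cause"] == cause_code
        same_disc = alarm["discriminator"] == discriminator
        if same_cause and (clear_all or same_disc):
            continue  # this alarm is cleared
        remaining.append(alarm)
    return remaining


# Alarms on one host model; the discriminator stands for the disk name.
alarms = [
    {"cause": 0xFFFF0000, "discriminator": "/var"},
    {"cause": 0xFFFF0000, "discriminator": "/home"},
    {"cause": 0xFFFF0010, "discriminator": None},
]

# Without the flag, only the alarm with a matching discriminator is cleared.
print(len(clear_alarms(alarms, 0xFFFF0000, discriminator="/var")))
# With the flag, every alarm with that cause code is cleared.
print(len(clear_alarms(alarms, 0xFFFF0000, clear_all=True)))
```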

Create the Conditions Required for the Clear Correlation

Reload the EventDisp files. You can then set up the clear correlation as described in this section.

To create the conditions required for the clear correlation

1. Create the following condition which will be used to start the clear correlation:

DiskProblemAlarmUserCleared:

– Condition Name: DiskProblemAlarmUserCleared

– Set Event Code: 0xffff0100

– Clear Event Code: 0xffff0100

Note: This condition is needed only to start the clear correlation and must be cleared as soon as the clear correlation is completed. Using the same event code for the clear event as for the set event achieves this. Hence, this is a temporary condition; it exists briefly, correlates, and then goes out of existence again.

2. Create the following condition which will be used to actually clear the disk full alarms:

DiskFullAlarmClear:

– Condition Name: DiskFullAlarmClear

– Set Event Code: 0xffff0002

(This Set Event Code indicates that this event will be generated when the condition is generated by an implied rule.)

– Clear Event Code: 0xffff0002

(This Clear Event Code indicates that the condition will self-clear, as does the condition in Step 1.)

Create a Rule to Clear DiskFull Alarms

Now you must create a rule which will clear all disk full alarms when a user clears just one of the disk problem alarms.

To create a rule to clear all DiskFull alarms

1. Create the following rule:

Name: DiskFullUserClearRule

Symptom Condition(s):

– DiskProblemAlarmUserCleared

– DiskFull

Type: Exists

Relationship: Implies

Root Cause Condition: DiskFullAlarmClear

Root Cause Target: DiskFull.Model

Note: This helps ensure that the clear event is generated on each model where a disk full alarm exists, so that we can clear that alarm.

2. Save the rule.

3. Add the new rule to the “DiskPolicy” policy.

This rule will now trigger when any one of the three disk problem alarms is cleared by the user, generating event 0xffff0100.

Summary

The setup is now complete. Any time the user clears any of the three disk problem alarms (minor, major, or critical), all individual disk full alarms are cleared as well, because the correlation evaluates every matching set of conditions for the DiskFullUserClearRule. Once the user clears one of the problem alarms, the resulting condition is paired with each disk full alarm and generates the disk full alarm clear condition on the model of that disk full alarm. Thus, all the disk full alarms are cleared.
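
The pairing performed by the DiskFullUserClearRule can be sketched in Python. This is an illustration of the matching idea under simplifying assumptions, not the correlation engine itself. The function name, the dictionary layout, and the model names host-a and host-b are hypothetical.

```python
def clear_conditions_to_generate(conditions):
    """Pair each user-cleared condition with each DiskFull condition."""
    triggers = [c for c in conditions
                if c["name"] == "DiskProblemAlarmUserCleared"]
    disk_full = [c for c in conditions if c["name"] == "DiskFull"]
    generated = []
    for _trigger in triggers:
        for symptom in disk_full:
            # Root Cause Target is DiskFull.Model, so the new condition is
            # placed on the model that owns the DiskFull alarm.
            generated.append(("DiskFullAlarmClear", symptom["model"]))
    return generated


conditions = [
    {"name": "DiskProblemAlarmUserCleared", "model": "DiskMonitorDomain"},
    {"name": "DiskFull", "model": "host-a"},
    {"name": "DiskFull", "model": "host-b"},
]
print(clear_conditions_to_generate(conditions))
```

Without the DiskProblemAlarmUserCleared condition, no pair is formed and no clear condition is generated, which is why alarms cleared by the correlation itself do not trigger this rule.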

Appendix B: Special Topics

This appendix discusses special topics related to Condition Correlation capabilities and implementation.

Condition Correlation and Fault Isolation

When a managed device stops responding to polls, the SPECTRUM fault isolation algorithm determines whether to create a critical alarm for the device or suppress its alarm state because another unreachable device is the root cause. In Condition Correlation, you may want to set up a correlation between a device in the “contact lost” condition and some other condition in your environment. For example, SPECTRUM may receive a trap from a BGP (Border Gateway Protocol) router reporting that it lost a session with a peer router. If the peer router is already in the “contact lost” state in SPECTRUM, you may want the BGP session alarm to be a symptom of the contact lost alarm on the peer router model.

If the peer router in the contact lost state has a critical DEVICE HAS STOPPED RESPONDING TO POLLS alarm, the correlation is trivial. However, if the peer router’s fault state has been suppressed by the SPECTRUM fault isolation algorithm, there is no root cause alarm on this model.

Without special consideration by Condition Correlation, it would be impossible for the actual root cause alarm to be correlated with the peer lost alarm. However, in Condition Correlation, the Device Contact Lost condition gets special consideration. This condition will be in play whenever a device is in the contact lost state, regardless of whether the device model is suppressed or alarmed. Furthermore, if the device model in question is suppressed, the correlation engine finds the isolated alarm and uses that as the root cause for any correlation rules.
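
The special handling of the Device Contact Lost condition can be summarized in a few lines of Python. This sketch only restates the decision described above. The function name, the dictionary keys, and the device names peer-1, peer-2, and core-router are hypothetical.

```python
def contact_lost_root_cause(device, isolated_alarms):
    """Pick the alarm used as root cause for a device that lost contact."""
    if not device["contact_lost"]:
        return None  # the Device Contact Lost condition is not in play
    if device["suppressed"]:
        # No alarm exists on this model, so use the alarm that fault
        # isolation identified as the actual cause.
        return isolated_alarms.get(device["name"])
    return device["alarm"]


peer_alarmed = {"name": "peer-1", "contact_lost": True, "suppressed": False,
                "alarm": "DEVICE HAS STOPPED RESPONDING TO POLLS (peer-1)"}
peer_suppressed = {"name": "peer-2", "contact_lost": True, "suppressed": True,
                   "alarm": None}
isolated_alarms = {
    "peer-2": "DEVICE HAS STOPPED RESPONDING TO POLLS (core-router)",
}

print(contact_lost_root_cause(peer_alarmed, isolated_alarms))
print(contact_lost_root_cause(peer_suppressed, isolated_alarms))
```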

About Transfer Rules

SPECTRUM generates a Model Active condition that can be used in a correlation rule when a port model is either added to or removed from a correlation domain. This condition can be used for special rules, like transferring alarms from devices to ports, because the Model Active condition will be present for each correlated port. This eliminates the requirement that the port must have an alarm to participate in a correlation. Attributes on the condition can then be used to create a rule which will transfer alarms to the correct port. The correct port can be identified by using the following parameters from the Model Active condition:

Component OID of the port

Model handle of the port’s parent device

Model type of the port

Condition Correlation provides a default transfer rule: transfer PVCL alarm from device to interface. It reacts to the PVCL failure condition (alarm 0x210048 - PVCLs Failure Notification) on the device model by extracting the Interface ID of the affected port from the PVCL failure condition. It then finds the port model by comparing the Interface ID from the PVCL failure condition to the Component_OID parameter for the Model Active condition on the port. A new PVCL failure (0x210c0c - PVCLs Failure Notification) alarm is created on the port identified by the Model parameter of the Model Active condition. Consequently, the PVCLs Failure Notification alarm for the device is made a symptom of the new PVCLs Failure Notification alarm on the port.
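
The port lookup performed by a transfer rule can be sketched as follows. This Python example is a loose illustration of the comparison described above. The function name, the dictionary keys, and the model names are hypothetical, and matching on the parent device in addition to the Interface ID is an assumption based on the parameters listed for the Model Active condition.

```python
def find_transfer_port(pvcl_failure, model_active_conditions):
    """Return the port model whose Component OID matches the Interface ID."""
    for active in model_active_conditions:
        same_device = active["parent_device"] == pvcl_failure["model"]
        same_interface = active["component_oid"] == pvcl_failure["interface_id"]
        if same_device and same_interface:
            return active["model"]  # the alarm is transferred to this port
    return None


pvcl_failure = {"model": "router-1", "interface_id": "3"}
model_active = [
    {"model": "router-1/port-2", "parent_device": "router-1", "component_oid": "2"},
    {"model": "router-1/port-3", "parent_device": "router-1", "component_oid": "3"},
    {"model": "router-2/port-3", "parent_device": "router-2", "component_oid": "3"},
]
print(find_transfer_port(pvcl_failure, model_active))
```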

Advanced Correlations and Data Type Comparisons

When you configure advanced correlations that involve comparisons between different data types, you must be aware of the following:

Condition Correlation converts the right-hand value to the data type of the left-hand value. In some cases this may be problematic; for example, it is unlikely that converting a real number will produce the same text string you have for a comparison.

SNMP represents both real text strings (messages and information, for example) and some data (MAC addresses, for example) as octet strings, with no indication of the actual usage. Therefore, in some cases the automatic conversion process cannot convert to the type you need for a comparison, because Condition Correlation does not have the meta-information.

Condition Correlation does not attempt to convert list types.
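
The effect of converting the right-hand value to the type of the left-hand value can be demonstrated with Python's own conversions. This is only an analogy: SPECTRUM's conversion rules may differ in detail, and the function compare_equal is hypothetical.

```python
def compare_equal(left, right):
    """Compare after converting the right-hand value to the left-hand type."""
    if isinstance(left, (list, tuple)) or isinstance(right, (list, tuple)):
        # List types are not converted, so only like types can match.
        return type(left) is type(right) and left == right
    try:
        converted = type(left)(right)
    except (TypeError, ValueError):
        return False  # the right-hand value cannot be converted
    return left == converted


print(compare_equal(5, "5"))    # text converted to an integer
print(compare_equal("5", 5.0))  # real converted to text gives "5.0"
print(compare_equal([1], "1"))  # list types are never converted
```

The second comparison fails even though both values represent five, which is the kind of mismatch the first caveat above warns about. Swapping the operands so that the real number is on the left makes the same comparison succeed.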
