
The OCF Resource Agent Developer's Guide

Florian Haas



Copyright 2010 LINBIT HA-Solutions GmbH

License information

The text of and illustrations in this document are licensed under a Creative Commons Attribution-Share Alike 3.0 Unported license ("CC-BY-SA").

A summary of CC-BY-SA is available at http://creativecommons.org/licenses/by-sa/3.0/.

The full license text is available at http://creativecommons.org/licenses/by-sa/3.0/legalcode.

In accordance with CC-BY-SA, if you distribute this document or an adaptation of it, you must provide the URL for the original version.



1. Introduction
    1.1. What is a resource agent?
    1.2. Who or what uses a resource agent?
    1.3. Which language is a resource agent written in?
2. API definitions
    2.1. Environment variables
    2.2. Actions
    2.3. Timeouts
    2.4. Metadata
3. Return codes
    3.1. OCF_SUCCESS (0)
    3.2. OCF_ERR_GENERIC (1)
    3.3. OCF_ERR_ARGS (2)
    3.4. OCF_ERR_UNIMPLEMENTED (3)
    3.5. OCF_ERR_PERM (4)
    3.6. OCF_ERR_INSTALLED (5)
    3.7. OCF_ERR_CONFIGURED (6)
    3.8. OCF_NOT_RUNNING (7)
    3.9. OCF_RUNNING_MASTER (8)
    3.10. OCF_FAILED_MASTER (9)
4. Resource agent structure
    4.1. Resource agent interpreter
    4.2. Author and license information
    4.3. Initialization
    4.4. Functions implementing resource agent actions
    4.5. Execution block
5. Resource agent actions
    5.1. start action
    5.2. stop action
    5.3. monitor action
    5.4. validate-all action
    5.5. meta-data action
    5.6. promote action
    5.7. demote action
    5.8. migrate_to action
    5.9. migrate_from action
    5.10. notify action
6. Script variables
    6.1. $OCF_ROOT
    6.2. $OCF_FUNCTIONS_DIR
    6.3. $OCF_RESOURCE_INSTANCE
    6.4. $__OCF_ACTION
    6.5. $__SCRIPT_NAME
    6.6. $HA_RSCTMP
7. Convenience functions
    7.1. Logging: ocf_log
    7.2. Testing for binaries: have_binary and check_binary
    7.3. Executing commands and capturing their output: ocf_run
    7.4. Locks: ocf_take_lock and ocf_release_lock_on_exit
    7.5. Testing for numerical values: ocf_is_decimal
    7.6. Testing for boolean values: ocf_is_true
    7.7. Pseudo resources: ha_pseudo_resource
8. Special considerations
    8.1. Licensing
    8.2. Locale settings
    8.3. Testing for running processes
    8.4. Specifying a master preference
9. Testing, installing, and packaging resource agents
    9.1. Testing resource agents
    9.2. Installing resource agents
    9.3. Packaging resource agents
        9.3.1. RPM packaging
        9.3.2. Debian packaging
    9.4. Submitting resource agents



Chapter 1. Introduction

This document is to serve as a guide and reference for all developers, maintainers, and contributors working on OCF (Open Cluster Framework) compliant cluster resource agents. It explains the anatomy and general functionality of a resource agent, illustrates the resource agent API, and provides valuable hints and tips to resource agent authors.

1.1. What is a resource agent?

A resource agent is an executable that manages a cluster resource. No formal definition of a cluster resource exists, other than "anything a cluster manages is a resource." Cluster resources can be as diverse as IP addresses, file systems, database services, and entire virtual machines, to name just a few examples.

1.2. Who or what uses a resource agent?

Any Open Cluster Framework (OCF) compliant cluster management application is capable of managing resources using the resource agents described in this document. At the time of writing, two OCF compliant cluster management applications exist for the Linux platform:

Pacemaker, a cluster manager supporting both the Corosync and Heartbeat cluster messaging frameworks. Pacemaker evolved out of the Linux-HA project.

RGmanager, the cluster manager bundled in Red Hat Cluster Suite. It supports the Corosync cluster messaging framework exclusively.

1.3. Which language is a resource agent written in?

An OCF compliant resource agent can be implemented in any programming language. The API is not language specific. However, most resource agents are implemented as shell scripts, which is why this guide primarily uses example code written in shell language.



Chapter 2. API definitions

2.1. Environment variables

A resource agent receives all configuration information about the resource it manages via environment variables. The names of these environment variables are always the name of the resource parameter, prefixed with OCF_RESKEY_. For example, if the resource has an ip parameter set to 192.168.1.1, then the resource agent will have access to an environment variable OCF_RESKEY_ip holding that value.

For any resource parameter that is not required to be set by the user (that is, its parameter definition in the resource agent metadata does not specify required="true"), the resource agent must do one of the following:

Provide a reasonable default. This should be advertised in the metadata. By convention, the resource agent uses a variable named OCF_RESKEY_<parametername>_default that holds this default.

Alternatively, cater correctly for the value being empty.
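Both strategies can be sketched in a few lines of shell. The superfrobnicate parameter is the fictitious one used elsewhere in this guide; the extra_options parameter and the foobar_build_command helper are hypothetical and exist only to illustrate how an empty value might be handled.

```shell
# Strategy 1: provide a default. The ": ${var=default}" idiom assigns the
# default only if the variable is unset.
OCF_RESKEY_superfrobnicate_default=0
: ${OCF_RESKEY_superfrobnicate=${OCF_RESKEY_superfrobnicate_default}}

# Strategy 2: cater for the value being empty.
# "extra_options" is a hypothetical optional parameter; when it is empty
# or unset, the command line is simply built without it.
foobar_build_command() {
    cmd="frobnicate"
    if [ -n "${OCF_RESKEY_extra_options:-}" ]; then
        cmd="$cmd $OCF_RESKEY_extra_options"
    fi
    echo "$cmd"
}
```

Which strategy fits better depends on the parameter: a default suits values the resource always needs, while tolerating an empty value suits truly optional settings.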

In addition, the cluster manager may also support meta resource parameters. These do not apply directly to the resource configuration, but rather specify how the cluster resource manager is expected to manage the resource. For example, the Pacemaker cluster manager uses the target-role meta parameter to specify whether the resource should be started or stopped.

Meta parameters are passed into the resource agent in the OCF_RESKEY_CRM_meta_ namespace, with any hyphens converted to underscores. Thus, the target-role attribute maps to an environment variable named OCF_RESKEY_CRM_meta_target_role.
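The naming rule for meta parameters can be expressed as a small helper. This is only a sketch: meta_env_name is a hypothetical function written for this example, not part of any OCF library.

```shell
# Hypothetical helper: print the environment variable name under which a
# meta parameter is passed to the resource agent (hyphens become
# underscores, and the OCF_RESKEY_CRM_meta_ prefix is added).
meta_env_name() {
    printf 'OCF_RESKEY_CRM_meta_%s\n' "$(printf '%s' "$1" | tr '-' '_')"
}

meta_env_name target-role
```

In an agent, the value itself would normally be read directly, for example as "$OCF_RESKEY_CRM_meta_target_role".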

2.2. Actions

Any resource agent must support one command-line argument which specifies the action the resource agent is about to execute. The following actions must be supported by any resource agent:

start: starts the resource.

stop: shuts down the resource.

monitor: queries the resource for its state.

meta-data: dumps the resource agent metadata.

In addition, resource agents may optionally support the following actions:

promote: turns a resource into the Master role (Master/Slave resources only).

demote: turns a resource into the Slave role (Master/Slave resources only).

migrate_to and migrate_from: implement live migration of resources.

validate-all: validates a resource's configuration.

usage or help: displays a usage message when the resource agent is invoked from the command line, rather than by the cluster manager.

status: historical (deprecated) synonym for monitor.



2.3. Timeouts

Action timeouts are enforced outside the resource agent proper. It is the cluster manager's responsibility to monitor how long a resource agent action has been running, and terminate it if it does not meet its completion deadline. Thus, resource agents need not themselves check for any timeout expiry.

Resource agents can, however, advise the user of sensible timeout values (which, when correctly set, will be duly enforced by the cluster manager). See the following section for details on how a resource agent advertises its suggested timeouts.

2.4. Metadata

Every resource agent must describe its own purpose and supported parameters in a set of XML metadata. This metadata is used by cluster management applications for on-line help, and resource agent man pages are generated from it as well. The following is a fictitious set of metadata from an imaginary resource agent:

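<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="foobar" version="0.1">
  <version>0.1</version>
  <longdesc lang="en">
This is a fictitious example resource agent written for the
OCF Resource Agent Developer's Guide.
  </longdesc>
  <shortdesc lang="en">Example resource agent
  for budding OCF RA developers</shortdesc>
  <parameters>
    <parameter name="eggs">
      <longdesc lang="en">
      Number of eggs, an example numeric parameter
      </longdesc>
      <shortdesc lang="en">Number of eggs</shortdesc>
      <content type="integer"/>
    </parameter>
    <parameter name="superfrobnicate">
      <longdesc lang="en">
      Enable superfrobnication, an example boolean parameter
      </longdesc>
      <shortdesc lang="en">Enable superfrobnication</shortdesc>
      <content type="boolean"/>
    </parameter>
    <parameter name="datadir">
      <longdesc lang="en">
      Data directory, an example string parameter
      </longdesc>
      <shortdesc lang="en">Data directory</shortdesc>
      <content type="string"/>
    </parameter>
  </parameters>
  <actions>
    ...
  </actions>
</resource-agent>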



The resource-agent element, of which there must only be one per resource agent, defines the resource agent name and version.

The longdesc and shortdesc elements in resource-agent provide a long and short description of the resource agent's functionality. While shortdesc is a one-line description of what the resource agent does and is usually used in terse listings, longdesc should give a full-blown description of the resource agent in as much detail as possible.

The parameters element describes the resource agent parameters, and should hold any number of parameter children, one for each parameter that the resource agent supports.

Every parameter should, like the resource-agent as a whole, come with a shortdesc and a longdesc, and also a content child that describes the parameter's expected content.

On the content element, there may be four different attributes:

type describes the parameter type (string, integer, or boolean). If unset, type defaults to string.

required indicates whether setting the parameter is mandatory (required="true") or optional (required="false").

For optional parameters, it is customary to provide a sensible default via the default attribute.

Finally, the unique attribute (allowed values: true or false) indicates that a specific value must be unique across the cluster, for this parameter of this particular resource type. For example, a highly available floating IP address is declared unique, as that one IP address should run only once throughout the cluster, avoiding duplicates.

The actions list defines the actions that the resource agent advertises as supported.

Every action should list its own timeout value. This is a hint to the user what minimal timeout should be configured for the action. This is meant to cater for the fact that some resources are quick to start and stop (IP addresses or filesystems, for example), while others may take several minutes to do so (such as databases).

In addition, recurring actions (such as monitor) should also specify a recommended minimum interval, which is the time between two consecutive invocations of the same action. Like timeout, this value does not constitute a default; it is merely a hint for the user which action interval to configure, at minimum.
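As an illustration of timeout and interval hints, a resource agent could emit its actions list from a small shell function like the one below. This is a sketch only: the function name is hypothetical, and every timeout and interval value (in seconds) is a made-up example, not a recommendation from this guide.

```shell
# Hypothetical helper that prints the <actions> part of the metadata.
# All timeout and interval values are illustrative assumptions.
foobar_actions_xml() {
    cat <<EOF
<actions>
  <action name="start"     timeout="20" />
  <action name="stop"      timeout="20" />
  <action name="monitor"   timeout="20" interval="10" />
  <action name="meta-data" timeout="5" />
</actions>
EOF
}

foobar_actions_xml
```

Only the recurring monitor action carries an interval here; the one-shot actions advertise a timeout alone.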



Chapter 3. Return codes

For any invocation, resource agents must exit with a defined return code that informs the caller of the outcome of the invoked action. The return codes are explained in detail in the following subsections.

3.1. OCF_SUCCESS (0)

The action completed successfully. This is the expected return code for any successful start, stop, promote, demote, migrate_from, migrate_to, meta-data, help, and usage action.

For monitor (and its deprecated alias, status), however, a modified convention applies:

For primitive (stateless) resources, OCF_SUCCESS from monitor means that the resource is running. Non-running and gracefully shut-down resources must instead return OCF_NOT_RUNNING.

For master/slave (stateful) resources, OCF_SUCCESS from monitor means that the resource is running in Slave mode. Resources running in Master mode must instead return OCF_RUNNING_MASTER, and gracefully shut-down resources must instead return OCF_NOT_RUNNING.

3.2. OCF_ERR_GENERIC (1)

The action returned a generic error. A resource agent should use this exit code only when none of the more specific error codes, defined below, accurately describes the problem.

The cluster resource manager interprets this exit code as a soft error. This means that unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with OCF_ERR_GENERIC in place, usually by restarting the resource on the same node.

3.3. OCF_ERR_ARGS (2)

The resource agent was invoked with incorrect arguments. This is a safety net "can't happen" error which the resource agent should only return when invoked with, for example, an incorrect number of command line arguments.

Note

The resource agent should not return this error when instructed to perform an action that it does not support. Instead, under those circumstances, it should return OCF_ERR_UNIMPLEMENTED.

3.4. OCF_ERR_UNIMPLEMENTED (3)

The resource agent was instructed to execute an action that the agent does not implement.

Not all resource agent actions are mandatory. promote, demote, migrate_to, migrate_from, and notify are all optional actions which the resource agent may or may not implement. When a non-stateful resource agent is misconfigured as a master/slave resource, for example, then the resource agent should alert the user about this misconfiguration by returning OCF_ERR_UNIMPLEMENTED on the promote and demote actions.
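The difference between OCF_ERR_ARGS and OCF_ERR_UNIMPLEMENTED can be sketched as follows. The helper foobar_check_invocation is hypothetical, and the list of supported actions assumes a minimal, non-stateful agent; the numeric codes are set by hand here only so the sketch runs without the OCF shell function library.

```shell
# Exit codes as defined in this chapter (normally provided by the OCF
# shell function library; set here so the sketch is self-contained).
OCF_SUCCESS=0
OCF_ERR_ARGS=2
OCF_ERR_UNIMPLEMENTED=3

# Hypothetical helper: classify an invocation from the agent's arguments.
foobar_check_invocation() {
    # Exactly one argument (the action) is expected
    if [ $# -ne 1 ]; then
        return $OCF_ERR_ARGS
    fi
    case "$1" in
        start|stop|monitor|meta-data)
            return $OCF_SUCCESS
            ;;
        *)
            # A well-formed request for an action this agent lacks
            return $OCF_ERR_UNIMPLEMENTED
            ;;
    esac
}
```

Calling foobar_check_invocation with no argument, or with two, yields OCF_ERR_ARGS, whereas a single unsupported action such as promote yields OCF_ERR_UNIMPLEMENTED.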



3.5. OCF_ERR_PERM (4)

The action failed due to insufficient permissions. This may be due to the agent not being able to open a certain file, to listen on a specific socket, to write to a directory, or similar.

The cluster resource manager interprets this exit code as a hard error. This means that unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with this error by restarting the resource on a different node (where the permission problem may not exist).

3.6. OCF_ERR_INSTALLED (5)

The action failed because a required component is missing on the node where the action was executed. This may be due to a required binary not being executable, or a vital configuration file being unreadable.

The cluster resource manager interprets this exit code as a hard error. This means that unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with this error by restarting the resource on a different node (where the required files or binaries may be present).

3.7. OCF_ERR_CONFIGURED (6)

The action failed because the user misconfigured the resource. For example, the user may have configured an alphanumeric string for a parameter that really should be an integer.

The cluster resource manager interprets this exit code as a fatal error. Since this is a configuration error that is present cluster-wide, it would make no sense to recover such a resource on a different node, let alone in place. When a resource fails with this error, the cluster manager will attempt to shut down the resource, and wait for administrator intervention.
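The error classes introduced so far can be summarized in a lookup function. This is merely a sketch for orientation: foobar_error_class is a hypothetical helper, it covers only the error codes described up to this point, and it reflects default behavior that the user's configuration may override.

```shell
# Hypothetical helper: print the default recovery class for an exit code.
foobar_error_class() {
    case "$1" in
        1)   echo "soft"    ;;  # OCF_ERR_GENERIC: recover in place
        4|5) echo "hard"    ;;  # OCF_ERR_PERM, OCF_ERR_INSTALLED: recover on another node
        6)   echo "fatal"   ;;  # OCF_ERR_CONFIGURED: stop, wait for the administrator
        *)   echo "unknown" ;;  # not classified by this sketch
    esac
}

foobar_error_class 6
```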

3.8. OCF_NOT_RUNNING (7)

The resource was found not to be running. This is an exit code that may be returned by the monitor action exclusively. Note that this implies that the resource has either gracefully shut down, or has never been started.

If the resource is not running due to an error condition, the monitor action should instead return one of the OCF_ERR_* exit codes or OCF_FAILED_MASTER.

3.9. OCF_RUNNING_MASTER (8)

The resource was found to be running in the Master role. This applies only to stateful (Master/Slave) resources, and only to their monitor action.

Note that there is no specific exit code for "running in slave mode". This is because there is no functional distinction between a primitive resource running normally, and a stateful resource running as a slave. The monitor action of a stateful resource running normally in the Slave role should simply return OCF_SUCCESS.

3.10. OCF_FAILED_MASTER (9)

The resource was found to have failed in the Master role. This applies only to stateful (Master/Slave) resources, and only to their monitor action.



The cluster resource manager interprets this exit code as a soft error. This means that unless specifically configured otherwise, the resource manager will attempt to recover a resource which failed with $OCF_FAILED_MASTER in place, usually by demoting, stopping, starting and then promoting the resource on the same node.
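Putting the monitor-related return codes together, the monitor action of a stateful resource might map the resource's state to exit codes as sketched below. Both function names are hypothetical, and foobar_query_role is a stand-in that merely echoes a variable (FOOBAR_ROLE, also invented for this example) so that the sketch can run on its own; a real agent would query the managed service instead.

```shell
# Exit codes from this chapter, set here so the sketch is self-contained
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8
OCF_FAILED_MASTER=9

# Hypothetical stand-in for the command that reports the resource state
foobar_query_role() {
    echo "${FOOBAR_ROLE:-stopped}"
}

# Hypothetical monitor function for a stateful resource
foobar_stateful_monitor() {
    case "$(foobar_query_role)" in
        master)        return $OCF_RUNNING_MASTER ;;
        slave)         return $OCF_SUCCESS ;;        # no separate "slave" code
        stopped)       return $OCF_NOT_RUNNING ;;
        failed-master) return $OCF_FAILED_MASTER ;;
        *)             return $OCF_ERR_GENERIC ;;
    esac
}
```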



Chapter 4. Resource agent structure

A typical (shell-based) resource agent contains standard structural items, in the order listed in this chapter. The chapter describes the expected behavior of a resource agent with respect to the various actions it supports, using a fictitious resource agent named foobar as an example.

4.1. Resource agent interpreter

Any resource agent implemented as a script must specify its interpreter using standard "shebang" (#!) header syntax.

#!/bin/sh

If a resource agent is written in shell, specifying the generic shell interpreter (#!/bin/sh) is generally preferred, though not required. Resource agents declared as /bin/sh compatible must not use constructs native to a specific shell (such as, for example, ${!variable} syntax native to bash). It is advisable to occasionally run such resource agents through a sanitization utility such as checkbashisms.

It is considered a regression to introduce a patch that will make a previously sh compatible resource agent suitable only for bash, ksh, or any other non-generic shell. It is, however, perfectly acceptable for a new resource agent to explicitly define a specific shell, such as /bin/bash, as its interpreter.
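As an example of avoiding a bash-only construct, the indirect expansion ${!variable} can be replaced by eval in a /bin/sh compatible agent. The helper below is a sketch with a hypothetical name; because eval executes its argument, the variable name passed in must come from a trusted source.

```shell
# bash-only:    value=${!name}
# POSIX sh alternative using eval. $1 must be a trusted, valid
# variable name, since eval executes the string it is given.
get_indirect() {
    eval "printf '%s\n' \"\${$1}\""
}

OCF_RESKEY_eggs=12
name=OCF_RESKEY_eggs
get_indirect "$name"
```

Here get_indirect prints the value of the variable whose name is stored in name.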

4.2. Author and license information

The resource agent should contain a comment listing the resource agent author(s) and/or copyright holder(s), and stating the license that applies to the resource agent:

#
# Resource Agent for managing foobar resources.
#
# License: GNU General Public License (GPL)
# (c) 2008-2010 John Doe, Jane Roe,
# and Linux-HA contributors

When a resource agent refers to a license for which multiple versions exist, it is assumed that the current version applies.

4.3. Initialization

Any shell resource agent should source the .ocf-shellfuncs function library. With the syntax below, this is done in terms of $OCF_FUNCTIONS_DIR, which, for testing purposes and also for generating documentation, may be overridden from the command line.

# Initialization:
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/resource.d/heartbeat}
. ${OCF_FUNCTIONS_DIR}/.ocf-shellfuncs

Defaults for resource agent parameters should be set by initializing variables with the suffix _default:

# Defaults
OCF_RESKEY_superfrobnicate_default=0
: ${OCF_RESKEY_superfrobnicate=${OCF_RESKEY_superfrobnicate_default}}



Note

The resource agent should make sure that it sets a default for any parameter not marked as required in the metadata.

4.4. Functions implementing resource agent actions

What follows next are the functions implementing the resource agent's advertised actions. The individual actions are described in detail in Chapter 5, Resource agent actions.

4.5. Execution block

This is the part of the resource agent that actually executes when the resource agent is invoked. It typically follows a fairly standard structure:

# Make sure meta-data and usage always succeed
case $__OCF_ACTION in
    meta-data)
        foobar_meta_data
        exit $OCF_SUCCESS
        ;;
    usage|help)
        foobar_usage
        exit $OCF_SUCCESS
        ;;
esac

# Anything other than meta-data and usage must pass validation
foobar_validate_all || exit $?

# Translate each action into the appropriate function call
case $__OCF_ACTION in
    start)          foobar_start;;
    stop)           foobar_stop;;
    status|monitor) foobar_monitor;;
    promote)        foobar_promote;;
    demote)         foobar_demote;;
    reload)
        ocf_log info "Reloading..."
        foobar_start
        ;;
    validate-all)
        # Validation has already passed above; nothing more to do
        ;;
    *)
        foobar_usage
        exit $OCF_ERR_UNIMPLEMENTED
        ;;
esac
rc=$?

# The resource agent may optionally log a debug message
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION returned $rc"
exit $rc



Chapter 5. Resource agent actions

Each action is typically implemented in a separate function or method in the resource agent. By convention, these are usually named <agent>_<action>, so the function implementing the start action in foobar would be named foobar_start().

As a general rule, whenever the resource agent encounters an error that it is not able to recover from, it is permitted to immediately exit, throw an exception, or otherwise cease execution. Examples of this include configuration issues, missing binaries, permission problems, etc. It is not necessary to pass these errors up the call stack.

It is the cluster manager's responsibility to initiate the appropriate recovery action based on the user's configuration. The resource agent should not guess at said configuration.

5.1. start action

When invoked with the start action, the resource agent must start the resource if it is not yet running. This means that the agent must verify the resource's configuration, query its state, and then start it only if it is not running. A common way of doing this would be to invoke the validate_all and monitor functions first, as in the following example:

foobar_start() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # if resource is already running, bail out early
    if foobar_monitor; then
        ocf_log info "Resource is already running"
        return $OCF_SUCCESS
    fi

    # actually start up the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ...

    # After the resource has been started, check whether it started up
    # correctly. If the resource starts asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # start up within the defined timeout, the cluster manager will
    # consider the start action failed
    while ! foobar_monitor; do
        ocf_log debug "Resource has not started yet, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}

5.2. stop action

When invoked with the stop action, the resource agent must stop the resource, if it is running. This means that the agent must verify the resource configuration, query its state, and then stop it only if it is currently running. A common way of doing this would be to invoke the validate_all and monitor functions first. It is important to understand that stop is a forced operation: the resource agent must do everything in its power to shut down the resource, short of rebooting the node or shutting it off. Consider the following example:

foobar_stop() {
    local rc

    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    foobar_monitor
    rc=$?
    case "$rc" in
        "$OCF_SUCCESS")
            # Currently running. Normal, expected behavior.
            ocf_log debug "Resource is currently running"
            ;;
        "$OCF_RUNNING_MASTER")
            # Running as a Master. Need to demote before stopping.
            ocf_log info "Resource is currently running as Master"
            foobar_demote || \
                ocf_log warn "Demote failed, trying to stop anyway"
            ;;
        "$OCF_NOT_RUNNING")
            # Currently not running. Nothing to do.
            ocf_log info "Resource is already stopped"
            return $OCF_SUCCESS
            ;;
    esac

    # actually shut down the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ...

    # After the resource has been stopped, check whether it shut down
    # correctly. If the resource stops asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # shut down within the defined timeout, the cluster manager will
    # consider the stop action failed
    while foobar_monitor; do
        ocf_log debug "Resource has not stopped yet, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}

Note

The expected exit code for a successful stop operation is $OCF_SUCCESS, not $OCF_NOT_RUNNING.

Important

A failed stop operation is a potentially dangerous situation which the cluster manager will almost invariably try to resolve by means of node fencing. In other words, the cluster manager will forcibly evict from the cluster a node on which a stop operation has failed. While this measure serves ultimately to protect data, it does cause disruption to applications and their users. Thus, a resource agent should make sure that it exits with an error only if all avenues for proper resource shutdown have been exhausted.
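One way to exhaust the available avenues is to escalate from a polite termination request to a forced kill. The sketch below assumes that the resource is a single process whose PID is known; the helper name and the grace period parameter are hypothetical, and a real agent would typically first try the service's own shutdown command.

```shell
# Hypothetical helper: stop the process with the given PID, escalating
# from SIGTERM to SIGKILL. $2 is an assumed grace period in seconds.
foobar_kill_escalate() {
    pid=$1
    grace=${2:-5}

    # Ask politely first
    kill -TERM "$pid" 2>/dev/null

    # Wait up to $grace seconds for the process to disappear
    while [ "$grace" -gt 0 ] && kill -0 "$pid" 2>/dev/null; do
        sleep 1
        grace=$((grace - 1))
    done

    # Still there? Use force.
    if kill -0 "$pid" 2>/dev/null; then
        kill -KILL "$pid" 2>/dev/null
        sleep 1
    fi

    # Report success only if the process is really gone
    ! kill -0 "$pid" 2>/dev/null
}
```

A stop function could call such a helper as its last resort and return an error code only if even the forced kill did not remove the process.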

5.3. monitor action

The monitor action queries the current status of a resource. It must discern between three different states:

resource is currently running (return $OCF_SUCCESS);

resource has stopped gracefully (return $OCF_NOT_RUNNING);

resource has run into a problem and must be considered failed (return the appropriate $OCF_ERR_* code to indicate the nature of the problem).

foobar_monitor() {
    local rc

    ocf_run frobnicate --test

    # This example assumes the following exit code convention
    # for frobnicate:
    # 0: running, and fully caught up with master
    # 1: gracefully stopped
    # any other: error
    case "$?" in
        0)
            rc=$OCF_SUCCESS
            ocf_log debug "Resource is running"
            ;;
        1)
            rc=$OCF_NOT_RUNNING
            ocf_log debug "Resource is not running"
            ;;
        *)
            ocf_log err "Resource has failed"
            exit $OCF_ERR_GENERIC
            ;;
    esac

    return $rc
}

Stateful (master/slave) resource agents may use a more elaborate monitoring scheme where they can provide "hints" to the cluster manager identifying which instance is best suited to assume the Master role. Section 8.4, Specifying a master preference, explains the details.

Note

The cluster manager may invoke the monitor action for a probe, which is a test whether the resource is currently running. Normally, the monitor operation would behave exactly the same during a probe and a "real" monitor action. If a specific resource does require special treatment for probes, however, the ocf_is_probe convenience function is available in the OCF shell functions library for that purpose.
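The sketch below shows one way such a probe test could look. It assumes that the cluster manager marks a probe as a monitor action with an interval of 0, passed in the OCF_RESKEY_CRM_meta_interval meta parameter; that is an assumption about the cluster manager's behavior rather than something this guide specifies, and foobar_is_probe is a hypothetical stand-in for the library's ocf_is_probe.

```shell
# Hypothetical stand-in for ocf_is_probe. Assumption: a probe is a
# monitor action invoked with a meta interval of 0.
foobar_is_probe() {
    [ "${__OCF_ACTION:-}" = "monitor" ] &&
        [ "${OCF_RESKEY_CRM_meta_interval:-}" = "0" ]
}

# Example invocation with the environment a probe would be given
# under the assumption above
__OCF_ACTION=monitor
OCF_RESKEY_CRM_meta_interval=0
if foobar_is_probe; then
    echo "running as a probe"
fi
```

Real agents should call ocf_is_probe from the shell functions library rather than reimplementing the test.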



5.4. validate-all action

The validate-all action tests for correct resource agent configuration and a working environment. validate-all should exit with one of the following return codes:

$OCF_SUCCESS: all is well, the configuration is valid and usable.

$OCF_ERR_CONFIGURED: the user has misconfigured the resource.

$OCF_ERR_INSTALLED: the resource has possibly been configured correctly, but a vital component is missing on the node where validate-all is being executed.

$OCF_ERR_PERM: the resource is configured correctly and is not missing any required components, but is suffering from a permission issue (such as not being able to create a necessary file).

validate-all is usually wrapped in a function that is not only called when explicitly invoking the corresponding action, but also as a sanity check from just about any other function. Therefore, the resource agent author must keep in mind that the function may be invoked during the start, stop, and monitor operations, and also during probes.

Probes pose a separate challenge for validation. During a probe (when the cluster manager may expect the resource not to be running on the node where the probe is executed), some required components may be expected to not be available on the affected node. For example, this includes any shared data on storage devices not available for reading during the probe. The validate-all function may thus need to treat probes specially, using the ocf_is_probe convenience function:

    foobar_validate_all() {
        # Test for configuration errors first
        if ! ocf_is_decimal "$OCF_RESKEY_eggs"; then
            ocf_log err "eggs is not numeric!"
            exit $OCF_ERR_CONFIGURED
        fi

        # Test for required binaries
        check_binary frobnicate

        # Check for data directory (this may be on shared storage, so
        # disable this test during probes)
        if ! ocf_is_probe; then
            if ! [ -d "$OCF_RESKEY_datadir" ]; then
                ocf_log err "$OCF_RESKEY_datadir does not exist or is not a directory!"
                exit $OCF_ERR_INSTALLED
            fi
        fi

        return $OCF_SUCCESS
    }
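The example above does not exercise $OCF_ERR_PERM. As a minimal sketch of how a permission check might look, the hypothetical helper below (foobar_check_statedir is not part of the shell functions library) tries to create a file in a state directory. The ocf_log stand-in and the hard-coded return code values are there only so the sketch runs on its own.

```shell
#!/bin/sh
# Return code values as listed in the return codes chapter; normally
# these come from the shell functions library.
OCF_SUCCESS=0
OCF_ERR_PERM=4
OCF_ERR_INSTALLED=5

# Stand-in for the library's ocf_log, so the sketch runs on its own.
ocf_log() { echo "$*" >&2; }

# Hypothetical helper: check that the agent can create a file in
# the given state directory.
foobar_check_statedir() {
    statedir=$1
    if [ ! -d "$statedir" ]; then
        ocf_log err "$statedir does not exist or is not a directory"
        return $OCF_ERR_INSTALLED
    fi
    if ! touch "$statedir/.foobar_test" 2>/dev/null; then
        ocf_log err "cannot create files in $statedir"
        return $OCF_ERR_PERM
    fi
    rm -f "$statedir/.foobar_test"
    return $OCF_SUCCESS
}
```

A validate function could call this helper and exit with whatever code it returns, for example with foobar_check_statedir "$dir" || exit $?.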

5.5. meta-data action

The meta-data action dumps the resource agent metadata to standard output. The output must follow the metadata format as specified in Section 2.4, Metadata [3].

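    foobar_meta_data() {
        cat <<EOF
    <?xml version="1.0"?>
    <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
    <resource-agent name="foobar" version="0.1">
      <version>0.1</version>
    ...
    EOF
    }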

5.6. promote action

The promote action is optional. It must only be supported by stateful resource agents, which means agents that discern between two distinct roles: Master and Slave. Slave is functionally identical to the Started state in a stateless resource agent. Thus, while a regular (stateless) resource agent only needs to implement start and stop, a stateful resource agent must also support the promote action to be able to make a transition between the Started (Slave) and Master roles.

    foobar_promote() {
        local rc

        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # test the resource's current state
        foobar_monitor
        rc=$?
        case "$rc" in
            "$OCF_SUCCESS")
                # Running as slave. Normal, expected behavior.
                ocf_log debug "Resource is currently running as Slave"
                ;;
            "$OCF_RUNNING_MASTER")
                # Already a master. Unexpected, but not a problem.
                ocf_log info "Resource is already running as Master"
                return $OCF_SUCCESS
                ;;
            "$OCF_NOT_RUNNING")
                # Currently not running. Need to start before promoting.
                ocf_log info "Resource is currently not running"
                foobar_start
                ;;
            *)
                # Failed resource. Let the cluster manager recover.
                ocf_log err "Unexpected error, cannot promote"
                exit $rc
                ;;
        esac

        # actually promote the resource here (make sure to immediately
        # exit with an $OCF_ERR_ error if anything goes wrong)
        ocf_run frobnicate --master-mode || exit $OCF_ERR_GENERIC

        # After the resource has been promoted, check whether the
        # promotion worked. If the resource promotion is asynchronous, the
        # agent may spin on the monitor function here -- if the resource
        # does not assume the Master role within the defined timeout, the
        # cluster manager will consider the promote action failed.
        while true; do
            foobar_monitor
            if [ $? -eq $OCF_RUNNING_MASTER ]; then
                ocf_log debug "Resource promoted"
                break
            else
                ocf_log debug "Resource still awaiting promotion"
                sleep 1
            fi
        done

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }

5.7. demote action

The demote action is optional. It must only be supported by stateful resource agents, which means agents that discern between two distinct roles: Master and Slave. Slave is functionally identical to the Started state in a stateless resource agent. Thus, while a regular (stateless) resource agent only needs to implement start and stop, a stateful resource agent must also support the demote action to be able to make a transition between the Master and Started (Slave) roles.

    foobar_demote() {
        local rc

        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # test the resource's current state
        foobar_monitor
        rc=$?
        case "$rc" in
            "$OCF_RUNNING_MASTER")
                # Running as master. Normal, expected behavior.
                ocf_log debug "Resource is currently running as Master"
                ;;
            "$OCF_SUCCESS")
                # Already running as slave. Nothing to do.
                ocf_log debug "Resource is currently running as Slave"
                return $OCF_SUCCESS
                ;;
            "$OCF_NOT_RUNNING")
                # Currently not running. Getting a demote action
                # in this state is unexpected. Exit with an error
                # and let the cluster manager recover.
                ocf_log err "Resource is currently not running"
                exit $OCF_ERR_GENERIC
                ;;
            *)
                # Failed resource. Let the cluster manager recover.
                ocf_log err "Unexpected error, cannot demote"
                exit $rc
                ;;
        esac

        # actually demote the resource here (make sure to immediately
        # exit with an $OCF_ERR_ error if anything goes wrong)
        ocf_run frobnicate --unset-master-mode || exit $OCF_ERR_GENERIC

        # After the resource has been demoted, check whether the
        # demotion worked. If the resource demotion is asynchronous, the
        # agent may spin on the monitor function here -- if the resource
        # does not assume the Slave role within the defined timeout, the
        # cluster manager will consider the demote action failed.
        while true; do
            foobar_monitor
            if [ $? -eq $OCF_RUNNING_MASTER ]; then
                ocf_log debug "Resource still awaiting demotion"
                sleep 1
            else
                ocf_log debug "Resource demoted"
                break
            fi
        done

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }
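The promote and demote examples both spin on the monitor function until it reports the expected role. The sketch below isolates that pattern in a hypothetical helper, foobar_wait_for_rc, and drives it with a simulated foobar_monitor that reports Slave twice and Master afterwards. The numeric value 8 for $OCF_RUNNING_MASTER is hard-coded here only so the sketch runs without the shell functions library.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_RUNNING_MASTER=8   # normally defined by the shell functions library

# Stand-in monitor for illustration: reports Slave (0) on the first two
# calls and Master (8) afterwards. A real agent would query the resource.
monitor_calls=0
foobar_monitor() {
    monitor_calls=$((monitor_calls + 1))
    if [ "$monitor_calls" -ge 3 ]; then
        return $OCF_RUNNING_MASTER
    fi
    return $OCF_SUCCESS
}

# Hypothetical helper: spin until foobar_monitor reports the wanted
# return code. There is no timeout here; the cluster manager enforces it.
foobar_wait_for_rc() {
    wanted=$1
    while true; do
        foobar_monitor
        [ $? -eq "$wanted" ] && break
        sleep 1
    done
    return $OCF_SUCCESS
}

foobar_wait_for_rc $OCF_RUNNING_MASTER
echo "monitor was called $monitor_calls times"
```

The helper deliberately has no timeout of its own, in keeping with the comments in the examples above: the cluster manager considers the action failed once its configured timeout expires.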

5.8. migrate_to action

The migrate_to action can serve one of two purposes:

Initiate a native push type migration for the resource. In other words, instruct the resource to move to a specific node from the node it is currently running on. The resource agent knows about its destination node via the $OCF_RESKEY_CRM_meta_migrate_target environment variable.

Freeze the resource in a freeze/thaw (also known as suspend/resume) type migration. In this mode, the resource does not need any information about its destination node at this point.

The example below illustrates a push type migration:

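    foobar_migrate_to() {
        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # if resource is not running, bail out early
        if ! foobar_monitor; then
            ocf_log err "Resource is not running"
            exit $OCF_ERR_GENERIC
        fi

        # actually migrate the resource here (make sure to immediately
        # exit with an $OCF_ERR_ error if anything goes wrong)
        ocf_run frobnicate --migrate \
                           --dest=$OCF_RESKEY_CRM_meta_migrate_target \
                           || exit $OCF_ERR_GENERIC
        ...

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }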

In contrast, a freeze/thaw type migration may implement its freeze operation like this:

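    foobar_migrate_to() {
        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # if resource is not running, bail out early
        if ! foobar_monitor; then
            ocf_log err "Resource is not running"
            exit $OCF_ERR_GENERIC
        fi

        # actually freeze the resource here (make sure to immediately
        # exit with an $OCF_ERR_ error if anything goes wrong)
        ocf_run frobnicate --freeze || exit $OCF_ERR_GENERIC
        ...

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }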

5.9. migrate_from action

The migrate_from action can serve one of two purposes:

Complete a native push type migration for the resource. In other words, check whether the migration has succeeded properly, and the resource is running on the local node. The resource agent knows about the migration source via the $OCF_RESKEY_CRM_meta_migrate_source environment variable.

Thaw the resource in a freeze/thaw (also known as suspend/resume) type migration. In this mode, the resource usually does not need any information about its source node at this point.

The example below illustrates a push type migration:

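    foobar_migrate_from() {
        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # After the resource has been migrated, check whether it resumed
        # correctly. If the resource starts asynchronously, the agent may
        # spin on the monitor function here -- if the resource does not
        # run within the defined timeout, the cluster manager will
        # consider the migrate_from action failed
        while ! foobar_monitor; do
            ocf_log debug "Resource has not yet migrated, waiting"
            sleep 1
        done

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }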

In contrast, a freeze/thaw type migration may implement its thaw operation like this:

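    foobar_migrate_from() {
        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # actually thaw the resource here (make sure to immediately
        # exit with an $OCF_ERR_ error if anything goes wrong)
        ocf_run frobnicate --thaw || exit $OCF_ERR_GENERIC

        # After the resource has been migrated, check whether it resumed
        # correctly. If the resource starts asynchronously, the agent may
        # spin on the monitor function here -- if the resource does not
        # run within the defined timeout, the cluster manager will
        # consider the migrate_from action failed
        while ! foobar_monitor; do
            ocf_log debug "Resource has not yet migrated, waiting"
            sleep 1
        done

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }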
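Both migration actions learn about their peer node from the environment. As a small illustration, the hypothetical helper below (foobar_migration_peer, not part of any library) picks the relevant variable based on the current action. The node names are made up, and the environment is simulated so the sketch runs outside a cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the peer node for the current migration
# action, based on the variables the cluster manager passes in.
foobar_migration_peer() {
    case "$__OCF_ACTION" in
        migrate_to)   echo "$OCF_RESKEY_CRM_meta_migrate_target" ;;
        migrate_from) echo "$OCF_RESKEY_CRM_meta_migrate_source" ;;
        *)            return 1 ;;
    esac
}

# Simulated environment; the node names are made up for illustration.
OCF_RESKEY_CRM_meta_migrate_target=node2
OCF_RESKEY_CRM_meta_migrate_source=node1

__OCF_ACTION=migrate_to
echo "migrating to: $(foobar_migration_peer)"
__OCF_ACTION=migrate_from
echo "migrating from: $(foobar_migration_peer)"
```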

5.10. notify action

With notifications, instances of clones (and of master/slave resources, which are an extended kind of clones) can inform each other about their state. When notifications are enabled, any action on any instance of a clone carries a pre and post notification. Then, the cluster manager invokes the notify operation on all clone instances. For notify operations, additional environment variables are passed into the resource agent during execution:

$OCF_RESKEY_CRM_meta_notify_type: the notification type (pre or post)

$OCF_RESKEY_CRM_meta_notify_operation: the operation (action) that the notification is about (start, stop, promote, demote etc.)

$OCF_RESKEY_CRM_meta_notify_start_uname: node name of the node where the resource is being started (start notifications only)

$OCF_RESKEY_CRM_meta_notify_stop_uname: node name of the node where the resource is being stopped (stop notifications only)

$OCF_RESKEY_CRM_meta_notify_master_uname: node name of the node where the resource currently is in the Master role

$OCF_RESKEY_CRM_meta_notify_promote_uname: node name of the node where the resource currently is being promoted to the Master role (promote notifications only)

$OCF_RESKEY_CRM_meta_notify_demote_uname: node name of the node where the resource currently is being demoted to the Slave role (demote notifications only)

Notifications come in particularly handy for master/slave resources using a "pull" scheme, where the master is a publisher and the slave a subscriber. Since the master is obviously only available as such when a promotion has occurred, the slaves can use a "pre-promote" notification to configure themselves to subscribe to the right publisher.

Likewise, the subscribers may want to unsubscribe from the publisher after it has relinquished its master status, and a "post-demote" notification can be used for that purpose.

Consider the example below to illustrate the concept.



    foobar_notify() {
        local type_op
        type_op="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"

        ocf_log debug "Received $type_op notification."
        case "$type_op" in
            'pre-promote')
                ocf_run frobnicate --slave-mode \
                                   --master=$OCF_RESKEY_CRM_meta_notify_promote_uname \
                                   || exit $OCF_ERR_GENERIC
                ;;
            'post-demote')
                ocf_run frobnicate --unset-slave-mode || exit $OCF_ERR_GENERIC
                ;;
        esac

        return $OCF_SUCCESS
    }

Note

A master/slave resource agent may support a multi-master configuration, where there is possibly more than one master at any given time. If that is the case, then the $OCF_RESKEY_CRM_meta_notify_*_uname variables may each contain a space-separated list of host names, rather than a single host name as shown in the example. Under those circumstances the resource agent would have to properly iterate over this list.
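Iterating over such a list could look like the sketch below. It assumes, purely for illustration, that the fictional frobnicate accepts one --master option per publisher; the helper name foobar_master_options and the host names are likewise made up.

```shell
#!/bin/sh
# Simulated notification environment with two nodes being promoted;
# the host names are made up for illustration.
OCF_RESKEY_CRM_meta_notify_promote_uname="alice bob"

# Hypothetical helper: build one --master option per promoted node.
# The unquoted expansion in the for loop splits the list on whitespace.
foobar_master_options() {
    opts=""
    for node in $OCF_RESKEY_CRM_meta_notify_promote_uname; do
        opts="$opts --master=$node"
    done
    # strip the leading space
    echo "${opts# }"
}

foobar_master_options
```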



Chapter 6. Script variables

This section outlines variables typically available to resource agents, primarily for convenience purposes. For additional variables available while the agent is being executed, refer to Section 2.1, Environment variables [2] and Chapter 3, Return codes [5].

6.1. $OCF_ROOT

The root of the OCF resource agent hierarchy. This should never be changed by a resource agent. This is usually /usr/lib/ocf.

6.2. $OCF_FUNCTIONS_DIR

The directory where the resource agents' shell function library, .ocf-shellfuncs, resides. This is usually defined in terms of $OCF_ROOT and should never be changed by a resource agent. This variable may, however, be overridden from the command line while testing a new or modified resource agent.

6.3. $OCF_RESOURCE_INSTANCE

The resource instance name. For primitive (non-clone, non-stateful) resources, this is simply the resource name. For clones and stateful resources, this is the primitive name, followed by a colon and the clone instance number (such as p_foobar:0).

6.4. $__OCF_ACTION

The currently invoked action. This is exactly the first command-line argument that the cluster manager specifies when it invokes the resource agent.

6.5. $__SCRIPT_NAME

The name of the resource agent. This is exactly the base name of the resource agent script, with leading directory names removed.

6.6. $HA_RSCTMP

A temporary directory for use by resource agents. The system startup sequence (on any LSB-compliant Linux distribution) guarantees that this directory is emptied on system startup, so this directory will not contain any stale data after a node reboot.
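Two of the variables above lend themselves to a short illustration: defaulting $OCF_FUNCTIONS_DIR in terms of $OCF_ROOT while keeping it overridable, and splitting a clone's $OCF_RESOURCE_INSTANCE into its parts. The sketch below uses simulated values; the resource.d/heartbeat location and the variable names primitive and instance are assumptions of this example.

```shell
#!/bin/sh
# Simulated values; in a cluster these are set by the cluster manager
# and the shell functions library.
OCF_ROOT=/usr/lib/ocf
OCF_RESOURCE_INSTANCE=p_foobar:0

# Default OCF_FUNCTIONS_DIR in terms of OCF_ROOT, but leave it
# overridable from the environment. The resource.d/heartbeat location
# is an assumption of this sketch.
: "${OCF_FUNCTIONS_DIR:=${OCF_ROOT}/resource.d/heartbeat}"

# Split a clone instance name into primitive name and instance number
primitive=${OCF_RESOURCE_INSTANCE%%:*}
case "$OCF_RESOURCE_INSTANCE" in
    *:*) instance=${OCF_RESOURCE_INSTANCE##*:} ;;
    *)   instance="" ;;
esac

echo "functions dir: $OCF_FUNCTIONS_DIR"
echo "primitive: $primitive, instance: $instance"
```

Because the default is only applied when the variable is unset or empty, running the agent as OCF_FUNCTIONS_DIR=/some/test/dir ./foobar monitor would pick up the overriding value.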



Chapter 7. Convenience functions

7.1. Logging: ocf_log

Resource agents should use the ocf_log function for logging purposes. This convenient logging wrapper is invoked as follows:

    ocf_log <severity> "Log message"

It supports the following severity levels:

debug: for debugging messages. Most logging configurations suppress this level by default.

info: for informational messages about the agent's behavior or status.

warn: for warnings. This is for any messages which reflect unexpected behavior that does not constitute an unrecoverable error.

err: for errors. As a general rule, this logging level should only be used immediately prior to an exit with the appropriate error code.

crit: for critical errors. As with err, this logging level should not be used unless the resource agent also exits with an error code. Very rarely used.

7.2. Testing for binaries: have_binary and check_binary

A resource agent may need to test for the availability of a specific executable. The have_binary convenience function comes in handy here:

    if ! have_binary frobnicate; then
        ocf_log warn "Missing frobnicate binary, frobnication disabled!"
    fi

If a missing binary is a fatal problem for the resource, then the check_binary function should be used:

    check_binary frobnicate

Using check_binary is a shorthand method for testing for the existence (and executability) of the specified binary, and exiting with $OCF_ERR_INSTALLED if it cannot be found or executed.

Note

Both have_binary and check_binary honor $PATH when the binary to test for is not specified as a full path. It is usually wise not to test for a full path, as binary installation paths may vary by distribution or user policy.
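The difference between the two functions can be sketched with approximate stand-ins. The versions below are not the library code; they look the name up via $PATH with the POSIX "command -v" builtin, and the sketch_ prefix marks them as hypothetical.

```shell
#!/bin/sh
OCF_ERR_INSTALLED=5
ocf_log() { echo "$*" >&2; }   # stand-in for the library's ocf_log

# Approximate stand-ins, not the library code: look the name up
# via $PATH with "command -v".
sketch_have_binary() {
    command -v "$1" >/dev/null 2>&1
}
sketch_check_binary() {
    if ! sketch_have_binary "$1"; then
        ocf_log err "Setup problem: couldn't find command: $1"
        exit $OCF_ERR_INSTALLED
    fi
}

# "sh" should be found on practically any system
sketch_have_binary sh && echo "sh is available"

# Run the fatal check in a subshell so that this script survives
( sketch_check_binary no-such-binary-frobnicate )
echo "check_binary exit code: $?"
```

The non-fatal variant only reports a status, so the caller decides how to proceed; the fatal variant ends the agent with $OCF_ERR_INSTALLED on the spot.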

7.3. Executing commands and capturing their output: ocf_run

Whenever a resource agent needs to execute a command and capture its output, it should use the ocf_run convenience function, invoked as in this example:

ocf_run "frobnicate --spam=eggs" || exit $OCF_ERR_GENERIC



With the command specified above, the resource agent will invoke frobnicate --spam=eggs and capture its output and exit code. If the exit code is nonzero (indicating an error), ocf_run logs the command output with the err logging severity, and the resource agent subsequently exits.

If the resource agent wishes to capture the output of both a successful and a failed command execution, it can use the -v flag with ocf_run. In the example below, ocf_run will log any output from the command with the info severity if the command exit code is zero (indicating success), and with err if it is nonzero.

    ocf_run -v "frobnicate --spam=eggs" || exit $OCF_ERR_GENERIC

Finally, if the resource agent wants to log the output of a command with a nonzero exit code with a severity other than error, it may do so by adding the -info or -warn option to ocf_run:

ocf_run -warn "frobnicate --spam=eggs"

7.4. Locks: ocf_take_lock and ocf_release_lock_on_exit

Occasionally, there may be different resources of the same type in a cluster configuration that should not execute actions in parallel. When a resource agent needs to guard against parallel execution on the same machine, it can use the ocf_take_lock and ocf_release_lock_on_exit convenience functions:

    LOCKFILE=${HA_RSCTMP}/foobar
    ocf_release_lock_on_exit $LOCKFILE

    foobar_start() {
        ...
        ocf_take_lock $LOCKFILE
        ...
    }

ocf_take_lock attempts to acquire the designated $LOCKFILE. When it is unavailable, it sleeps a random amount of time between 0 and 1 seconds, and retries. ocf_release_lock_on_exit releases the lock file when the agent exits (for any reason).
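To illustrate the take-and-release pattern without the library, the following sketch implements a comparable lock with plain shell features. It is not how ocf_take_lock itself works: it creates the lock file atomically using the shell's noclobber option, sleeps a fixed second between attempts instead of a random interval, and adds a made-up retry limit so the example cannot hang. All sketch_ names are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch, not the library implementation. It creates the
# lock file atomically with the shell's noclobber option (set -C).
sketch_take_lock() {
    lockfile=$1
    tries=${2:-10}    # made-up retry limit so the sketch cannot hang
    while [ "$tries" -gt 0 ]; do
        if ( set -C; echo $$ > "$lockfile" ) 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Remove the lock file whenever the script exits, for any reason
sketch_release_lock_on_exit() {
    trap "rm -f '$1'" EXIT
}

LOCKFILE=${TMPDIR:-/tmp}/foobar-sketch.$$.lock
sketch_release_lock_on_exit "$LOCKFILE"
sketch_take_lock "$LOCKFILE" && echo "lock acquired"
# A second attempt fails, because the lock is already held
sketch_take_lock "$LOCKFILE" 1 || echo "lock is busy"
```

Registering the release handler before taking the lock mirrors the order in the example above, so the lock file is cleaned up even if the agent exits between the two steps.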

7.5. Testing for numerical values: ocf_is_decimal

Specifically for parameter validation, it can be helpful to test whether a given value is numeric. The ocf_is_decimal function exists for that purpose:

foobar_validate_all() {

if ! ocf_is_decimal $OCF_RESKEY_eggs; then

ocf_log err "eggs is not numeric!"

exit $OCF_ERR_CONFIGURED

fi

...

}
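What counts as "numeric" is worth spelling out. The stand-in below is a sketch that assumes a value is decimal only if it is non-empty and consists solely of the digits 0-9, so signs and decimal points are rejected. The name sketch_is_decimal is hypothetical.

```shell
#!/bin/sh
# Stand-in sketch: succeeds only for non-empty strings consisting
# solely of the digits 0-9.
sketch_is_decimal() {
    case "$1" in
        ""|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}

for value in 42 007 "" 4.2 -1 eggs; do
    if sketch_is_decimal "$value"; then
        echo "'$value' is decimal"
    else
        echo "'$value' is not decimal"
    fi
done
```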

7.6. Testing for boolean values: ocf_is_true

When a resource agent defines a boolean parameter, the value for this parameter may be specified by the user as 0/1, true/false, or on/off. Since it is tedious to test for all these values from within the resource agent, the agent should instead use the ocf_is_true convenience function:



if ocf_is_true $OCF_RESKEY_superfrobnicate; then

ocf_run "frobnicate --super"

fi

Note

If ocf_is_true is used against an empty or non-existent variable, it always returns an exit code of 1, which is equivalent to false.
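The behavior for unset parameters can be seen in a small stand-in. The sketch below covers only the "true" spellings named in this section (1, true, on); the library's own function may accept further spellings, and the name sketch_is_true is hypothetical.

```shell
#!/bin/sh
# Stand-in sketch covering the values named in this section; the
# library's own function may accept further spellings.
sketch_is_true() {
    case "$1" in
        1|true|on) return 0 ;;
        *)         return 1 ;;   # includes empty and unset values
    esac
}

unset OCF_RESKEY_superfrobnicate
if sketch_is_true "$OCF_RESKEY_superfrobnicate"; then
    echo "superfrobnicate enabled"
else
    echo "superfrobnicate disabled"
fi
```

Since anything unrecognized falls through to the default branch, a boolean parameter that the user never set is treated as false.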

7.7. Pseudo resources: ha_pseudo_resource

"Pseudo resources" are those where the resource agent in fact does not actually start or stop something akin to a runnable process, but merely executes a single action and then needs some form of tracing whether that action has been executed or not. The portblock resource agent is an example of this.

Resource agents for pseudo resources can use a convenience function, ha_pseudo_resource, which makes use of tracking files to keep tabs on the status of a resource. If foobar was designed to manage a pseudo resource, then its start action could look like this:

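    foobar_start() {
        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        # if resource is already running, bail out early
        if foobar_monitor; then
            ocf_log info "Resource is already running"
            return $OCF_SUCCESS
        fi

        # start the pseudo resource
        ha_pseudo_resource ${OCF_RESOURCE_INSTANCE} start

        # After the resource has been started, check whether it started up
        # correctly. If the resource starts asynchronously, the agent may
        # spin on the monitor function here -- if the resource does not
        # start up within the defined timeout, the cluster manager will
        # consider the start action failed
        while ! foobar_monitor; do
            ocf_log debug "Resource has not started yet, waiting"
            sleep 1
        done

        # only return $OCF_SUCCESS if _everything_ succeeded as expected
        return $OCF_SUCCESS
    }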
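The tracking file idea itself is simple enough to sketch. The hypothetical function below is not the library's ha_pseudo_resource; it only illustrates the principle that "started" means a marker file exists. A temporary directory stands in for $HA_RSCTMP, and the return code values are hard-coded so the sketch runs on its own.

```shell
#!/bin/sh
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
OCF_ERR_UNIMPLEMENTED=3

# Hypothetical sketch of the tracking-file idea; it is not the
# library's ha_pseudo_resource. A temporary directory stands in
# for $HA_RSCTMP.
HA_RSCTMP=$(mktemp -d)

sketch_pseudo_resource() {
    tracking_file="$HA_RSCTMP/$1"
    case "$2" in
        start)   touch "$tracking_file" ;;
        stop)    rm -f "$tracking_file" ;;
        monitor) [ -f "$tracking_file" ] || return $OCF_NOT_RUNNING ;;
        *)       return $OCF_ERR_UNIMPLEMENTED ;;
    esac
}

sketch_pseudo_resource p_foobar monitor; echo "before start: $?"
sketch_pseudo_resource p_foobar start
sketch_pseudo_resource p_foobar monitor; echo "after start: $?"
sketch_pseudo_resource p_foobar stop
sketch_pseudo_resource p_foobar monitor; echo "after stop: $?"
rm -rf "$HA_RSCTMP"
```

Keeping the marker in a directory that is emptied at boot means a rebooted node never reports a pseudo resource as running by mistake.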



Chapter 8. Special considerations

8.1. Licensing

Whenever possible, resource agent contributors are encouraged to use the GNU General Public License (GPL), version 2 and later, for any new resource agents. The shell functions library does not strictly mandate this, however, as it is licensed under the GNU Lesser General Public License (LGPL), version 2.1 and later (so it can be used by non-GPL agents).

The resource agent must explicitly state its own license in the agent source code.

8.2. Locale settings

When sourcing .ocf-shellfuncs as explained in Section 4.3, Initialization [8], any resource agent automatically sets LANG and LC_ALL to the C locale. Resource agents can thus expect to always operate in the C locale, and need not reset LANG or any of the LC_* environment variables themselves.

8.3. Testing for running processes

For testing whether a particular process (with a known process ID) is currently running, a frequently found method is to send it a 0 signal and catch errors, similar to this example:

    if kill -s 0 `cat $daemon_pid_file`; then
        ocf_log debug "Process is currently running"
    else
        ocf_log warn "Process is dead, removing pid file"
        rm -f $daemon_pid_file
    fi

This method has a significant drawback: kill -s 0 does return successfully for zombie processes. Zombies, also known as defunct processes, are processes that no longer run but still hold an entry in the process table. Thus, they must be considered failed resources for all intents and purposes, and for them the kill -s 0 approach yields a misleading, successful result.

The kill -s 0 approach can employ an additional safeguard (which, however, will work on Linux only):

    pid=`cat $daemon_pid_file`
    if kill -s 0 $pid; then
        # Process exists in process table, check its status
        if grep -q -E "State:[[:space:]]+Z \(zombie\)" /proc/$pid/status; then
            ocf_log err "Process is defunct"
            # Bail out and let the cluster manager recover
            exit $OCF_ERR_GENERIC
        else
            ocf_log debug "Process is currently running"
        fi
    else
        ocf_log warn "Process is dead, removing pid file"
        rm -f $daemon_pid_file
    fi
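Both examples assume that the pid file exists and contains a valid process ID. A slightly more defensive variant is sketched below; the helper name sketch_pid_is_alive is hypothetical, and the check remains subject to the zombie caveat described in this section.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the pid file exists, holds a
# numeric process ID, and that process accepts signal 0.
# (Zombies still pass this test.)
sketch_pid_is_alive() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    case "$pid" in
        ""|*[!0-9]*) return 1 ;;   # empty or not a number
    esac
    kill -s 0 "$pid" 2>/dev/null
}

# Demonstrate with this shell's own process ID
daemon_pid_file=$(mktemp)
echo $$ > "$daemon_pid_file"
sketch_pid_is_alive "$daemon_pid_file" && echo "process is running"

# A pid file with garbage content is rejected
echo "not-a-pid" > "$daemon_pid_file"
sketch_pid_is_alive "$daemon_pid_file" || echo "invalid pid file"
rm -f "$daemon_pid_file"
```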



Important

An approach far superior to both these examples is to instead test the functionality of the daemon by connecting to it with a client process, as shown in the example in Section 5.3, monitor action [12].

8.4. Specifying a master preference

Stateful (master/slave) resources must set their own master preference; they can thus provide hints to the cluster manager which is the best instance to promote to the Master role.

Important

It is acceptable for multiple instances to have identical positive master preferences. In that case, the cluster resource manager will automatically select a resource agent to promote. However, if all instances have the (default) master score of zero, the cluster manager will not promote any instance at all. Thus, it is crucial that at least one instance has a positive master score.

For this purpose, crm_master comes in handy. This convenience wrapper around crm_attribute sets a node attribute named master-$OCF_RESOURCE_INSTANCE [20] for the node it is being executed on, and fills this attribute with the specified value. The cluster manager is then expected to translate this into a promotion score for the corresponding instance, and base its promotion preference on that score.

Stateful resource agents typically execute crm_master during the monitor [12] and/or notify [18] action.

The following example assumes that the foobar resource agent can test the application's status by executing a binary that returns certain exit codes based on whether

the resource is either in the master role, or is a slave that is fully caught up with the master (at any rate, it has current data), or

the resource is in the slave role, but through some form of asynchronous replication has "fallen behind" the master, or

the resource has gracefully stopped, or

the resource has unexpectedly failed.

    foobar_monitor() {
        local rc

        # exit immediately if configuration is not valid
        foobar_validate_all || exit $?

        ocf_run frobnicate --test

        # This example assumes the following exit code convention
        # for frobnicate:
        # 0: running, and fully caught up with master
        # 1: gracefully stopped
        # 2: running, but lagging behind master
        # any other: error
        case "$?" in
            0)
                rc=$OCF_SUCCESS
                ocf_log debug "Resource is running"
                # Set a high master preference. The current master
                # will always get this, plus 1. Any current slaves
                # will get a high preference so that if the master
                # fails, they are next in line to take over.
                crm_master -l reboot -v 100
                ;;
            1)
                rc=$OCF_NOT_RUNNING
                ocf_log debug "Resource is not running"
                # Remove the master preference for this node
                crm_master -l reboot -D
                ;;
            2)
                rc=$OCF_SUCCESS
                ocf_log debug "Resource is lagging behind master"
                # Set a low master preference: if the master fails
                # right now, and there is another slave that does
                # not lag behind the master, its higher master
                # preference will win and that slave will become
                # the new master
                crm_master -l reboot -v 5
                ;;
            *)
                ocf_log err "Resource has failed"
                exit $OCF_ERR_GENERIC
        esac

        return $rc
    }
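The mapping from application status to master preference is the heart of this example. The sketch below isolates it in a hypothetical helper, foobar_master_score, and merely prints the crm_master invocation that would follow, so it can be tried without a cluster. The scores 100 and 5 are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: translate the exit code of "frobnicate --test"
# (same convention as in the monitor example) into a master
# preference. Prints nothing when the preference should be removed.
foobar_master_score() {
    case "$1" in
        0) echo 100 ;;   # fully caught up: high preference
        2) echo 5 ;;     # lagging behind: low preference
        *) ;;            # stopped or failed: no preference
    esac
}

for code in 0 1 2 3; do
    score=$(foobar_master_score "$code")
    if [ -n "$score" ]; then
        echo "exit code $code: crm_master -l reboot -v $score"
    else
        echo "exit code $code: crm_master -l reboot -D"
    fi
done
```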



Chapter 9. Testing, installing, and packaging resource agents

This section discusses what to do with your resource agent once it is done: how to test it, where to install it, and how to include it in either your own application package or in the Linux-HA resource agents repository.

9.1. Testing resource agents

The resource agents repository (and hence, any installed resource agents package) contains a utility named ocf-tester. This shell script allows you to conveniently and easily test the functionality of your resource agent.

ocf-tester is commonly invoked, as root, like this:

    ocf-tester -n <name> [-o <param>=<value> ... ] <resource agent>

<name> is an arbitrary resource name.

You may set any number of <param>=<value> pairs with the -o option, corresponding to any resource parameters you wish to set for testing.

<resource agent> is the full path to your resource agent.

When invoked, ocf-tester executes all mandatory actions and enforces action behavior as explained in Chapter 5, Resource agent actions [10].

It also tests for optional actions. Optional actions must behave as expected when advertised, but do not cause ocf-tester to flag an error if not implemented.

Important

ocf-tester does not initiate "dry runs" of actions, nor does it create resource dummies of any kind. Instead, it exercises the actual resource agent as-is, whether that may include opening and closing databases, mounting file systems, starting or stopping virtual machines, etc. Use with care.

For example, you could run ocf-tester on the foobar resource agent as follows:

# ocf-tester -n foobartest \

-o superfrobnicate=true \

-o datadir=/tmp \

/home/johndoe/ra-dev/foobar

Beginning tests for /home/johndoe/ra-dev/foobar...

* Your agent does not support the notify action (optional)

* Your agent does not support the reload action (optional)

/home/johndoe/ra-dev/foobar passed all tests

9.2. Installing resource agents

If you choose to include your resource agent in your own project, make sure it installs into the correct location. Resource agents should install into the /usr/lib/ocf/resource.d/<provider> directory, where <provider> is the name of your project or any other name you wish to identify the resource agent with.

For example, if your foobar resource agent is being packaged as part of a project named fortytwo, then the correct full path to your resource agent would be /usr/lib/ocf/resource.d/fortytwo/foobar. Make sure your resource agent installs with 0755 (-rwxr-xr-x) permission bits.

When installed this way, OCF-compliant cluster resource managers will be able to properly identify, parse, and execute your resource agent. The Pacemaker cluster manager, for example, would map the above-mentioned installation path to the ocf:fortytwo:foobar resource type identifier.
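The relationship between installation path and resource type identifier can be expressed in a few lines of shell. The helper below, ocf_type_from_path, is a hypothetical illustration of that mapping and not a tool shipped with any cluster manager.

```shell
#!/bin/sh
# Hypothetical helper: derive the ocf:<provider>:<agent> identifier
# from an installation path below resource.d.
ocf_type_from_path() {
    agent=${1##*/}            # base name of the script
    provider_dir=${1%/*}      # path without the script name
    provider=${provider_dir##*/}
    echo "ocf:$provider:$agent"
}

ocf_type_from_path /usr/lib/ocf/resource.d/fortytwo/foobar
```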

9.3. Packaging resource agents

When you package resource agents as part of your own project, you should apply the considerations outlined in this section.

Note

If you instead prefer to submit your resource agent to the Linux-HA resource agents repository, see Section 9.4, Submitting resource agents [29] for information on doing so.

9.3.1. RPM packaging

It is recommended to put your OCF resource agent(s) in an RPM sub-package, with the name <toplevel-package>-resource-agents. Ensure that the package owns its provider directory, and depends on the upstream resource-agents package which lays out the directory hierarchy and provides convenience shell functions. An example RPM spec snippet is given below:

    %package resource-agents
    Summary: OCF resource agent for Foobar
    Group: System Environment/Base
    Requires: %{name} = %{version}-%{release}, resource-agents

    %description resource-agents
    This package contains the OCF-compliant resource agents for Foobar.

    %files resource-agents
    %defattr(755,root,root,-)
    %dir %{_prefix}/lib/ocf/resource.d/fortytwo
    %{_prefix}/lib/ocf/resource.d/fortytwo/foobar

Note

If an RPM spec file contains a %package declaration, then RPM considers this a sub-package which inherits top-level fields such as Name, Version, License, etc. Sub-packages have the top-level package name automatically prepended to their own name. Thus the snippet above would create a sub-package named foobar-resource-agents (presuming the package Name is foobar).

9.3.2. Debian packaging

For Debian packages, like for RPMs [28], it is recommended to create a separate package holding your resource agents, which then should depend on the cluster-agents package.

Note

This section assumes that you are packaging with debhelper.



An example debian/control snippet is given below:

Package: foobar-cluster-agents

Priority: extra

Architecture: all

Depends: cluster-agents

Description: OCF-compliant resource agents for Foobar

You will also create a separate .install file. Sticking with the example of installing the foobar resource agent as a sub-package of fortytwo, the debian/fortytwo-cluster-agents.install file could consist of the following content:

usr/lib/ocf/resource.d/fortytwo/foobar

9.4. Submitting resource agents

If you choose not to bundle your resource agent with your own package, but instead wish to submit it to the upstream resource agent repository hosted on the Linux-HA Mercurial server [http://hg.linux-ha.org/agents], please follow the steps outlined in this section.

Create a working copy (a Mercurial clone) of the upstream repository with the following command:

hg clone http://hg.linux-ha.org/agents resource-agents

Create a new Mercurial queue, and a new patchset:

cd resource-agents

hg qinit

hg qnew --edit foobar-ra

In your patch message, be sure to include a meaningful description, for example:

High: foobar: new resource agent

This new resource agent adds functionality to manage a foobar service.

It supports being configured as a primitive or as a master/slave set,

and also optionally supports superfrobnication.

Then, copy your resource agent into the heartbeat subdirectory:

cd heartbeat

cp /path/to/your/local/copy/of/foobar .

chmod 0755 foobar

hg add foobar

cd ..

Next, modify the Makefile.am file in resource-agents/heartbeat and add your new resource agent to the ocf_SCRIPTS list. This will make sure the agent is properly installed.

Lastly, open Makefile.am in resource-agents/doc and add ocf_heartbeat_<name>.7 to the man_MANS variable. This will automatically generate a resource agent manual page from its metadata, and then install that man page into the correct location.

Once all that is done, you can update your patch set:

hg qrefresh

Now the patch set is good for review on the mailing list:

hg email [email protected] foobar-ra


Once your new resource agent has been accepted for merging, one of the upstream developers will push your patch into the upstream repository. At that point, you can update your checkout from upstream, and remove your own patch set.

hg qpop -a

hg pull --update

hg qdelete foobar-ra
