SUSE Linux Enterprise High Availability Extension 11: Support and Troubleshooting
The SUSE Linux Enterprise High Availability Extension provides a solid and well-integrated foundation for deploying highly available services, including Web farms, databases, application servers or virtualization environments. Such distributed setups present unique challenges not encountered on a single node. This tutorial aims to provide a guide to the most common issues during configuration and design: fencing, networking, dependencies, and other aspects.
Supporting SUSE® Linux Enterprise High Availability Extension 11: Support and Troubleshooting
Lars Marowsky-Brée
Architect Storage and [email protected]
© Novell, Inc. All rights reserved.
Agenda
Introduction
Summary of Cluster Architecture
Common Configuration Issues
Gathering Cluster-wide Support Information
Exploring Effects of Cluster Events
Self-written Resource Agents
Understanding Log Files
Introduction
SUSE® Linux Enterprise Family
• SUSE Linux Enterprise Server
• SUSE Linux Enterprise Desktop
• SUSE Linux Enterprise Point of Service
• Extensions
– SUSE Linux Enterprise Real Time
– SUSE Linux Enterprise High Availability Extension
– SUSE Linux Enterprise Mono Extension
Data Center Challenges
Minimize unplanned downtime
Ensure quality of service
Contain costs
Utilize resources
Effectively manage multiple vendors
Minimize risk
SUSE® Linux Enterprise High Availability Extension: Value Proposition
• An integrated suite of robust open source clustering technologies that implement highly available physical and virtual services on Linux.
• Used with SUSE Linux Enterprise Server, it helps to maintain business continuity, protect data, and reduce unplanned downtime for all mission-critical Linux workloads.
• Used with virtualization, it adds workload-based availability and reliability.
SUSE® Linux Enterprise High Availability Extension: Benefits
Meet service-level agreements
Continuous access to systems and data
Maintain data integrity
Scale-out infrastructure
SUSE® Linux Enterprise High Availability Extension: Key Features
• Service Availability 24/7
– Policy driven clustering
> OpenAIS messaging and membership layer
> Pacemaker cluster resource manager
• Sharing and Scaling Data-access by Multiple Nodes
– Cluster file system
> OCFS2
> Clustered logical volume manager
• Disaster Tolerance
– Data replication via IP
> Distributed replicated block device
• Scale Network Services
– IP load-balancing
• User-friendly Tools
– Graphical user interface
– Unified command line interface
SUSE® Linux Enterprise High Availability Extension: HA Stack from 10 to 11
[Diagram: components of the HA stack by release]
• SLES 10 (part of SLES 10): OCFS2 / EVMS2, DRBD 0.7, Yast2-HB, Heartbeat
• SLE HA 11 (added in SLE HA 11): openAIS, Yast2-Multipath, Pacemaker, OCFS2 general FS, HA GUI, Unified CLI, Yast2-DRBD
• SLE HA 11 SP1 (added in SLE HA 11 SP1): Enhanced Data Replication, Web GUI, Samba Cluster, Metro-Area Cluster, Cluster Config Synchronization, Storage Quorum Coverage, Node Recovery
SUSE® Linux Enterprise High Availability Extension: Key Features in Service Pack 1
• Web GUI – Cross platform management
• Storage Based Quorum Coverage – Storage device as a quorum instance
• Integrated Samba Clustering – Integration of Samba with OCFS2 for higher throughput and scale out
• Metro-Area Clusters – Clustering between different data center locations
• Cluster-concurrent RAID1 – Improved resilience
• Enhanced Data Replication – DRBD with Linbit cooperation
• Node Recovery – ReaR to recover server nodes
• GFS2 Migration Support – Read-only access to GFS2 for migration
SUSE® Linux Enterprise High Availability Extension: Pricing
Pricing
– x86 and x86_64
> USD 699 per year per server
> Support level inherited from base SUSE Linux Enterprise Server
– Power, Itanium, System z
> Bundled with SUSE Linux Enterprise Server
> Support level inherited from base SUSE Linux Enterprise Server
SUSE® Linux Enterprise High Availability Extension: Promotion
Existing Customers
– Free of charge subscription
> For all valid SUSE Linux Enterprise Server subscriptions
> Effective date: June 1st 2009
> Valid for subsequent subscription periods if base SUSE Linux Enterprise Server is renewed on time
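SUSE® Linux Enterprise High Availability Extension: Competitive Landscape
Feature | HP HP-SG | IBM HACMP | Veritas VCS | MSFT Cluster | Steeleye Lifekeeper | RHAT Ad. Plat. | Novell SLES10 | Novell SLE HA 11 | Novell SLE HA 11 SP1
Upper Node Limit | ↔ | ↓ | ↔ | ↔ | ↔ | ↑ | ↔ | ↔ | ↑
Network Load-Balancing | ↓ | ↓ | ↓ | ↓ | ↓ | ↔ | ↔ | ↔ | ↑
System Recovery | ↓ | ↓ | ↓ | ↓ | ↓ | ↓ | ↓ | ↓ | ↑
Disk Mirroring | ↔ | ↓ | ↓ | ↓ | ↓ | ↑ | ↔ | ↑ | ↑
Platform Support | ↔ | ↓ | ↑ | ↓ | ↑ | ↔ | ↔ | ↔ | ↔
HW Support | ↔ | ↓ | ↑ | ↑ | ↑ | ↑ | ↑ | ↑ | ↑
Storage Support | ↔ | ↓ | ↔ | ↔ | ↔ | ↑ | ↑ | ↑ | ↑
ISV Support | ↔ | ↓ | ↑ | ↑ | ↔ | ↔ | ↔ | ↔ | ↔
Setup, Installation and Configuration | ↔ | ↔ | ↑ | ↑ | ↔ | ↔ | ↓ | ↔ | ↑
GUI | ↔ | ↔ | ↑ | ↑ | ↑ | ↔ | ↓ | ↔ | ↑
Command line | ↑ | ↔ | ↑ | ↓ | ↔ | ↔ | ↔ | ↑ | ↑
Monitoring | ↔ | ↔ | ↑ | ↑ | ↑ | ↔ | ↔ | ↑ | ↑
Documentation | ↑ | ↑ | ↑ | ↑ | ↔ | ↓ | ↓ | ↔ | ↔
Legend (highlighted cells in the original slide): Area with enhancements in SP1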
SUSE® Linux Enterprise High Availability Extension: Customer Examples
DFS Deutsche Flugsicherung - government-owned German Air Traffic Control
Ensures the availability of critical air traffic control services by implementing a fail-over solution using clusters of SUSE Linux Enterprise Servers.
Getronics - the largest provider of IT services in the Netherlands
Implemented a cost-effective high availability solution for a web-based customer information system supporting two million customers using SUSE Linux Enterprise Server, SAP, Oracle Real Application Clusters, and IBM System x3850 hardware. When the solution detects a failure in one node, it seamlessly recovers all running processes on the remaining node in its cluster.
La Curacao – one of the top 100 electronics and appliance retailers in the U.S., focusing on the Hispanic market
Implemented SUSE Linux Enterprise Server in a clustered environment on HP ProLiant servers to run its mission-critical databases and keep La Curacao's stores running without interruption.
Unitop - one of the largest producers of anionic surfactant chemicals in India
Implemented a certified high availability SAP ERP solution, using SUSE Linux Enterprise Server, IBM System x hardware, IBM DB2 information management software, and SAP, for all its business activities and information.
Cluster Architecture
3 Node Cluster Overview
[Diagram: three nodes, each with its own Kernel, spanned by Corosync + openAIS, Pacemaker, DLM, and cLVM2+OCFS2. Resources shown: Xen VM1, Xen VM2, and LAMP (Apache, IP, ext3). Also shown: Network Links, Clients, Storage.]
Detailed View of Components Per Node
[Diagram: component stack on each node. Labels shown:]
• Management: Web GUI, Python GUI, CRM Shell
• YaST2: cDRBD, cOpenAIS, MPIO
• Pacemaker: CIB, Policy Engine, STONITH, LRM
• OpenAIS
• Resource Agents and LSB init: SAP, MySQL, libvirt, Xen, Apache, iSCSI, Filesystems, IP address, DRBD, clvmd, Ocfs2_controld, dlm_controld, LVS, ...
• Fencing: DRAC, iLO, SBD
• Linux Kernel: ext3, XFS, OCFS2, cLVM2, DLM, DRBD, Multipath IO, Bonding
• Storage: Local Disks, SAN FC(oE), iSCSI
• Network: SCTP, TCP, UDP multicast, Ethernet, Infiniband
Why Is This Talk Necessary?
We heard comments:
• Can't you just make the software stack easy to understand?
• Why is a multi-node setup more complicated than a single node?
• Gosh, this is awfully complicated! Why is this stuff so powerful? I don't need those other features!
This session addresses most of these questions
Design and Architecture Considerations
General Considerations
• Consider the support level requirements of your mission-critical systems.
• Your staff is your key asset!
– Invest in training, processes, knowledge sharing.
– A good administrator will provide higher availability than a mediocre cluster setup.
• Get expert help for the initial setup, and
• Write concise operation manuals that make sense at 3am on a Saturday ;-)
• Thoroughly test the cluster regularly.
– Use a staging system before deploying large changes!
Manage Expectations Properly
• Clustering improves reliability, but does not achieve 100%, ever.
• Clusters are more complex than single nodes.
• Fail-over clusters reduce service outage, but do not eliminate it.
• Clustering broken applications will not fix them.
• Clusters do not replace backups, RAID, or good hardware.
Complexity Versus Reliability
• Every component has a failure probability.
– Good complexity: Redundant components.
– Undesirable complexity: chained components.
– Choke point → single point of failure
– Also consider: Administrative complexity.
• Use as few components (features) as feasible.
– Our extensive feature list is not a mandatory checklist for your deployment ;-)
• What is your fall-back in case the cluster breaks?
– Backups, non-clustered operation
– Architect your system so that this is feasible!
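The difference between chained and redundant components can be made concrete with a little arithmetic. The sketch below is not from the talk: it assumes component failures are independent, and the availability figures and function names are made up for illustration.

```python
from math import prod


def chained_availability(availabilities):
    """Every component must work, so the availabilities multiply."""
    return prod(availabilities)


def redundant_availability(availabilities):
    """One working component is enough, so the unavailabilities multiply."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    # Hypothetical figure: three components, each available 99% of the time.
    parts = [0.99, 0.99, 0.99]
    print(f"chained:   {chained_availability(parts):.6f}")
    print(f"redundant: {redundant_availability(parts):.6f}")
```

Chaining three such components lowers the overall figure below that of any single one, while running them redundantly raises it. Real failures are rarely fully independent, so treat the redundant figure as an upper bound.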
Cluster Size Considerations
• More nodes:
– Increased absolute redundancy and capacity.
– Decreased relative redundancy.
– One cluster → one failure domain.
• Does your work-load scale well to larger node counts?
• Choose odd node counts.
– 4 and 3 node clusters both lose majority after 2 nodes.
• Question:
– 5 cheaper servers, or
– 3 higher quality servers with more capacity each?
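The point about odd node counts follows from simple majority arithmetic. The sketch below assumes one vote per node and a strict majority for quorum; the function names are illustrative and not part of any cluster tool.

```python
def quorum_threshold(total_nodes):
    """Smallest number of nodes that forms a strict majority."""
    return total_nodes // 2 + 1


def tolerated_failures(total_nodes):
    """Nodes that may fail while the survivors still hold a majority."""
    return total_nodes - quorum_threshold(total_nodes)


if __name__ == "__main__":
    for n in range(2, 8):
        print(f"{n} nodes: majority needs {quorum_threshold(n)}, "
              f"tolerates {tolerated_failures(n)} failed")
```

Under these assumptions a fourth node adds capacity but no additional failure tolerance over three nodes, and a two-node cluster tolerates no failure at all without further configuration.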
Common Setup Issues
General Software Stack
• Please avoid chasing already solved problems!
• Please apply all available software updates:
– SUSE® Linux Enterprise Server 11
– SUSE Linux Enterprise High Availability Extension
• Consider migrating to SUSE Linux Enterprise High Availability Extension, if you have not already.
– Usability, ease of setup, integration are all much improved.
– SUSE Linux Enterprise Server 10 remains fully supported.
From One to Many Nodes
• Error: Configuration files not identical across nodes.
– /etc/drbd.conf, /etc/corosync/corosync.conf, /etc/ais/openais.conf, resource-specific configurations ...
• Symptoms: Weird misbehavior, works on one but not on other systems, interoperability issues, and possibly others.
• Solution: Make sure they are synchronized.
– SUSE® Linux Enterprise High Availability Extension 11 SP1 provides “csync2” to do this automatically for you.
> You can add your own files to this list as needed.
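To check whether configuration files have drifted apart, comparing checksums per node is usually enough. The sketch below is only an illustration, not a replacement for csync2: it assumes the file contents have already been collected from each node by some other means, and the node names and contents are hypothetical.

```python
import hashlib
from collections import defaultdict


def find_mismatches(configs):
    """configs maps node name -> {path: file content as bytes}.

    Returns {path: {digest: [nodes]}} for every path whose content
    differs between nodes or is missing on at least one node.
    """
    by_path = defaultdict(lambda: defaultdict(list))
    all_paths = {path for files in configs.values() for path in files}
    for node, files in configs.items():
        for path in all_paths:
            content = files.get(path)
            digest = ("missing" if content is None
                      else hashlib.sha256(content).hexdigest())
            by_path[path][digest].append(node)
    return {p: dict(d) for p, d in by_path.items() if len(d) > 1}


if __name__ == "__main__":
    # Hypothetical contents; in practice these are read from each node.
    collected = {
        "node1": {"/etc/drbd.conf": b"resource r0 { }\n"},
        "node2": {"/etc/drbd.conf": b"resource r0 { } # edited\n"},
    }
    for path, groups in find_mismatches(collected).items():
        print(path, "differs:", sorted(groups.values()))
```

An empty result means every collected file is byte-identical on all nodes.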
Networking
• Switches must support multicast properly.
• Bonding is preferable to using multiple rings:
– Reduces complexity
– Exposes redundancy to all cluster services and clients
• Firewall rules are not your friend.
• Keep firmware on switches up to date!
• Make NIC names identical on all nodes
Fencing (STONITH)
• Error: Not configuring STONITH at all
– It defaults to enabled, resource start-up will block and the cluster will simply do nothing. This is for your own protection.
• Warning: Disabling STONITH
– DLM/OCFS2 will block forever waiting for a fence that is never going to happen.
• Error: Using “external/ssh”, “ssh”, “null” in production
– These plug-ins are for testing. They will not work in production!
– Use a “real” fencing device or external/sbd
• Error: Configuring several power switches in parallel.
• Error: Trying to use external/sbd on DRBD
CIB Configuration Issues
• 2 node clusters cannot have majority with 1 node failed
– # crm configure property no-quorum-policy=ignore
• Resources are starting up in “random” order or on “wrong” nodes
– Add required constraints!
• Resources move around when something “unrelated” changes
– # crm configure property default-resource-stickiness=1000
• # crm_verify -L ; ptest -L -VVVV
– Will point out some basic issues
We'll get back to that ...
Configuring Cluster Resources
• Symptom: On start of one or more nodes, the cluster restarts resources!
• Cause: resources under cluster control are also started via the “init” sequence.
– The cluster “probes” all resources on start-up on a node, and when it finds resources active where they should not be – possibly even more than once in the cluster –, the recovery protocol is to stop them all (including all dependencies) and start them cleanly again.
• Solution: Remove them from the “init” sequence.
Setting Resource Time-outs
• Belief: “Shorter time-outs make the cluster respond faster.”
• Fact: Too short time-outs cause resource operations to “fail” erroneously, making the cluster unstable and unpredictable.
– A somewhat too long time-out will cause a fail-over delay;
– a slightly too short time-out will cause an unnecessary service outage.
• Consider that a loaded cluster node may be slower than during deployment testing.
– Check “crm_mon -t1” output for the actual run-times of resources.
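One way to act on the observed run-times is to flag operations whose time-out leaves little margin. The sketch below is an assumption-laden illustration: the factor of 2 is an arbitrary rule of thumb rather than a recommendation from the talk, and the operation names and numbers are hypothetical values transcribed by hand.

```python
def tight_timeouts(operations, headroom=2.0):
    """Return the names of operations with little safety margin.

    operations: iterable of (name, observed_ms, timeout_ms) tuples.
    headroom:   assumed factor by which the time-out should exceed the
                slowest observed run-time (hypothetical rule of thumb).
    """
    return [
        name
        for name, observed_ms, timeout_ms in operations
        if timeout_ms < observed_ms * headroom
    ]


if __name__ == "__main__":
    # Hypothetical measurements, all in milliseconds.
    measured = [
        ("dummy1_start_0", 150, 20000),
        ("ocfs2-1_monitor_20000", 12000, 20000),
    ]
    for name in tight_timeouts(measured):
        print("consider a longer time-out for", name)
```

Measurements taken on an idle test system understate production run-times, so it is safer to feed in figures gathered under load.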
OCFS2
• Using ocfs2-tools-o2cb (legacy mode)
– O2CB only works with Oracle RAC; full features of SUSE® Linux Enterprise High Availability Extension are only available in combination with Pacemaker
– # zypper rm ocfs2-tools-o2cb
– Forget about /etc/ocfs2/cluster.conf, /etc/init.d/ocfs2, /etc/init.d/o2cb and /etc/sysconfig/ocfs2
• Nodes crash on shutdown
– If you have active ocfs2 mounts, you need to umount before shutdown
– If openais is part of the boot sequence
> # insserv openais
• Consider: Do you really need OCFS2?
– Can your application really run concurrently?
Distributed Replicated Block Device
• Myth: has no shared state, thus no STONITH needed.
– Fact: DRBD still needs fencing!
• Active/Active:
– Does not magically make ext3 or applications concurrency-safe, still can only be mounted once
– With OCFS2, split-brain is still fatal, as data diverges!
• Active/Passive:
– Ensures only one side can modify data, added protection.
– Does not magically make applications crash-safe.
• Issue: Replication traffic during reads.
– “noatime” mount option.
Storage in General
• Activating non-battery backed caches for performance
– Causes data corruption.
• iSCSI over unreliable networks.
• Lack of multipath for storage.
• Believing that RAID replaces backups.
– RAID and replication immediately propagate logical errors!
• Please ensure that device names are identical on all nodes.
Exploring the Effect of Events
What Are Events?
• All state changes to the cluster are events
– They cause an update of the CIB
– Configuration changes by the administrator
– Nodes going up or going down
– Resource monitoring failures
• Response to events is configured using the CIB policies and computed by the Policy Engine
• This can be simulated using ptest
– Available comfortably through the “crm” shell
Simulating Node Failure
hex-0:~ # crm
crm(live)# cib new sandbox
INFO: sandbox shadow CIB created
crm(sandbox)# cib cibstatus node hex-0 unclean
crm(sandbox)# ptest
Simulating Resource Failure
crm(sandbox)# cib cibstatus load live
crm(sandbox)# cib cibstatus op
usage: op <operation> <resource> <exit_code> [<op_status>] [<node>]
crm(sandbox)# cib cibstatus op start dummy1 not_running done hex-0
crm(sandbox)# cib cibstatus op start dummy1 unknown timeout hex-0
crm(sandbox)# configure ptest
ptest[4918]: 2010/02/17_12:44:17 WARN: unpack_rsc_op: Processing failed op dummy1_start_0 on hex-0: unknown error (1)
Exploring Configuration Changes
crm(sandbox)# cib cibstatus load live
crm(sandbox)# configure primitive dummy42 ocf:heartbeat:Dummy
crm(sandbox)# ptest
Configuration Changes - Woah!
Exploring Configuration Changes
crm(sandbox)# configure rsc_defaults resource-stickiness=1000
crm(sandbox)# ptest
crm(sandbox)# configure order order-42 inf: dummy42 dummy1
crm(sandbox)# ptest
Configuration Changes – Almost ...
Configuration Changes - Done
Log Files and Their Meaning
hb_report Is the Silver Support Bullet
• Compiles
– Cluster-wide log files,
– Package state,
– DLM/OCFS2 state,
– System information,
– CIB history,
– Parsed core dump reports,
into a single tarball for all support needs.
• # hb_report -n "node1 node2 node3" -f 12:00 /tmp/hb_report_example1
Logging
• “The cluster generates too many log messages!”
– Alas, customers are even more upset when asked to reproduce a problem on their production system ;-)
– Incidentally, all command line invocations are logged.
• System-wide logs: /var/log/messages
• CIB history: /var/lib/pengine/*
– All cluster events are logged here and can be analyzed with hindsight (python GUI, ptest, and the crm shell).
Where Is the Real Cause?
The answer is always in the logs.
Even though the logs on the DC may print a reference to the error, the real cause may be on another node.
Most errors are caused by resource agent misconfiguration.
Correlating Messages to Their Cause
• Feb 17 13:06:57 hex-8 pengine: [7717]: WARN: unpack_rsc_op: Processing failed op ocfs2-1:2_monitor_20000 on hex-0: not running (7)
– This is not the failure, just the Policy Engine reporting on the CIB state! The real messages are on hex-0, grep for the operation key:
• Feb 17 13:06:57 hex-0 Filesystem[24825]: [24861]: INFO: /filer is unmounted (stopped)
• Feb 17 13:06:57 hex-0 crmd: [7334]: info: process_lrm_event: LRM operation ocfs2-1:2_monitor_20000 (call=37, rc=7, cib-update=55, confirmed=false) not running
– Look for the error messages from the resource agent before the lrmd/pengine lines!
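The search described above can be sketched in a few lines of Python. This is an illustration only: it assumes the syslog layout of the three sample lines from this slide (host name in the fourth field), and the function names and the context size are hypothetical.

```python
import re

# Sample lines taken from the slide above.
SAMPLE_LOG = [
    "Feb 17 13:06:57 hex-8 pengine: [7717]: WARN: unpack_rsc_op: Processing "
    "failed op ocfs2-1:2_monitor_20000 on hex-0: not running (7)",
    "Feb 17 13:06:57 hex-0 Filesystem[24825]: [24861]: INFO: /filer is "
    "unmounted (stopped)",
    "Feb 17 13:06:57 hex-0 crmd: [7334]: info: process_lrm_event: LRM "
    "operation ocfs2-1:2_monitor_20000 (call=37, rc=7, cib-update=55, "
    "confirmed=false) not running",
]

FAILED_OP = re.compile(r"pengine: .*Processing failed op (\S+) on ([^\s:]+):")


def failed_ops(lines):
    """Yield (operation_key, node) for each Policy Engine failure report."""
    for line in lines:
        match = FAILED_OP.search(line)
        if match:
            yield match.group(1), match.group(2)


def context_before(lines, op_key, node, context=5):
    """Return what `node` logged up to and including the operation key."""
    # Assumed syslog layout: "Mon DD HH:MM:SS host ...", host is field 4.
    node_lines = [l for l in lines
                  if len(l.split()) > 3 and l.split()[3] == node]
    for index, line in enumerate(node_lines):
        if op_key in line:
            return node_lines[max(0, index - context):index + 1]
    return []


if __name__ == "__main__":
    for op_key, node in failed_ops(SAMPLE_LOG):
        print(f"failed {op_key} on {node}; messages from that node:")
        for line in context_before(SAMPLE_LOG, op_key, node):
            print("  " + line)
```

The Policy Engine line only supplies the operation key and the node name; the lines returned for that node include the resource agent message that precedes the crmd entry.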
Debugging Resource Agents
Common Resource Agent Issues
• Operations must succeed if the resource is already in the requested state.
• “monitor” must distinguish between at least “running/OK”, “running/failed”, and “stopped”
– Probes deserve special attention
• Meta-data must conform to DTD.
• 3rd party resource agents do not belong under /usr/lib/ocf/resource.d/heartbeat – choose your own provider name!
• Use ocf-tester to validate your resource agent.
ocf-tester Example Output
hex-0:~ # ocf-tester -n Example /usr/lib/ocf/resource.d/bs2010/Dummy
Beginning tests for /usr/lib/ocf/resource.d/bs2010/Dummy...
* Your agent does not support the notify action (optional)
* Your agent does not support the demote action (optional)
* Your agent does not support the promote action (optional)
* Your agent does not support master/slave (optional)
* rc=7: Stopping a stopped resource is required to succeed
Tests failed: /usr/lib/ocf/resource.d/bs2010/Dummy failed 1 tests
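The failing test above (rc=7 when stopping a stopped resource) typically means that "stop" hands back the result of "monitor" instead of reporting success. The toy model below shows the idempotent behaviour the tester expects. It is a sketch, not a real agent: actual agents are usually shell scripts, the state-file mechanism and class name are assumptions, and only the numeric OCF return codes are standard.

```python
import os
import tempfile

# Standard OCF return codes.
OCF_SUCCESS = 0
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


class DummyAgent:
    """Toy agent: the resource counts as running while a state file exists."""

    def __init__(self, state_file):
        self.state_file = state_file

    def monitor(self):
        if os.path.exists(self.state_file):
            return OCF_SUCCESS
        return OCF_NOT_RUNNING

    def start(self):
        if self.monitor() == OCF_SUCCESS:
            return OCF_SUCCESS  # already running: still a success
        open(self.state_file, "w").close()
        return OCF_SUCCESS

    def stop(self):
        if self.monitor() == OCF_NOT_RUNNING:
            return OCF_SUCCESS  # already stopped: success, not rc=7
        os.remove(self.state_file)
        return OCF_SUCCESS

    def run(self, action):
        handlers = {"start": self.start, "stop": self.stop,
                    "monitor": self.monitor}
        handler = handlers.get(action)
        return handler() if handler else OCF_ERR_UNIMPLEMENTED


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        agent = DummyAgent(os.path.join(tmp, "dummy.state"))
        for action in ("stop", "monitor", "start", "start", "stop"):
            print(action, "->", agent.run(action))
```

This model distinguishes only "running/OK" and "stopped"; a real agent also has to report "running/failed" from its monitor action.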
Questions and Answers
Unpublished Work of Novell, Inc. All Rights Reserved.
This work is an unpublished work and contains confidential, proprietary, and trade secret information of Novell, Inc. Access to this work is restricted to Novell employees who have a need to know to perform tasks within the scope of their assignments. No part of this work may be practiced, performed, copied, distributed, revised, modified, translated, abridged, condensed, expanded, collected, or adapted without the prior written consent of Novell, Inc. Any use or exploitation of this work without authorization could subject the perpetrator to criminal and civil liability.
General Disclaimer
This document is not to be construed as a promise by any participating company to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Novell, Inc. makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for Novell products remains at the sole discretion of Novell. Further, Novell, Inc. reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All Novell marks referenced in this presentation are trademarks or registered trademarks of Novell, Inc. in the United States and other countries. All third-party trademarks are the property of their respective owners.