Info-Tech Research Group: Building the Disaster Recovery Team


Upload: jennifer-phelps

Post on 25-Dec-2015


Page 1: Info-Tech Research Group1 Building the Disaster Recovery Team

Info-Tech Research Group 1

Building the Disaster Recovery Team


It’s not the team itself… they will do fine. They have been through multiple tests with me, and I have confidence in them. What I can’t test is our management. My real fear is not the team, but the outside influences, and how they might react during the crisis. What we really need is for them to stay hands-off and let us work our plan.

Robert Pierce, Director of IT Security, Carolina Health Care

Your DR efforts will be greatly enhanced by designating an experienced project manager as the DR Coordinator.

Have a dedicated DR Coordinator in place to manage recovery efforts, organize personnel, and synchronize relief efforts

• The DR Coordinator must know their role. The primary responsibilities of the DR Coordinator during the disaster are to manage the DR effort, act as a central point of contact, and shield the DR Team from external influences during recovery. In many enterprises the IT manager fulfills this role.

• Think fast and act decisively. Changing circumstances require the DR Coordinator to modify the recovery process to respond to the specific needs of the organization. S/he must act decisively in tense situations and exhibit confidence and grace under pressure to reinforce the strength of your team.

• Liaise with external groups. The DR Coordinator must manage all external communications with emergency services, vendors, consultants or key stakeholders.

• Shield the DR Team from external influences during recovery. The DR Coordinator protects the Team from outside pressures that may divert it from the recovery effort. Don’t let the unprepared factions of the organization impact recovery efforts.


Tailor the plan to the organization by delegating DR roles throughout the organization and having roles clearly defined.

Establish the appropriate number of Disaster Recovery roles to optimize the organization’s DR response

• Use your recovery objectives to determine how many recovery roles your organization is going to need.

• DR teams will look different in different-sized organizations. Large organizations will need multiple, technically specific recovery teams, while smaller organizations often field only a single interdisciplinary recovery team.

• A balanced team allows for controlled and coordinated actions, with each member of the team understanding their role in relation to the others.

• Run regular recovery control meetings to maintain the DR plan as an organizational priority, and to ensure each member of the DR Team has a clear understanding of their role and its associated tasks.

Recovery Team Roles

Facilities General Responsibility

Hardware: Replacement and restoration of all servers and desktops.

Applications/Data: Obtaining backups, restoring data, loading software images.

Network: Firewall, routers, cabling, equipment to make data available.

External Services: Restoring power, Internet, phone.

Engineering: Ensuring that the environment is safe and suitable for work.

Finance: Provide approval for spending, documentation for insurance.

Human Resources: Contact people to report/not report, provide support.

Media Relations: Contact media, coordinate information to public.

Management: Liaise with boards, make critical decisions, remove obstacles.

Use the DRP Team Build Worksheet to develop your recovery team.



The DR team is your first response unit; make sure you have one in place and that it is appropriately resourced.

Organizations that have a dedicated DR team are more successful at restoring operations

• Your DR Fast Action Response Team is first on site. It is essential that this team is appropriately scoped, sized and staffed. Optimize the balance between size and skill by drawing from logical subsets of the organization, but ensure the size of the unit does not become a hindrance to overall effectiveness.

• Use the DR Coordinator as the point of contact between DR Fast Action Response Team and the Management Stakeholder Group to streamline communications and ensure the onsite assets receive a consistent message.

• Don’t funnel overall organization-wide communication through one or two key people. Delegate communication/coordination responsibilities to team leaders across the organization and have a single Business Liaison interact with management to allow for efficient multi-team, real-time operational information.

• Draw resources from across the organization to share DR plan ownership and balance tasks.

[Figure: the DR Coordinator as the communication channel between the DR Fast Action Response Team and the Management Stakeholder Group. Labels shown: Facilities, IT Applications, IT Infrastructure, IT Storage, Finance, DR Coordinator, Business Liaison, CIO, Production Leader, Accounting Leader, HR Leader, Operations Leader.]


Use the DR Team Build Sheet template to assign personnel to key DR responsibilities & consolidate their contact info

• Have a comprehensive list of all your DR assets with their corresponding responsibilities.

• Use the DR Team Build Sheet template to outline:

◦ Hardware Recovery Team

◦ Hardware Damage Assessment Checklist

◦ Applications/Data Recovery Team

◦ Application/Data Loss Assessment Checklist

◦ Network Recovery Team

◦ Network Component Assessment

◦ Engineering Recovery Team

◦ Finance Team

◦ Media Relations Team

◦ Management Team

◦ …etc.

• You may not require multiple teams, but at least ensure the roles are accounted for.

Build your DR Team out of the best individuals, and over-provision roles, so you’re never dependent on a single person or skill.


If you have an extended outage, the most important thing is keeping money flowing. Have multiple means of payment so you can have the flexibility to focus on the needs of the organization.

Yogi Schulz, Oil & Gas Consultant

Make your DR Fast Action Response Team out of A & B players to balance competence with responsibilities

• Have A players from across all impacted technical units represented on the DR Fast Action Response Team. Also have B players from each impacted unit on standby as alternates.

• Outside of key IT roles (Infrastructure, Applications, Storage etc.) consider Facilities and Finance as fundamental representatives.

• Facilities provides essential access to the physical infrastructure of the organization. In a disaster situation their assistance will be instrumental in getting the organization back to an operational state.

• Finance is equally integral, as it provides access to the financial assets of the organization and streamlines decision-making and resource access procedures.

• Keep Human Resources within arm’s reach for business continuity. If the disaster has a sustained impact on the organization or is a result of a major regional catastrophe, IT will need to keep communication channels open.

We’ve found the best way to have skill redundancy is to rotate recovery team members throughout various roles in tests… This allows us to test both primary and secondary team members efficiently… We were also forced to rework our documentation to make sure team members had the information they needed. As a result, now anyone can pick up documentation and step into the recovery role.

Director of Business Continuity, Financial Services

Look for the critical combination of technical acumen, competence, and grace under pressure for primary and secondary members.


Effective disaster recovery is a product of planning, preparation, and execution under pressure. Don’t let Mr. Murphy destroy the plan.

Clearly defining DR responsibilities has the greatest impact on overall DR success

• Beware of Mr. Murphy. The best laid plans of mice and men are often ripped apart by Murphy’s Law. In DR preparation, ensure even the most mundane tasks are addressed and comprehensively accounted for.

• Nothing should be left to chance. Each overlooked aspect of the DR plan creates a weak link that could snap at the least opportune moment.

• Recognize common but differentiated responsibilities. Provide each team member clear prioritized instructions so they can work in tandem. Disaster response requires a calculated and coordinated response.


Over-provision each role on the DR Team to ensure it is always properly resourced, and cross-train members to improve flexibility and efficiency.

Associate DR responsibilities with DR Team roles only and never individuals

• Never be person-dependent. It is very dangerous to isolate DR capabilities in one or two key people. Doing so greatly limits flexibility and could be crippling if those key individuals are indisposed.

• Allocate DR responsibilities to DR roles rather than individuals. This enhances flexibility in crisis deployment and lets you substitute personnel as needed.

• Cross-train DR Team members. Prepare each team member to step into at least two other roles on the DR Team. Cross-training enhances deployment flexibility and provides a redundancy of skills.

• Train everyone on the role of DR Coordinator. As time is often of the essence, it is essential that all members of the team understand and can manage the responsibilities of the DR Coordinator. First on site must be able to step into the role and begin restoration procedures immediately.

DR Fast Action Recovery Team Members

IT Director, Apps Team Lead, Infrastructure Team Lead, Storage Team Lead

DR Fast Action Recovery Team Roles (candidates in order of succession)

DR Coordinator: 1. IT Director 2. Asst. IT Director 3. Infrastructure lead

Apps Member: 1. Apps lead 2. Apps member 3. Storage lead

Infrastructure Member: 1. Infrastructure lead 2. Infrastructure member 3. Apps lead

Storage Member: 1. Storage lead 2. Storage member 3. Apps lead

We cross-train DR members by rotating 1st string, 2nd string and 3rd string members through DR roles to develop broader skills.

Business Continuity Manager, Professional Services Firm
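The succession table above can be sketched as a small lookup that fills each role with its highest-ranked available candidate. This is only an illustration: the role and candidate names are copied from the table, while the function name `fill_roles`, the one-role-per-person rule, and the processing order are assumptions made for the example.

```python
from typing import Dict, List, Optional, Set

# Ordered candidates per role, copied from the succession table above.
SUCCESSION: Dict[str, List[str]] = {
    "DR Coordinator": ["IT Director", "Asst. IT Director", "Infrastructure lead"],
    "Apps Member": ["Apps lead", "Apps member", "Storage lead"],
    "Infrastructure Member": ["Infrastructure lead", "Infrastructure member", "Apps lead"],
    "Storage Member": ["Storage lead", "Storage member", "Apps lead"],
}


def fill_roles(available: Set[str]) -> Dict[str, Optional[str]]:
    """Assign each role to its highest-ranked available candidate.

    Assumption: a person fills at most one role, and roles are processed
    in table order. This is a simple greedy pass, not an optimal matching.
    """
    assigned: Dict[str, Optional[str]] = {}
    taken: Set[str] = set()
    for role, candidates in SUCCESSION.items():
        pick = next(
            (c for c in candidates if c in available and c not in taken), None
        )
        assigned[role] = pick  # None means the role is currently uncovered
        if pick is not None:
            taken.add(pick)
    return assigned
```

Any role that comes back as `None` is a gap in coverage, which is the signal to add another cross-trained alternate for that role.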


Over-provision DR roles by developing a comprehensive Method of Procedure so anyone can step into the DR role.

Case Study: Unisource Canada

Industry: Manufacturing Warehouse

Segment: Pulp and Paper

Source: Brad Kopachynski, Data Centre Operations Manager

Challenge

• Disaster Recovery Team is made up of day-to-day IT personnel.

• During a disaster, resources will be stretched quite lean.

• There is a potential that members of the DR team may not be able to reach the recovery site efficiently.

• Long-term concern of skill isolation.

Solution

• Develop comprehensive Method of Procedure (MOP) checklists.

◦ Make a master MOP and MOPs for each role in the DR team.

◦ Review MOP with team quarterly and on any changes.

◦ Make the MOPs skill neutral, with explicit steps and all relevant information needed.

Unique Benefit

• Established robust internal process for disaster responses and recovery.

• Anyone can now step into a DR role and fulfill the expectations simply by following the documentation.

• Over-provisioning of skills through rigorous documentation.


Build ‘good judgment’ into the plan and take the subjectivity out of failover by having clear, objective triggers. So if it’s 110 degrees for 30 minutes, fail over. If not, people are going to believe facilities will have things fixed.

Larry Gagler, Business Continuity Manager, HBO

Remove personal judgment from disaster declaration first; once you’re in a DR situation make sure you have your facts straight.

Make DR response crystal clear by having objective trigger points & comprehensive operating procedures

• A DR situation is a tension-filled situation. Help people think clearly in a crisis situation by taking the subjectivity out. Incorporate objective triggers into the DR plan and negate the chance people may make decisions contrary to the organization’s best interests.

• Establish DR trigger points for sequencing of action. Once a disaster has been declared have a prioritized disaster response checklist. Rigorous planning allows for focus and minimizes plan deviation and the potential for expensive mistakes.

• Know where the DR Team is at all times. Each team member must understand the requirements of each DR role to allow for the macro-sequencing of events.

• Understand clearly when downtime is justified. Depending upon the enterprise, it is often more economical to remain down for a short period of time rather than mobilize the resources necessary to restore operations. This line must be clearly defined and understood by all DR Team members.
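An objective trigger of the kind quoted above ("110 degrees for 30 minutes") can be sketched as a threshold that must be breached continuously for a set duration. This is a minimal illustration, not a monitoring implementation: the `Trigger` class, the `should_fail_over` function, and the `(minute, value)` reading format are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Trigger:
    threshold: float        # a reading at or above this counts as a breach
    sustained_minutes: int  # how long the breach must last without a break


def should_fail_over(readings: Iterable[Tuple[int, float]], trigger: Trigger) -> bool:
    """Return True once readings stay at or above the threshold long enough.

    readings: (minute, value) pairs in ascending time order (assumed format).
    """
    breach_start = None
    for minute, value in readings:
        if value >= trigger.threshold:
            if breach_start is None:
                breach_start = minute  # first reading of a new breach
            if minute - breach_start >= trigger.sustained_minutes:
                return True
        else:
            breach_start = None  # breach ended, so reset the clock
    return False
```

Because the rule is evaluated from readings alone, nobody has to decide under pressure whether facilities will "have things fixed" in time; the declaration follows from the data.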

Get your facts straight from the beginning. Effective DR management begins with effective information management. Know:

What was the disaster?

When did it occur?

What caused it?

What was the enterprise impact?

What is being done immediately?

What is the estimated fix time?

Is the incident likely again?
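The seven questions above can be captured as a simple record so that missing facts are visible at a glance. This is only a sketch: the class name `IncidentFacts` and its field names are hypothetical labels for the questions, not part of any Info-Tech template.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class IncidentFacts:
    # One field per question in the list above; None means "not yet known".
    what_happened: Optional[str] = None
    occurred_at: Optional[str] = None
    cause: Optional[str] = None
    enterprise_impact: Optional[str] = None
    immediate_actions: Optional[str] = None
    estimated_fix_time: Optional[str] = None
    likely_to_recur: Optional[bool] = None

    def unanswered(self) -> List[str]:
        """Names of the questions that still have no answer."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]
```

The DR Coordinator could check `unanswered()` before each stakeholder update so that open questions are stated as open rather than guessed at.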


Always maintain an up-to-date contact list; without it you’re up the creek without a paddle

• Each organization must have a valid contact list. Often underappreciated by organizations, the call list is integral as a single location holding all vital contacts needed in a DR situation.

• The list must be comprehensive and up-to-date. Your organization’s contact list should contain the contact information for all DR Team members, alternate members, key management stakeholders, recovery vendors and consultants.

• Include all the information needed to contact those personnel in the event of a disaster, including home and cellular phone/pager numbers, home addresses, and emails.

• Leverage multiple messaging options. Consider setting up team communication groups on social media sites, or through alternative/proprietary messaging options to allow for efficient group updates from a single location.

Use the DR Team Build Sheet to develop your Call Tree and Organization Contact List.
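A contact list is only useful if it is complete and current, and both properties can be checked mechanically. The sketch below is illustrative: the `Contact` fields follow the bullet list above, while the two-channel minimum and the 90-day verification window are assumed thresholds, not figures from the source.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Contact:
    name: str
    role: str
    cell_phone: Optional[str] = None
    home_phone: Optional[str] = None
    email: Optional[str] = None
    home_address: Optional[str] = None
    last_verified: Optional[date] = None  # when the details were last confirmed


def contact_problems(contact: Contact, today: date, max_age_days: int = 90) -> List[str]:
    """List the reasons this entry cannot be relied on in a disaster."""
    problems: List[str] = []
    channels = [contact.cell_phone, contact.home_phone, contact.email]
    # Assumed rule: at least two independent ways to reach each person.
    if sum(1 for c in channels if c) < 2:
        problems.append("fewer than two ways to reach this person")
    if contact.last_verified is None:
        problems.append("never verified")
    elif (today - contact.last_verified).days > max_age_days:
        problems.append("not verified recently")
    return problems
```

Running this check over every entry on a schedule turns "keep the list up to date" into a task with a visible result.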

Information is currency… You must have a comprehensive contact list. If not, you will find yourself in a disaster situation without the ability to contact the one key person you need… At that point, you’re up the creek without a paddle.

Disaster Recovery Consultant, Professional Services Firm

A comprehensive and current contact list is integral to the rapid deployment and mobilization of key DR assets.


Eliminate misinformation and optimize disaster intelligence by diversifying your communication strategy. Prepare for every eventuality.

Open communication channels must be established early & maintained to ensure the free-flow of information

• Consistent and accurate communication is vital to effective DR. Team members must be notified and mobilized. Stakeholder groups must remain in the loop to make strategic decisions and employees must be notified of the disaster. Open communication channels could be the difference between an efficient restoration and a prolonged outage.

• Use conventional communication methods first and foremost. Primary communication lines should include broadcast voicemail, e-mail, instant messaging, faxes and any type of bulletin that can be prepared. Have a recorded phone message informing remote users or clients.

• Have a primary and secondary communication strategy. If a disaster-level event occurs and all internal communications are offline, have a secondary communication strategy which leverages social media, IM or proprietary messaging networks. This strategy provides flexibility and redundancy to your communication strategy. If available, develop a Smartphone DR application to streamline communications.

Consider using social media or proprietary systems as a means of secondary communication. The likelihood of BlackBerry Messenger (BBM), or Google, or what-have-you going down is exceptionally low. Consider these options for a cheap redundant communication solution.

Disaster Recovery Consultant, Professional Services Firm
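The primary and secondary communication strategy described above amounts to trying channels in priority order and falling back when one is unavailable. The sketch below shows that idea only: the `notify` function and the `(name, send)` channel shape are assumptions, and the `send` callables stand in for whatever e-mail, voicemail, or messaging integrations an organization actually has.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A channel is a (name, send) pair; send returns True when delivery succeeded.
Channel = Tuple[str, Callable[[str], bool]]


def notify(message: str, channels: Sequence[Channel]) -> Tuple[Optional[str], List[str]]:
    """Try each channel in priority order and stop at the first success.

    Returns the name of the channel that worked (or None if all failed)
    together with the names of the channels that failed before it.
    """
    failed: List[str] = []
    for name, send in channels:
        try:
            if send(message):
                return name, failed
        except Exception:
            pass  # treat an error the same as a failed delivery
        failed.append(name)
    return None, failed
```

Stopping at the first success keeps the example short; in a real event it may be preferable to broadcast on every working channel, so treat the early return as a design choice rather than a recommendation.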


Have a pre-emptive Incident Response Procedure to open communication channels at the risk of a service impact.

Case Study: HBO

Industry: Media and Entertainment

Segment: Premium Cable

Source: Larry Gagler, Business Continuity Manager

Challenge

• The enterprise has very demanding service requirements which demand 100% up-time.

• The IT resources are limited, and the DR team is already lean.

• For even potential outages, multiple stakeholders, vendors and clients need to be informed so they can prepare.

Solution

• Implemented a pre-emptive Incident Response Procedure.

◦ Aim to open communication channels with identified parties even at the risk of a service outage.

◦ Allows for a range of readiness procedures dependent upon the nature of the potential outage.

Unique Benefit

• Prepares clients and stakeholders for the potential of an outage.

• Allows for early asset mobilization to optimize recovery time.

• Allows for streamlined disaster communications.

• Limits the frequency of outages.

• Reinforces client confidence in the service.