
Page 1: IT Risk Management, Planning and Mitigation TCOM 5253/MSIS 4373

(c) 2007 Charles G. Gray 1

IT Risk Management, Planning and Mitigation

TCOM 5253/MSIS 4373

Business Continuity Planning

6 December 2007

Charles G. Gray


Business Continuity and Disaster Recovery

• Business Continuity - Continuation of the “business” (revenue-generation) in the face of any unusual or unforeseen event
  – Overall identification of potential events and the predicted impact on the organization
• Disaster – an event that causes significant damage to business operations and requires some actions to recover


DR vs. BCP

• Disaster recovery is no longer enough
• Business operations must be sustained
  – Legal requirements
  – Cash flow
  – Customer retention
• Business continuity is the first priority – then disaster recovery


Business Continuity Planning

• An exercise in risk management
• Not a “revenue producing” activity
  – Business overhead (“cost of doing business”)
• A form of business insurance justified on losses that might occur
• Adequate budgets must be planned
  – Money
  – Staff
  – Time


Key Components of Business Operations

• People

• Equipment

• Workplace

• Suppliers

• Logistics

• Finance


Disaster Examples

• Fire
• Flood
• Malicious damage
• Theft
• Terrorism
• Sabotage
• Explosion
• Chemical spill
• Gas leak
• Disease
• Earthquake
• Tropical storm
• Biological agent
• Hostage situation
• Threat of action
• Criminal damage
• Accidental damage
• Fault or failure


Key Factors Affected by a Disaster

• Financial
• Reputation
  – Business (Tylenol, Arthur Andersen)
  – Personal
    • NYC Mayor Giuliani
    • Enron CEO Ken Lay
• Customer service
• National security
• Health and safety
  – Employees
  – General public
• Regulatory


What is the Cost of Downtime?

• Productivity
  – Number of employees times loaded pay rate
• Damaged reputation
  – Customers
  – Suppliers and business partners
  – Banks and financial markets
• Revenue
  – Direct loss, billing losses
  – Compensatory payments
  – Loss of future revenue


What is the Cost of Downtime?

• Financial performance
  – Revenue recognition
  – Cash flow
  – Lost discounts (Accounts payable)
  – Credit rating
  – Stock price
• Other expenses
  – Temporary employees, equipment rental, overtime costs, extra shipping, travel expenses, legal obligations
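The directly measurable categories from the two "Cost of Downtime" slides can be added up in a few lines. The sketch below is illustrative only: the class name, field names, and all figures are hypothetical, and it deliberately leaves out reputation, credit rating, and stock price effects, which the slides list but do not quantify.

```python
from dataclasses import dataclass


@dataclass
class DowntimeEstimate:
    """Hypothetical inputs for one outage scenario (all values are assumptions)."""
    employees_idled: int       # staff unable to work during the outage
    loaded_hourly_rate: float  # pay plus benefits and overhead, per hour
    revenue_per_hour: float    # direct revenue lost per hour of outage
    other_expenses: float      # overtime, rentals, extra shipping, travel, etc.


def productivity_loss(est: DowntimeEstimate, hours: float) -> float:
    """Number of employees times loaded pay rate, scaled by outage length."""
    return est.employees_idled * est.loaded_hourly_rate * hours


def total_downtime_cost(est: DowntimeEstimate, hours: float) -> float:
    """Sum of productivity loss, direct revenue loss, and other expenses."""
    return (
        productivity_loss(est, hours)
        + est.revenue_per_hour * hours
        + est.other_expenses
    )


# Made-up scenario: 200 idle employees, 4-hour outage.
example = DowntimeEstimate(
    employees_idled=200,
    loaded_hourly_rate=60.0,
    revenue_per_hour=25_000.0,
    other_expenses=15_000.0,
)
print(productivity_loss(example, hours=4))    # 48000.0
print(total_downtime_cost(example, hours=4))  # 163000.0
```

Even with these modest invented inputs, lost revenue outweighs idle payroll, which is one reason a productivity-only estimate tends to understate the cost of an outage.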


Examples of Downtime Costs

• Energy $2.8 M per hour
• Telecommunications $2.1 M per hour
• Manufacturing $1.6 M per hour
• Finance/brokerage $1.5 M per hour
• Info Technology $1.3 M per hour
• Insurance $1.2 M per hour
• Retail $1.1 M per hour
• Pharmaceuticals $1.1 M per hour

Source – Meta Group 2006


The Ultimate Cost of Downtime

• 80% of businesses that suffer a major disruption fail within 18 months (Financial Times, 18 April 2007)
• Most disruptions are relatively mundane
  – Drilling through an outside power cable
  – Failure of air conditioning
  – “Banana skins” – business slips that result in loss of customers


BCP and IT

• IT facilitates the majority of key business processes in a modern company
  – IT systems control the workflow, production, shipping, billing, customer service (?), etc.
  – Even the simplest operations can fail when “the computer is down”
• IT is a strong management tool
  – Anything with costs associated with it is tracked for audit and control


Disaster Recovery

• Implementation of a response to a specific type of event
  – A plan with supporting infrastructure, which is implemented in the event of a disaster
• Usually treated as an “add on”
  – Tested occasionally, but rarely emphasized
  – Financial considerations (cost-benefit analysis, CBA)
    • Cost of downtime vs. cost of system resilience
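The comparison of downtime cost against resilience cost can be made concrete with a simple expected-loss calculation. This is a minimal sketch under stated assumptions, not a method taken from the slides: the function names, the idea of annualizing by incident frequency, and every number are hypothetical.

```python
def expected_annual_downtime_cost(
    cost_per_hour: float,
    hours_per_incident: float,
    incidents_per_year: float,
) -> float:
    """Expected yearly loss = hourly cost x outage length x yearly frequency."""
    return cost_per_hour * hours_per_incident * incidents_per_year


def resilience_net_benefit(
    baseline_loss: float,
    residual_loss: float,
    annual_resilience_cost: float,
) -> float:
    """Loss avoided by the resilience measure, minus what the measure costs.

    A positive result suggests the investment pays for itself on these
    assumptions; a negative one suggests it does not.
    """
    return (baseline_loss - residual_loss) - annual_resilience_cost


# Made-up scenario: one 8-hour outage expected every two years, which a
# redundant system (assumed to cost 120,000 per year) would cut to 1 hour.
baseline = expected_annual_downtime_cost(50_000.0, 8.0, 0.5)
residual = expected_annual_downtime_cost(50_000.0, 1.0, 0.5)
net = resilience_net_benefit(baseline, residual, 120_000.0)

print(baseline)  # 200000.0
print(residual)  # 25000.0
print(net)       # 55000.0
```

The result is only as good as the frequency and duration estimates fed into it, and it ignores costs that are hard to price, such as loss of reputation.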


“Gap” Analysis

• Lack of knowledge transfer between business continuity and technical disaster recovery
• IT security and physical security operate autonomously
• No clear quantitative methodology to rate and benchmark
• Health and safety issues not integrated into the business
• Continuity planning is isolated
  – No senior-level champion
  – Not integrated throughout the business


IT/Business Boundary

• IT segmented apart from the “business”
  – Creators of technology on one side, users on the other
• Business analysts, project managers and “relationship” managers are expected to bridge the gap
• The business may duplicate some IT support functions to gain some “control”
  – IT may not even know about it

• Highly inefficient


Cultural Issues - Mistrust

• Business tells IT that the requirement was misunderstood
• Business rejects the technology as not working
• Business realizes its error and, to “save face,” accepts the technology but does not implement it

• Realize their error and try to negotiate

• Find any other way possible to “save face”


Relationship to BCP

• BCP is about building a solid and resilient organization that can deal with difficult circumstances or situations
• Organization must be designed with business continuity in mind – not “bolted on” later
  – Ugly to look at
  – Difficult to manage
  – Costly!


Health and Safety

• Most important business continuity indicators

• People are the principal asset of any business – without them, nothing happens

• Most companies comply with the “letter of the law” – even if they don’t understand what the law is trying to effect

• Companies are responsible for doing all they can to provide a safe workplace


Not Just Fire Anymore

• Fire escapes are needed, but that’s not all
  – Think about emergency slides (airplanes)
• Terrorism
• Natural disaster (global warming??)
  – Tropical storms, tornados, tsunami, etc.
• Workplace must be designed for protection and evacuation
  – Flying glass is the biggest cause of injury
  – ADA compliance (rules on access, but not egress)


Terrorism

• Direct loss of life
• General economic impact
  – “Multiplier” effect (trickle-down)
    • A company with 10,000 employees may influence $1B in indirect community economic impact
  – Salaries, goods, services, taxes
• Mere threat of direct and indirect impact
• Psychological effect on employees
  – Highest impact on business continuity is employee perception and panic


Risk, Motivation and CBA

• In failing to protect against a disaster that could be foreseen, is a company being negligent?

• When acts of terror can strike any business at any time, is there not a predictable risk to ALL businesses?

• What is the cost of lost business, loss of reputation or loss of life?

• Are not all businesses bound to protect employees against such events?


Key Issues (1)

• Business continuity measures are typically reactive – need to be more proactive

• No standard approach to business continuity across organizations/industries

• Organizations are not designed with business continuity in the forefront

• The threat of terrorism needs to be addressed more specifically when planning for business continuity


Key Issues (2)

• Focus needs to be put on people as the core asset of the organization
• Organizations need to be motivated toward better continuity preparation, security and health and safety

• A means of financially justifying these or even more comprehensive measures must be found

• Insurers need to cooperate with industry to ensure that individuals, economies and national security are better protected


The Continuity Assurance Framework

[Figure: diagram of the framework. Labels: Communication; Security & Safety; Quality Assurance; Governance and Strategy; Management; Rationalization; Risk Reduction; Rating; Rigor; Robustness; Resilience; Recovery; Iterative Process]


Continuity Assurance Methodology

• Strategy sets the direction
• Governance is the navigation that keeps us on course
• Management controls the day-to-day operation of the continuity assurance machine
• QA measures progress in terms of achievement
  – Interfaces across and around all other functions


The “Machine” Model

• Seven levels of quality (continuity) assurance are the spokes in the wheel

• The hub and spokes of the wheel are encircled by a ring of security and safety

• Encircling all of the elements is communication and knowledge transfer


Core Methodology

• Rationalization

• Risk Reduction

• Rating

• Rigor

• Robustness

• Resilience

• Recovery


Rationalization

• First step on the path to continuity assurance
  – If the foundation is wrong the whole method is undermined
• Rationalize the organization to harmonize security, continuity and recovery functional areas
• Review of processes to avoid overlap
• Ensure that business continuity is integrated into the organization rather than “bolted on”


Risk Reduction

• Identification of risks to the business

• Measures the organization determines to put in place to reduce each risk identified

• Eliminate as many risks as possible in order to accurately rate true criticality of processes, people, and systems


Rating

• Rating of people, processes and systems to ensure that the organization is aware of its critical components and assets
  – You may not even know what the components are!

• Must understand the business structure before looking at individual components in detail

• Expose weaknesses


Rigor of Process

• Processes must be in place to manage component configuration
  – Configuration/change management
  – Very few organizations have adequate controls

• Rating should identify business areas that require reinforced/improved processes

• Identify which supporting systems need to be reinforced or made more robust


Robustness of Architecture

• Determine the vulnerabilities in the infrastructure and take an integrated architectural approach to correction
• Exercise control of the environment to safely manage any fundamental changes to the architecture
  – Proceed cautiously!

• Make sure underlying architecture is sound so as to not replicate something less than ideal


Resilience

• Once the underlying systems architecture is strengthened, add new levels of insular resilience to the critical components

• Applies to more than just IT systems – includes people
  – Need the information that systems AND people have for business continuity

• Geographic diversity can avoid having to go to “recovery” from a localized event


Resilience – Sun Microsystems

• Championed by a senior executive at HQ
• Plan “owned” by business units
• Ask “what is most critical to the business?”
  – Why are we doing this?
  – Will this work in the event of a catastrophe?
• Plan must be simple and workable
  – Simulation/dry run/dress rehearsal is a necessity
    • You may be amazed at the glitches discovered


Recovery

• Recovery is what is left after all else has failed because the “event” was too widespread or severe
  – If you have been successful at all of the previous levels, then recovery will be necessary only in the most severe circumstances

• Recovery process has its own set of risks


Success of the Framework

• Iterative process
  – Continuous improvement
  – Revisit each level to tweak their capabilities
• Each level builds on the previous levels
  – Holistic view of the organization
  – Employ new capabilities in response to the ever-changing business environment

• Key performance indicators (KPI) at every level


Continuity Rating

• Continuity Assurance Achievement Rating (CAAR)
  – Overall rating of all KPIs across all levels of the model
  – Measure of overall business continuity capability
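The slides do not say how the KPIs are combined into a single CAAR figure. The sketch below shows one possible roll-up, purely as an assumption: each KPI is scored from 0 to 100, each of the seven levels is averaged, and the overall rating is the unweighted mean of the level averages. The function names and sample scores are hypothetical.

```python
from statistics import mean

# The seven levels named in the Core Methodology slide.
LEVELS = [
    "Rationalization", "Risk Reduction", "Rating", "Rigor",
    "Robustness", "Resilience", "Recovery",
]


def level_score(kpi_scores: list[float]) -> float:
    """Average the KPI scores (assumed 0-100) recorded for one level."""
    return mean(kpi_scores)


def caar(kpis_by_level: dict[str, list[float]]) -> float:
    """Assumed roll-up: unweighted mean of the seven level averages."""
    missing = [lvl for lvl in LEVELS if not kpis_by_level.get(lvl)]
    if missing:
        raise ValueError(f"no KPI scores for: {', '.join(missing)}")
    return mean([level_score(kpis_by_level[lvl]) for lvl in LEVELS])


# Made-up scores for illustration.
sample = {
    "Rationalization": [80.0, 60.0],
    "Risk Reduction": [50.0],
    "Rating": [90.0, 70.0],
    "Rigor": [40.0],
    "Robustness": [60.0],
    "Resilience": [70.0],
    "Recovery": [50.0],
}
print(caar(sample))  # 60.0
```

Averaging per level first keeps a level with many KPIs from dominating the result. A real scheme might weight the levels differently, which this sketch does not attempt.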


Solving Continuity Problems

• Root cause analysis
  – Pareto charts
    • The “80/20 rule”
    • The “trivial many, and vital few”
  – Fishbone (Ishikawa) process
    • Cause and effect diagrams
    • Systematically list all of the different causes that can be attributed to a specific problem

• Ask the “why” question five levels down
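A Pareto analysis of this kind can be run on a simple incident log. The sketch below is illustrative: the function name, the 80% cutoff parameter, and the sample log are assumptions. It counts how often each cause appears, ranks the causes, and returns the smallest leading group that accounts for the chosen share of incidents (the "vital few").

```python
from collections import Counter


def vital_few(incident_causes: list[str], threshold: float = 0.8):
    """Return (cause, count, cumulative_share) for the top-ranked causes
    that together reach `threshold` of all incidents."""
    counts = Counter(incident_causes)
    total = sum(counts.values())
    if total == 0:
        return []
    selected = []
    cumulative = 0
    # most_common() yields causes from most to least frequent.
    for cause, count in counts.most_common():
        cumulative += count
        selected.append((cause, count, cumulative / total))
        if cumulative / total >= threshold:
            break
    return selected


# Made-up log of ten disruptions.
log = ["power"] * 5 + ["cooling"] * 3 + ["human error", "network"]
for cause, count, share in vital_few(log):
    print(f"{cause}: {count} incidents, cumulative {share:.0%}")
# power: 5 incidents, cumulative 50%
# cooling: 3 incidents, cumulative 80%
```

In this invented log, two of the four causes account for 80% of the disruptions, so those two would be the first candidates for a fishbone diagram or a "five whys" review.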


Communication

• Changes needed to truly incorporate business continuity processes are traumatic
  – The only thing worse may be a merger
• Consistent and complete communication across the organization is imperative
• Akin to PR and marketing
• Must have “buy-in” from top to bottom
  – Everyone becomes part of the solution

• Demanding task requiring full-time resources and materials


Summary

• Business continuity capabilities are not simple and may require fundamental change across the entire organization
• Disaster comes not only in random events
  – Can be planned by some and thrust upon others
  – Not just natural and indiscriminate but can be orchestrated and targeted

• Business must orchestrate responses and target defenses to maintain safety, security, and overall continuity