Customer Engagement Workshop
IT Service Continuity
Phoenix, Farnborough
17th June 2015
Paul Gant, Head of BCM Assurance
David Davies, BCM Assurance Consultant
Agenda
• 11:00 Registration, refreshments and networking
• 11:30 Why get fit, anyway?
• 11:50 Fictitious live incident
• 12:10 Post incident review
• 12:30 Steps to success
• 12:45 Phoenix ITSC service set
• 12:50 Questions & answers
• 13:00 Lunch, tours, event close
Introducing our BCM Assurance Consultant…
David Davies:
• 12 years' continuity management experience, over 70 clients across multiple sectors
• Unique experience: extensive consultancy, IT service continuity management (IBM),
business continuity management (Barclaycard)
• Experience of implementing Business Continuity Standard and Information Security
Standard
• Works to demystify the subject
• Delivers practical advice
Real Recovery (Invocations) is like a Battle
YOUR ENEMIES
• (Lack of) time.
• You can’t recover what you haven’t backed up.
• You can’t upgrade recovery technology during an invocation.
YOUR FRIENDS
• Phoenix.
• Your preparation.
What does “Preparation” involve?
It’s not just about the technology!
But aren’t policies, analysis, plans and reports only there to satisfy the auditor?
Is there any rhyme or reason to them?
IT Service Continuity Management
Focuses on 5 things…
Priorities, Dependencies, Plans, Testing, Maintenance
1. What’s needed first?
Sir, is it women and children first…
… or Active Directory and Exchange?
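The question of what is needed first can be made concrete by deriving a recovery order from service dependencies. The sketch below is illustrative only: Active Directory and Exchange are named on the slide, but the other service names and every dependency edge are assumptions.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each service lists what must be recovered before it.
# Only "Active Directory" and "Exchange" are named on the slide; the other
# services and all of the edges are assumptions for illustration.
DEPENDS_ON = {
    "Active Directory": [],
    "Exchange": ["Active Directory"],
    "Filestore": ["Active Directory"],
    "Finance application": ["Active Directory", "Filestore"],
}


def recovery_order(depends_on):
    """Return services in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())


order = recovery_order(DEPENDS_ON)
print(order)
```

A real map would come from the BIA and from the technical teams; the point is only that priorities and dependencies together determine the sequence.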
The recurring dangers that we see
• IT recovery requirements haven’t been agreed with the business (through a BIA).
• IT recovery strategy isn’t joined up (i.e. a full end to end solution isn’t there).
• Strategy isn’t supported by plans and isn’t tested rigorously enough (resulting in inefficiencies and failures during actual recovery).
Fictitious live incident
Site layout (diagram):
• Warehouse and second server room (ground floor): backup SAN and tapes
• Offices and server room: 2nd (top) floor
• Main gate, visitor car parking, staff car parking, gardens, side gate (footpath)
• Network links shown: 100 Mbps, 100 Mbps and 1 Gbps
Critical systems:
• Recovery Time Objective: 24 hours
• Recovery Point Objective: 24 hours (disk to disk daily)
Non-critical systems:
• Recovery Time Objective: 5 days
• Recovery Point Objective: 1 day (local tape) and 7 days (offsite tape)
Incident timeline:
• 08:07 Fire
• 12:15 Servers onsite
• 12:45 Exec report
• 13:15 Start recovery
• 09:30 Server recovered?
• 11:45 Recovery stalled
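The scenario's figures allow a quick check of how much of the critical-systems RTO was used before recovery began. This is a minimal sketch: the clock times and the 24-hour RTO come from the slides, while the calendar date and the choice to start the RTO clock at the fire are assumptions.

```python
from datetime import datetime, timedelta

# Clock times and the 24-hour RTO are taken from the scenario slides.
# The calendar date is arbitrary (an assumption); the slides give times only.
INCIDENT_DATE = datetime(2015, 6, 17)
fire = INCIDENT_DATE.replace(hour=8, minute=7)
recovery_start = INCIDENT_DATE.replace(hour=13, minute=15)
rto_critical = timedelta(hours=24)

# Assumption: the RTO clock starts at the fire, not when recovery begins.
rto_deadline = fire + rto_critical
elapsed_before_recovery = recovery_start - fire
remaining_at_start = rto_deadline - recovery_start

print("RTO deadline:", rto_deadline.strftime("%d %b %H:%M"))
print("Elapsed before recovery started:", elapsed_before_recovery)
print("RTO time left when recovery started:", remaining_at_start)
```

Under these assumptions, 5 hours 8 minutes had passed before recovery started, leaving 18 hours 52 minutes of the 24-hour objective.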
Post Incident Review
• What went well? (Where were they fit?)
• What went badly? (Where were they unfit?)
• What could the IT manager have done differently during the recovery?
• What could the IT manager have done differently before the recovery?
IT Service Continuity Issues
Have you experienced any of the issues raised?
• Difficulty in getting board engagement.
• No business requirements for IT recovery (i.e. no BIA).
• Single points of failure in key skill sets.
• Lack of recovery documentation (perhaps no spare time to write it?)
• Lack of formal testing and test reporting.
• Any other issues?
The Barriers and Results
• What’s stopping you / stopped you from making changes?
• What would happen if changes aren’t made and you invoke?
• What would happen if you do make the changes?
The Steps to Successful IT Service Continuity
1. Engagement and sponsorship at a strategic level.
2. Balance between the technology and ITSC management.
3. Do all of ITSC, and run it as a repeating programme.
1. Strategy: Talk the Language of the Business
I need to upgrade the NAS by 5 terabytes and research getting an enhanced burstable pipe.
Err… good for you.
1. Strategy: Talk the Language of the Business
I’m concerned that our IT recovery could be
inadequate until business requirements are confirmed
in a BIA.
At present, our business may struggle to recover
from an IT outage.
What? We need to do something about this.
1. Strategy: Engage with the Executive Team
Does the Executive Team know:
• What are the impacts if IT fails?
• What are the risks associated with IT failure?
• What are the RTO and RPO of services, and what do these terms mean?
• What is the recovery and hand back process?
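One way to explain RTO and RPO to a non-technical audience is as two simple comparisons: how long recovery is expected to take against the RTO, and how old the newest usable backup could be against the RPO. The sketch below assumes that worst-case data loss equals the gap between backups; the function names and the 30-hour and 7-day figures are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """True if the worst-case data loss fits within the RPO.

    Simplifying assumption: worst-case loss equals the gap between backups,
    and every backup is complete and restorable.
    """
    return backup_interval <= rpo


def meets_rto(estimated_recovery_time, rto):
    """True if the estimated time to restore service fits within the RTO."""
    return estimated_recovery_time <= rto


# A daily backup against a 24-hour RPO, as in the workshop scenario.
daily_vs_24h = meets_rpo(timedelta(days=1), timedelta(hours=24))
# Hypothetical: a weekly backup against the same 24-hour RPO.
weekly_vs_24h = meets_rpo(timedelta(days=7), timedelta(hours=24))
# Hypothetical: a 30-hour recovery estimate against a 24-hour RTO.
slow_recovery = meets_rto(timedelta(hours=30), timedelta(hours=24))

print(daily_vs_24h, weekly_vs_24h, slow_recovery)
```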
Trap 1: The Scope Trap
I’ve tested Email and Filestore time and again.
I have complete confidence in their recovery.
Great, what about the other 48 IT services?
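The scope trap can be put as a number. The sketch below assumes a catalogue of 50 services, inferred from "the other 48" in the dialogue; the placeholder service names are hypothetical.

```python
# Assumption: 50 services in total, inferred from "the other 48" on the slide.
# Placeholder names stand in for the 48 services the slide does not name.
tested = {"Email", "Filestore"}
catalogue = ["Email", "Filestore"] + [f"Service {n}" for n in range(3, 51)]

untested = [service for service in catalogue if service not in tested]
coverage = (len(catalogue) - len(untested)) / len(catalogue)

print(f"Tested {len(catalogue) - len(untested)} of {len(catalogue)} services ({coverage:.0%})")
print(f"Untested: {len(untested)}")
```

On these assumed figures, confidence in two services amounts to 4% test coverage of the catalogue.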
Trap 2: The Audit Trap
Quick, we need to dust off the plans to satisfy the
auditor.
Then we can forget about ITSC again.
He’ll never know… ha ha!
Trap 3: The Importance and Urgency Trap
We’ve got ten projects going live this quarter.
There’s no time to fully implement and test IT DR,
as it will affect “go live” dates.
Well I suppose we can sort it out later.
We don’t want to get in the way of business strategy.
Trap 4: The Gambler’s (or Optimist’s) Trap
It’ll never happen…
If it does, we’ll be all right provided it happens on a
Monday and I’ve remembered to take the backup tapes home with
me.
Good odds eh?
I’m not bothered, I plan to win the lottery and retire
this week.
Trap 5: The Hero Trap
We’ll all pull together and work extra hours to nail it.
Sleep’s for wimps.
Yeah, it’s nothing that a load of pizza and energy
drinks can’t solve!
ITSC is a management system…
Business Continuity Management:
• BIA (department and supplier recovery requirements)
• BC Strategy
• Plans
• Test it
IT Service Continuity Management:
• IT recovery requirements
• ITSC Strategy
• Plans
• Test it
Repeat!
Where does ITSC fit in?
Chart axes: technology skill level; governance / documentation skill level
• Tech specialist
• ITSC consultant
• IT manager
• IT director
ITSC services
• Managed recovery
• Scenario exercise
• BIA IT recovery gap analysis
• IT Service Continuity plan
• IT recovery test management
• Technical recovery plans
• Current State Assessment