Datamatics Technologies Limited SSG – Disaster Recovery Plan
Disaster Recovery Plan --Software Support Group (SSG)
Version 2.1 Preparation Date: 22 Feb 07
Prepared by: Vilas Almeida, Ritesh Pandya
Reviewed and Approved by: Prasad Ramanathan
Signature of Reviewer and Approver:
Date: February 22, 2007
Version History
Ver. no. | Authors | Date | Reviewers | Date | Particulars / remarks / major changes from previous version
1.0 | Vilas Almeida | 27 Nov 05 | Prasad Ramanathan | 31 Dec 05 | Initial Version
1.1 | Prasad Ramanathan | 2 Jan 06 | None | | Made changes necessary to customize the plan for SSG
2.0 | Prasad Ramanathan | 13 Sept 06 | Vilas Almeida | 15 Sept 06 | Made changes necessary to make the plan conform to Disaster Recovery processes and integrate it with the company’s DR plan
2.1 | Ritesh Pandya | 22 Feb 07 | Prasad Ramanathan | 22 Feb 07 | Made changes in Critical Resources
Table of Contents
1 Introduction
2 Overview of BCP/ DR
  2.1 Objective
  2.2 Assumptions
  2.3 Development
  2.4 Testing
  2.5 Maintenance
3 General Information
  3.1 Mission
  3.2 Functional Organization Chart
  3.3 Employee List
  3.4 Key Non-Department Employees
4 Normal Operating Environment
  4.1 Business Processes
  4.2 Skill Matrix
  4.3 Vital Assets
  4.4 Dependencies
5 System Recovery Procedures
  5.1 Recovery Procedure
6 Preventive Maintenance
  6.1 Skill Matrix
  6.2 DR Practice Exercise
  6.3 Intra Group and Intergroup Training
  6.4 Team Contact Details
7 Emergency Operating Environment
  7.1 Disaster Centric Impacts
    7.1.1 Fire
    7.1.2 Natural Calamities and Man made disasters
    7.1.3 Non-Availability of Employees
    7.1.4 Hardware Breakdown
    7.1.5 Cyber Attack & Computer Virus
    7.1.6 Telecom link failure
    7.1.7 Power failure
    7.1.8 Failure of AC
  7.2 Resultant Scenarios
    7.2.1 Minor Breakdown
    7.2.2 Major Breakdown
8 Annexures
  8.1 Annexure I: Process Details
  8.2 Annexure II: Inventory of Hardware
  8.3 Annexure III: Assets Details
1 Introduction
Software Support Group (SSG) is an important functional group of Datamatics Technologies Limited (DTL), located at Technology Centre, that delivers and maintains software tools and provides technology services for the BPO production staff of DTL. SSG is also responsible for delivering enhancements to and maintaining enterprise software tools such as the company’s intranet sites and other tools and utilities used by various user groups within the company. The output from the SSG team enhances the productivity and quality of the production team’s output. A sound Business Continuity Plan (BCP)/ Disaster Recovery Plan (DR) is, therefore, necessary.
Since the role of the group is one of supporting internal customers, there is no specific Business Continuity Plan that has been envisaged for the group. In the worst case, if a disaster occurs and the SSG group becomes completely non-functional, activities of the production staff can still continue although at a reduced productivity and quality level. Nevertheless, this document describes the plan for SSG to recover from a disaster.
2 Overview of BCP/ DR
2.1 Objective
To lay down a comprehensive “Disaster Recovery Plan” covering functional activities of the Software Support Group.
2.2 Assumptions
It is assumed that all disasters related to infrastructure and operational services will be taken care of by the appropriate agencies. It is further assumed that the DR plan for the Operations Dept of DTL and the company-wide DR plan kick in first in the event of a disaster. The DR plan of the SSG can become operational only after those plans have taken effect.
2.3 Development
The SSG Head shall develop the DR plan with close support from the Operations team located in Knowledge Centre.
2.4 Testing
The SSG Head shall get the DR plan tested by conducting practice exercises at least once every calendar year.
2.5 Maintenance
The SSG Head shall maintain the DR plan by keeping its contents up to date in respect of roles, assets, resources and functional requirements.
3 General Information
3.1 Mission
This group is responsible for delivering and maintaining software tools to be used by the production staff of the BPO in Mumbai, Chennai and Livonia offices. These tools enhance the productivity and quality of the outputs generated by the production staff. Several workflow tools, intranet websites and utility tools are developed and maintained by SSG for use of the HR/ Admin/ Payroll/ Operations groups to improve the productivity and quality of service delivered by these groups.
3.2 Functional Organization Chart
MANISH MODI, CEO
  PRASAD RAMANATHAN, General Manager
    (OPEN POSITION), Manager - Software
      TEAM LEADS
        TEAM MEMBERS
    SHANTILAL NAKRANI, Asst. Manager - Software
      TEAM LEADS
        TEAM MEMBERS
    HITESH SHAH, Manager - Software
      TEAM LEADS
        TEAM MEMBERS
3.3 Employee List
The team members’ contact list can be obtained from http://dtl1/ssg (or http://172.1.254.23/ssg), under SSG Dash Board, then SSG Org Chart. If the machine DTL1 is not functional, the contact list can be obtained from the hard copies of personnel files maintained by HR at KC.
Following are the critical resources identified for the SSG team for the purposes of Disaster Recovery:
Critical Resources
Sr. No. | Name of Resource | Contact Number | Remarks
1. | Prasad Ramanathan | 25000332/ 93231 94737 |
2. | Shantilal Nakrani | 98206 64428/ 2292 2939 |
3. | Mukesh Tank | 93245 20164 |
4. | Mukesh Patel | 98674 04482 |
5. | Nishigandha | 98191 03696 |
6. | Rupesh Bavkar | 98193 70724 |
3.4 Key Non-Department Employees
Name & Address | Contact Details (Office, Fax, Residence, Mobile)
Manish Modi | 67971101, 32433009, 9867568010
TN Radhakrishna, Global Head HR, 602, Tirupati Apts, 18th Road, Khar W, Mumbai 400052 | 67971156, 28343669, 28343669, 9870055516
Krishan Moorjani, Manager (Admin) |
Ajay Singh, Head of Operations |
4 Normal Operating Environment
4.1 Business Processes
Refer to Annexure I for process details.
Under normal conditions, most of the operations of the SSG group are carried out from the Mumbai office (Technology Centre building).
4.2 Skill Matrix
The skill matrix is maintained by the Training Cell and updated periodically.
4.3 Vital Assets
Refer to Annexure III for Assets
4.4 Dependencies
In the following diagram showing various network connectivities, the only ones that impact the operations of the SSG team are the following:
a) LAN connecting various servers and desktops within the KC and TC buildings, particularly on the TJAM domain
b) WAN between the Mumbai office and the DTL Chennai office
c) WAN between the Mumbai office and the Livonia office
The following are some of the important assets impacting the SSG group:
Building & Furniture
1. Technology Centre housing all the staff
2. Knowledge Centre housing the various servers hosting applications including Prism+
Resources
A team of about 35 trained resources with multiple skill sets.
Hardware
Refer to Annexure II for Inventory
Communication Lines
2 Mbps Link between KC and Livonia
1 Mbps Link between KC and Chennai
5 System Recovery Procedures
“PrismServer” and “SSGVSS” are two server computers that are critical to the work of the SSG team.
1. PrismServer: All important source code/ executables/ installables for software applications are “released” via PrismWorkbench and stored on PrismServer at the following path: \\prismserver\drndbin\rnduser. This is typically done when software development has been completed.
2. SSGVSS: On a daily basis, all source code and supporting files are checked into SSGVSS by each individual from his/ her individual computer.
The SSG SQA person is responsible for backing up the contents of \\prismserver\drndbin\rnduser and \\ssgvss\ssg to DVDs on a weekly basis. Daily incremental backups of \\prismserver\drndbin\rnduser are taken on \\ssgvss\d$\PrismServerBackup. Similarly, daily incremental backups of \\ssgvss are maintained on \\fcub026\VSSSERVERBackup. (Refer to the backup plan of SSG for additional details.) In case the PrismServer and/or SSGVSS machines go down for any reason (OS crash, hard disk crash, etc.), it should be possible to restore the contents of these machines using the weekly DVD backups and the incremental backups. The weekly backup DVDs are stored in KC and are refreshed every week.
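The restore path described above (the latest weekly full backup, followed by the daily incrementals taken after it) can be sketched as follows. This is a minimal illustration, not SSG's actual backup tooling: the function name `restore_chain` and the date-based bookkeeping are hypothetical, and the sketch assumes that each incremental backup holds only the changes made since the previous backup, which the plan does not spell out.

```python
from datetime import date

def restore_chain(full_backups, incremental_backups, target):
    """Return the backups to apply, oldest first, to rebuild the state as of `target`.

    full_backups / incremental_backups are iterables of dates on which a
    backup was taken (hypothetical bookkeeping, assumed for this sketch).
    """
    # Pick the most recent full backup taken on or before the target date.
    fulls = [d for d in full_backups if d <= target]
    if not fulls:
        raise ValueError("no full backup on or before the target date")
    base = max(fulls)
    # Apply every incremental taken after that full backup, up to the target.
    incrementals = sorted(d for d in incremental_backups if base < d <= target)
    return [("full", base)] + [("incremental", d) for d in incrementals]
```

With weekly fulls and daily incrementals, the chain never contains more than one full backup plus about a week of incrementals, which keeps the restore bounded.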
5.1 Recovery Procedure
- Hardware breakdown (server malfunction, drive crash, etc.): the machine’s hard disk can be replaced within 24 hours. This includes the time required to restore the Operating System from the backup and then restore the contents relevant to SSG onto the machine.
- OS malfunctions in any of the servers can typically be rectified in an hour’s time.
- Data corruption on the servers can be handled through restoration from the previous day’s backup in a maximum of 4 hours’ time.
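The recovery times quoted in section 5.1 can be recorded as a small lookup so that an actual recovery can be compared against them after a drill or an incident. This is only a sketch: the dictionary keys and the helper `within_target` are hypothetical names, and the figures are simply the ones stated in the plan (24 hours, 1 hour, 4 hours).

```python
from datetime import timedelta

# Recovery time targets as stated in section 5.1. The keys are
# hypothetical labels chosen for this sketch.
RECOVERY_TARGETS = {
    "hardware_breakdown": timedelta(hours=24),
    "os_malfunction": timedelta(hours=1),
    "data_corruption": timedelta(hours=4),
}

def within_target(failure_type: str, elapsed: timedelta) -> bool:
    """Return True if the elapsed recovery time is within the stated target."""
    return elapsed <= RECOVERY_TARGETS[failure_type]
```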
6 Preventive Maintenance
As per the Disaster Recovery plan of Datamatics Technologies Limited, the execution and monitoring of BCP/ DR plans in respect of Building & Furniture, Hardware, Software, Communication Lines (Leased) and Tele-communication lines do not fall under the purview of the Group Head. However, in respect of resources, the Group Head is responsible for the following activities, which are currently ongoing in SSG:
- Maintenance of SSG’s personnel skill matrix
- Practice of SSG’s DR drill
- Intra group training for multiple tasks required during DR
- Inter group training for cross functional tasks during DR
- Maintenance of addresses and contact numbers of critical resources
6.1 Skill Matrix
The skill matrix is maintained by the Training Cell and updated periodically.
6.2 DR Practice Exercise
In case of a requirement, the SSG team may operate out of the Knowledge Centre building. The following tests shall be carried out once every year as a practice for DR:
a) Connectivity to the TJAM domain from the workstations at KC can be established.
b) Data transfer to/from PrismServer and SSGVSS is smooth.
c) Source code and executables can be restored from the DVDs.
d) Software development using MS Visual Studio can happen seamlessly.
e) Third party tools can be installed and accessed from the workstation.
For the purpose of the above tests, we shall start with the assumption that a computer with the configuration mentioned in Annexure II is made available to us along with the standard suite of software that is available on all DTL computers (e.g. MS Office, Symantec Anti-Virus, Screen Saver, etc.). For SSG’s use, this computer should have Microsoft Visual Studio 6.0 and Visual Studio .Net 2005 installed on it. It shall also be assumed that the Operations group ensures that this machine is on the TJAM domain and that network connectivity from the computer is available. Access to a DVD drive should also be made available.
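The five checks of the yearly practice exercise can be tracked with a small checklist runner such as the one sketched below. Everything here is illustrative: `run_drill` and `drill_passed` are hypothetical helpers, and the individual checks are supplied by the caller as plain callables, since the plan does not describe how each test is actually performed.

```python
def run_drill(checks):
    """Run each (label, check) pair and record whether it passed.

    A check passes only if it returns a truthy value without raising.
    """
    results = {}
    for label, check in checks:
        try:
            results[label] = bool(check())
        except Exception:
            # A check that fails with an error counts as a failed drill item.
            results[label] = False
    return results

def drill_passed(results):
    """The drill as a whole passes only if every item passed."""
    return all(results.values())
```

A caller would pass one entry per drill item, for example `("a) TJAM domain connectivity", some_check)`, and keep the returned dictionary as the drill record.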
6.3 Intra Group and Intergroup Training
This is conducted regularly through periodic knowledge sharing sessions, which most of the team members attend.
6.4 Team Contact Details
The team member contact list can be obtained from HR at KC. Alternatively, it can be obtained from http://dtl1/ssg (or http://172.1.254.23/ssg), under SSG Dash Board, then SSG Org Chart.
7 Emergency Operating Environment
To create an environment that facilitates all business operations during a disaster, we need to understand the impact of various disasters. This would help in creating an environment suitable for smooth business operations and continuity of all business related deliveries. The following paragraphs depict the impact of various possible disasters on the SSG Group:
7.1 Disaster Centric Impacts
7.1.1 Fire
- Building & Furniture
- Resources
- Hardware
- Software
- Communication Lines
7.1.2 Natural Calamities and Man made disasters
- Building & Furniture
- Resources
- Hardware
- Software
- Communication Lines
7.1.3 Non-Availability of Employees
- Resources
7.1.4 Hardware Breakdown
- Servers
- Work Stations
- Communication Lines
7.1.5 Cyber Attack & Computer Virus
- Servers
- Work Stations
- Operating Systems
- Data
7.1.6 Telecom link failure
- Communication Lines
- Network disruption
- Data link
7.1.7 Power failure
- Hardware
- Communication Lines
7.1.8 Failure of AC
- Hardware
- Communication Lines
7.2 Resultant Scenarios
The resultant scenarios of the above-mentioned impacts can be classified as minor or major breakdowns. Minor breakdowns can mostly be taken care of by Operations and Administration and may not involve any relocation of resources. Operations and Administration would also take care of major breakdowns, but these would also require active participation of the SSG Group. A major breakdown may or may not call for the relocation of resources. The possible reasons for these breakdowns and their associated recovery plans are discussed in the succeeding paragraphs.
7.2.1 Minor Breakdown
The Group is unable to perform business operations for a short duration (2-3 hours) due to power failure, non-availability of air-conditioning, breakdown of servers, malfunction of individual workstations, temporary network disruption, data link failures, etc. These are routine phenomena and generally do not affect business operations. The system recovery procedure has already been discussed in section 5. While restoration of power and air-conditioning will be taken care of by Administration, restoration of network and data links will be handled by Operations.
7.2.2 Major Breakdown
The business operations of the Group are affected by any of the disaster centric impacts and the restoration process may take a longer time (more than a day). Under these circumstances, the recovery plan would depend upon the magnitude of the disaster, taking into account its volume and gravity. The Group will handle these outages as follows:
7.2.2.1 Non-availability of Building & Furniture
If Technology Centre (TC) is not operational but Knowledge Centre (KC) is operational or vice versa
In case TC is damaged due to fire or any other natural calamity, efforts will be made to relocate the human resources in KC. This arrangement may also involve procurement of new workstations in case the same are damaged along with the infrastructure. Workstations will have to be provided to these resources and operations will have to ensure their connectivity to existing Servers. Depending on the extent of the damage, time to get new machines for each of the human resources may vary.
In case there is damage at KC, if the PrismServer is available, then operations of the SSG team can continue uninterrupted. However, if the PrismServer becomes unavailable, then Operations will have to provide a new machine, restore its contents from backup tapes/ DVDs and then make it available to the SSG team. This operation can take up to 8 hrs after the availability of a new server machine.
All other incidents (such as both TC and KC not being available) will be considered as a major disaster and SSG does not intend to have a BCP for such a situation. Under such circumstances, the SSG group shall wait for the BCP/ DR of the organization to kick in and subsequently decide on a suitable course of action.
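The building scenarios in this section reduce to a small decision table, sketched below. The function `site_response` and its return strings are hypothetical, written only to summarise the cases above. Two branches go beyond what the plan states and are assumptions of the sketch: the case where both buildings are available, and the case where TC is down while PrismServer is also unavailable at KC (treated here the same as any other loss of TC).

```python
def site_response(tc_available: bool, kc_available: bool,
                  prismserver_available: bool) -> str:
    """Summarise the planned response to loss of TC and/or KC."""
    if not tc_available and not kc_available:
        # Treated as a major disaster: SSG has no plan of its own here.
        return "wait for the organization's BCP/ DR to kick in"
    if not tc_available:
        # TC houses the staff; KC is still usable.
        return "relocate staff to KC (new workstations may be needed)"
    if not kc_available:
        if prismserver_available:
            return "continue operations uninterrupted"
        return ("Operations provides a new machine and restores PrismServer "
                "from backups (up to 8 hrs once a machine is available)")
    # Assumption: both buildings available, so this section does not apply.
    return "no relocation needed"
```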
7.2.2.2 Non-availability of Resources
Non-availability of resources could be partial or total and could be attributed to any of the following reasons:
- Accident while at work
- Total disruption of traffic due to heavy storm/rain/flooding
- Political “bandh” leading to non-availability of transport
- Railway/Transporter’s strike
The Group head will liaise with administration for alternate means of transportation of its resources in case the conventional mobility is not likely to be restored within a day or two. The resources will be contacted telephonically (land line / cellular phones) and they will be advised to assemble at pre-defined pick-up points.
Instituting processes and documentation will enable new resources to understand the development methodology. The processes and methodologies that need to be followed for software development are available as part of the QMS documentation. Reading these documents will make the new resources aware of how to go about developing new applications.
7.2.2.3 Non-availability of Servers
The servers may become unavailable due to one of the following reasons:
- Permanent damage due to fire / natural calamity
- Hard disk crash
- Failure of a major component such as the PSU
- Malfunction of OS
In case of permanent damage, servers will have to be replaced on priority, followed by installation of OS/application software/data from the licensed copies / back-up tapes. Operations will either provide the servers from reserve stock, if any, or alternatively new machines will be hired / purchased.
8 ANNEXURES
8.1 ANNEXURE 1 : Process Details
BPO Production Support
New Application Development Process
1 Review specs released in PrismWorkbench
2 Allocate Human Resource
3 Meeting with production staff to clarify requirements
4 Design and code (re-use earlier code, if available)
5 Code Reviews (if critical)
6 Functional Verification by SSG testers/ production staff
7 Re-work (if any) based on feedback
8 Release of software application through PrismWorkbench
Maintenance Process
1 Change to existing application requested by production staff
2 Meeting with production staff to understand the requirement thoroughly
3 Locate source code and make changes
4 Test the changes to see if the change request requirement is met
5 Regression test (to verify that no pre-existing function is broken)
6 Release software for verification by production team
Enterprise Tools (e.g. intranet applications for Payroll, HR, Operations)
1 Review requirements specified by mail or understand requirements by face-to-face meeting
2 Prepare high level estimate
3 Design and code
4 Testing
5 Release for verification
6 Re-work (if any) based on feedback
7 Official roll-out of software
8.2 ANNEXURE II: Inventory of Hardware
SERVER | CONFIG. | RAM | HDD | FDD & CDR | BACKUP MEDIA | BACKUP FREQ. | BACKUP LOCATION | BACKUP RESPON.
PrismServer (owned by Operations) | | | | | HDD | Daily Incremental | \\ssgvss\d$\prismserverbackup | SSG SQA person
PrismServer (owned by Operations) | | | | | DVDs | Weekly Full | Set of 3 DVDs | SSG SQA person
SSG VSS (owned by SSG) | | | | | HDD | Daily Incremental | \\fcub026\Vsserverbackup | SSG SQA person
SSG VSS (owned by SSG) | | | | | DVD | Weekly Full | One DVD | SSG SQA person

WORKSTATIONS:
Desktop PCs: Pentium Processor (one per individual in the SSG group), approx. 35 persons in Mumbai office
Config.: ~3 GHz clock speed
RAM: 512 MB
HDD: 40 GB
FDD & CDR: Not available
Backup: Not available. Not required, as all source code and released software is on the Prism+ server/ VSS.
8.3 ANNEXURE III: Assets Details
Assets | Storage Locations | Requirement | Tool | Categorization of Risk
1.0 Documentation | | | |
1.1 Software specifications | On Prism+ | | | Critical
1.2 Work Schedule / Project Plan | On Prism+ | Needs Daily back up | | Necessary
1.3 Source Code | On Prism+ & VSS | Needs Daily back up | MS Visual Studio | Critical
1.4 Reports | On Prism+ | Needs Daily back up | |
1.4.1 - Estimated vs Actual time | On Prism+ | Needs Daily back up | | Necessary
1.4.2 - Pending activities | On Prism+ | Needs Daily back up | | Necessary
1.4.3 - Allocated activities | On Prism+ | Needs Daily back up | | Necessary
1.5 Released software | On Prism+ | Needs Daily back up | Stand-alone set up | Critical
1.6 Team Structure | With SSG Head | Needs Daily back up | MS Powerpoint | Necessary
1.7 Quality Documents | ln_Intranet Server | Needs Daily back up | MS Word | Essential
2.0 Mail Server | Local Server | | Mail Exchange server | Critical
3.0 Communication | | | Telephones & Audio/Video conferencing facilities | Essential
4.0 Training | | | |
4.1 Project Skills Matrix | Training Department | | | Essential
4.2 Training aids | training room | | Projector | Necessary
5.0 Resources (Human) | | | | Critical