disasterrecoveryingres-wp-cn
TRANSCRIPT
-
8/2/2019 DisasterRecoveryIngres-WP-CN
1/12
Disaster recovery Planningfor ingressp i d i r
i i
WHITEPAPER
by Chip NiCkolett, iNgres CorporatioN
-
8/2/2019 DisasterRecoveryIngres-WP-CN
2/12
t c ::
table of contents:
3 pfc
3 ovvw
4 Dvn pn
4 aumn
5 Cumzn nd Cnun
5 rn D
6 tun
7 am !
9 summ
10 N
About the Author
Chip Nickolett is the Director o Consulting Services or the Americas at Ingres. He has over
23 years o business and I experience, and has been using Ingres RDBMS technology or over
20 years. Chip has been developing Disaster Recovery Plans since 1991, and he is a recognized
expert in perormance tuning and management o mission-critical Ingres installations.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
3/12
D r P i ::::
Disaster recovery Planning for ingres
PrefaceTe unexpected has just happened. Te cost and business impact to your company now
depends on just how well prepared you are. Tis white paper discusses key concepts and
considerations in planning and preparing or the post-disaster recovery o an Ingres installation.
overviewDisaster recovery planning can mean dierent things to dierent people. o many organizations,
it is as simple as having osite backups available. o other organizations it involves having warm
sites or machine restoration and inrastructure, and even workspace or business continuity. In
order to keep this article brie and ocused on Ingres, we will primarily review inormation and
issues specic to the recovery o an Ingres installation on an alternate machine.
Does your site require a Disaster Recovery Plan (DRP), and i so, why? Tis is a question that
only you can answer, but here are a couple o questions to think about: Does your organization
rely on applications that run on Ingres? What is the business impact o not having those
applications available? Is there a cost associated with not having these systems available? For
example, i your sales order entry and processing system is not available, what is the cost o lost
sales and/or products not delivered on schedule? What will the perception o your company be
i your customers are unable to obtain products or services when they need them? Are there
service level agreements that your organization has committed to? Te answers to these
questions can give you a good idea as to whether or not a disaster recovery plan is required or
your site.
What is the cost of a DRP? Tere is no simple answer for this, but assume that it will be
expensive. It will require plan development and testing, project management, additional
hardware and inrastructure (purchased, leased, or provided by a third party), and requent plan
review and testing. Generally a cost/benet analysis is perormed to determine i the cost o asignicant loss o service (generally > 24 hours) and/or loss o data exceeds the cost o developing
and maintaining a DRP.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
4/12
D r P i ::
DeveloPing a PlanPlan development is very important. Inormation needs to be comprehensive and extremely
detailed. Nothing should be le to chance. Te plan and procedures should be developed as i
someone with only minimal technical experience is going to execute them (as, indeed, may be
the case). Specic commands should be provided, along with representative output rom those
commands. Each specic procedure then provides background as to the goal o the procedure,
prerequisite steps, detailed instructions, and troubleshooting. A decision tree that is easy to read
and ollow (see example below), and that reerences specic sections within a procedure, should
be created along with the detailed procedures. Tese become living and breathing documents
that need to be reviewed and updated requently. Below are some specic issues that need to be
addressed or an Ingres DRP.
Do you know what your specic Ingres release and patch levels are? Is your site running with
any non-standard binaries/executables? Do you have the installation and patch media available
at an osite location? Has that media been validated? Without having this baseline product
inormation and installation media, your site could be orced to recover under a dierent
version o Ingres. Dont assume that you will be able to obtain the desired release or patch level
rom Ingres Corporation, since they primarily provide only current releases and patch levels.
Running an untested release/patch level has the potential to introduce new problems and should
thereore be avoided whenever possible.
assumPtionsFirst, it is assumed that your recovery machine will be o the same type and capacity as the
production machine. Te machine can be larger and/or have greater capacity, but trying to
recover on a dierent platorm or on a machine that has less capacity introduces numerous
issues and signicantly complicates this process. Te goal is to make the recovery machine look
and behave identically to the production machine. Tis includes peripheral devices such as tape
drives. For example, i the production machine perorms a parallel checkpoint to several tape
drives then ideally the recovery machine should have the same number o compatible drives,
and those tape devices should have the same system identication. Failure to do this will requiremanual intervention and likely script / command le modications.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
5/12
D r P i ::
customization anD configurationHave there been any site-specic modications? wo examples o this that I have encountered
several times are customized keymap les and customized checkpoint les. Other inormation
that should be gathered on a requent basis is symbol table inormation/logical denitions, a
list o physical locations and paths, inodb output or all databases (including iidbdb), copies
o the cong.dat and protect.dat les, and ASCII output rom the SQL statements help table *;
help index *; select * rom iile_ino;. Finally, dont orget about operating system conguration.
Ingres generally requires some OS tuning on most platorms to run and thereore it is important
to have that inormation readily available. Tis inormation will allow you to re-create your
production environment in the shortest possible amount o time, and provide additional
inormation required to troubleshoot problems.
restoring the Data
Once the machine is congured and Ingres is installed it will be time to restore the database.
Tis can either be perormed using the rollorwarddb process, or using the output rom
unloaddb (ASCII unloads, while larger, are preerred or portability and recoverability reasons).
Hopeully the media containing the recovery data has been validated and is good. It is good
practice to have more than one set o media available or recovery in the event o problems. Tis
could either be duplicate / mirror copies rom the same date, or a backup rom an earlier date.
Generally it is desirable to try to recover up to the point o ailure, and that requires current
journal les and the Ingres conguration le (aaaaaaaa.cn). One method o minimizing data
loss is to have a process that periodically copies these les to another machine, ideally at another
location.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
6/12
D r P i ::
troubleshootingCommon problems encountered during recovery on another machine include: missing or in-
correct directory path; incorrect directory permission or ownership; OS kernel parameters are
insufcient to support Ingres; the hostname is incorrect. Whenever possible the hostname o
the recovery machine should be the same as on the production machine. Tis is because In-
gres embeds that inormation in several locations and les. I using the same hostname is not
possible then Ingres will likely require manual changes to the conguration les beore it will
start. Tose changes including substitution o the new hostname or the old hostname in the
$II_SYSEM/ingres/les/cong.dat and protect.dat les, possible changes to the symbol
table (use ingprenv to view the current settings and ingsetenv to make changes). I your
installation uses Ingres/NE then you will also need to rename les in the $II_SYSEM/ingres/
les/name directory, again substituting the current hostname with the old hostname (this time or
the le name, not the contents). I you are still encountering problems then review the $II_SYS-
EM/ingres/les/errlog.log le to see what Ingres states the problem is.
With some o the older versions o Ingres (OpenIngres 2.0 to some early patch levels o
Advantage Ingres 2.5) may encounter problems with LicenseI, the license management system
used many years ago. LicenseI uses the MAC address on a machine to uniquely identiy it
or the purposes o authentication. Tis can cause problems with things like redundant network
cards on the production machine, and will almost denitely cause problems on a dierent
machine. Te very rst thing that should be done i a recovery scenario is anticipated is to
contact Ingres echnical Support and request a temporary license le (a.k.a. lieboat string).
I you are not using other CA products that use LicenseI then it is possible to just perorm a
resh, generic install and then overlay the non-license directories rom a backup tape (on Unix
the license directories are /ca_licand /usr/local/Calib). Tis will place your installation in the
30-day grace period and provide an extra buer or getting an authorized license le.
On even older Ingres 6.x or OpenIngres 1.x Installations you may need to contact Ingres
echnical Support or a lieboat string or the license string (ound in the symbol table and
displayed using the ingprenv command). Tese older versions are mentioned because many
are still in production - a testament to the production quality and enterprise strength o Ingres.
But, or the sake o support and recovery, it should be noted that it is always a best practice tohave production systems running currently support versions o products.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
7/12
D r P i ::
almost there!Once the procedures have been developed and unit tested, it is time or a comprehensive test
o the entire plan. During plan execution it is important to take good notes on everything that
is done or that occurs. Tis provides data or analysis in the event o problems (e.g., did a step
ail because a predecessor step was not executed?), including specic error messages and
inormation about the problem. It is important to collect this inormation as you go since it is
very easy to orget what was done or exactly what happened. It is also good practice to document
the amount o time required or each step. Tat will allow you to provide accurate estimates in
the event o a real recovery situation.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
8/12
D r P i ::
Once the Ingres installation is restored how will you know that it works? Te validation processneeds to include some type o data validation, validation that your customers can access their
applications, and that those applications can access the database. I any special equipment is
required (barcode scanners, printers, wireless devices, etc.) then it is important to test those
devices as well. Oen you will nd that little problems such as hostnames or IP addresses being
dierent cause startup or connectivity problems, or routing and/or rewall problems cause
remote access issues. Tese are the types o issues that the ull test o the plan is intended to
uncover. Tere is usually, but not always, a work-around or these problems. For that reason it is
important to develop a contingency plan or areas that are more prone to encountering problems.
And nally, once everything is up and running, has been validated, and is ready to go, your rstinstinct will probably be to allow users into the system, but resist! We recommend that you
checkpoint the database(s) and enable journaling. Tis provides the maximum amount o
protection while running at the hotsite, and makes it easier to transport the new database back
to the production machines once that environment has been restored. Aer the checkpoint
completes then it is sae to proceed with business as usual. Please note that this is now your
production environment and should therefore be treated as such. Te restored environment should
include routine maintenance and checkpoints, just as would be done in ordinary production.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
9/12
D r P i ::
summaryWith the proper planning it is possible to restore mission-critical systems in a minimal amount
o time. When care is taken to address the specic issues mentioned above, it is airly easy to
restore a production Ingres installation. Te procedures developed can also be used or things
such as system migrations, addressing minor ailures, and DBA training, so there is other value
associated with their development. A good DRP is similar to a lie insurance policy. Premiums
are paid on a regular basis with the hope o not having to take advantage o that policy. But, i
and when needed, that protection can prove to be priceless.
-
8/2/2019 DisasterRecoveryIngres-WP-CN
10/12
10D r P i ::
notes
-
8/2/2019 DisasterRecoveryIngres-WP-CN
11/12
11D r P i ::
notes
-
8/2/2019 DisasterRecoveryIngres-WP-CN
12/12
200 i cp. a d. Pd u.s.a. i dk i cp ud s d .
INGRES CORPORATION : 00 arguello street : suite 200 : reDwooD city, california 0PHONE0..00 : FAX 0..0 : www.ingres.com : f , @.
About Ingres Corporation
i cp d pd p d . b 2 , i d
d , pd p p d d x p . t p
pp d p pd i pp. i j dp, d pp
d, pp d ud s d .