disasterrecoveryingres-wp-cn

Upload: tingisha

Post on 05-Apr-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    1/12

    Disaster recovery Planningfor ingressp i d i r

    i i

    WHITEPAPER

    by Chip NiCkolett, iNgres CorporatioN

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    2/12

    t c ::

    table of contents:

    3 pfc

    3 ovvw

    4 Dvn pn

    4 aumn

    5 Cumzn nd Cnun

    5 rn D

    6 tun

    7 am !

    9 summ

    10 N

    About the Author

    Chip Nickolett is the Director o Consulting Services or the Americas at Ingres. He has over

    23 years o business and I experience, and has been using Ingres RDBMS technology or over

    20 years. Chip has been developing Disaster Recovery Plans since 1991, and he is a recognized

    expert in perormance tuning and management o mission-critical Ingres installations.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    3/12

    D r P i ::::

    Disaster recovery Planning for ingres

    PrefaceTe unexpected has just happened. Te cost and business impact to your company now

    depends on just how well prepared you are. Tis white paper discusses key concepts and

    considerations in planning and preparing or the post-disaster recovery o an Ingres installation.

    overviewDisaster recovery planning can mean dierent things to dierent people. o many organizations,

    it is as simple as having osite backups available. o other organizations it involves having warm

    sites or machine restoration and inrastructure, and even workspace or business continuity. In

    order to keep this article brie and ocused on Ingres, we will primarily review inormation and

    issues specic to the recovery o an Ingres installation on an alternate machine.

    Does your site require a Disaster Recovery Plan (DRP), and i so, why? Tis is a question that

    only you can answer, but here are a couple o questions to think about: Does your organization

    rely on applications that run on Ingres? What is the business impact o not having those

    applications available? Is there a cost associated with not having these systems available? For

    example, i your sales order entry and processing system is not available, what is the cost o lost

    sales and/or products not delivered on schedule? What will the perception o your company be

    i your customers are unable to obtain products or services when they need them? Are there

    service level agreements that your organization has committed to? Te answers to these

    questions can give you a good idea as to whether or not a disaster recovery plan is required or

    your site.

    What is the cost of a DRP? Tere is no simple answer for this, but assume that it will be

    expensive. It will require plan development and testing, project management, additional

    hardware and inrastructure (purchased, leased, or provided by a third party), and requent plan

    review and testing. Generally a cost/benet analysis is perormed to determine i the cost o asignicant loss o service (generally > 24 hours) and/or loss o data exceeds the cost o developing

    and maintaining a DRP.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    4/12

    D r P i ::

    DeveloPing a PlanPlan development is very important. Inormation needs to be comprehensive and extremely

    detailed. Nothing should be le to chance. Te plan and procedures should be developed as i

    someone with only minimal technical experience is going to execute them (as, indeed, may be

    the case). Specic commands should be provided, along with representative output rom those

    commands. Each specic procedure then provides background as to the goal o the procedure,

    prerequisite steps, detailed instructions, and troubleshooting. A decision tree that is easy to read

    and ollow (see example below), and that reerences specic sections within a procedure, should

    be created along with the detailed procedures. Tese become living and breathing documents

    that need to be reviewed and updated requently. Below are some specic issues that need to be

    addressed or an Ingres DRP.

    Do you know what your specic Ingres release and patch levels are? Is your site running with

    any non-standard binaries/executables? Do you have the installation and patch media available

    at an osite location? Has that media been validated? Without having this baseline product

    inormation and installation media, your site could be orced to recover under a dierent

    version o Ingres. Dont assume that you will be able to obtain the desired release or patch level

    rom Ingres Corporation, since they primarily provide only current releases and patch levels.

    Running an untested release/patch level has the potential to introduce new problems and should

    thereore be avoided whenever possible.

    assumPtionsFirst, it is assumed that your recovery machine will be o the same type and capacity as the

    production machine. Te machine can be larger and/or have greater capacity, but trying to

    recover on a dierent platorm or on a machine that has less capacity introduces numerous

    issues and signicantly complicates this process. Te goal is to make the recovery machine look

    and behave identically to the production machine. Tis includes peripheral devices such as tape

    drives. For example, i the production machine perorms a parallel checkpoint to several tape

    drives then ideally the recovery machine should have the same number o compatible drives,

    and those tape devices should have the same system identication. Failure to do this will requiremanual intervention and likely script / command le modications.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    5/12

    D r P i ::

    customization anD configurationHave there been any site-specic modications? wo examples o this that I have encountered

    several times are customized keymap les and customized checkpoint les. Other inormation

    that should be gathered on a requent basis is symbol table inormation/logical denitions, a

    list o physical locations and paths, inodb output or all databases (including iidbdb), copies

    o the cong.dat and protect.dat les, and ASCII output rom the SQL statements help table *;

    help index *; select * rom iile_ino;. Finally, dont orget about operating system conguration.

    Ingres generally requires some OS tuning on most platorms to run and thereore it is important

    to have that inormation readily available. Tis inormation will allow you to re-create your

    production environment in the shortest possible amount o time, and provide additional

    inormation required to troubleshoot problems.

    restoring the Data

    Once the machine is congured and Ingres is installed it will be time to restore the database.

    Tis can either be perormed using the rollorwarddb process, or using the output rom

    unloaddb (ASCII unloads, while larger, are preerred or portability and recoverability reasons).

    Hopeully the media containing the recovery data has been validated and is good. It is good

    practice to have more than one set o media available or recovery in the event o problems. Tis

    could either be duplicate / mirror copies rom the same date, or a backup rom an earlier date.

    Generally it is desirable to try to recover up to the point o ailure, and that requires current

    journal les and the Ingres conguration le (aaaaaaaa.cn). One method o minimizing data

    loss is to have a process that periodically copies these les to another machine, ideally at another

    location.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    6/12

    D r P i ::

    troubleshootingCommon problems encountered during recovery on another machine include: missing or in-

    correct directory path; incorrect directory permission or ownership; OS kernel parameters are

    insufcient to support Ingres; the hostname is incorrect. Whenever possible the hostname o

    the recovery machine should be the same as on the production machine. Tis is because In-

    gres embeds that inormation in several locations and les. I using the same hostname is not

    possible then Ingres will likely require manual changes to the conguration les beore it will

    start. Tose changes including substitution o the new hostname or the old hostname in the

    $II_SYSEM/ingres/les/cong.dat and protect.dat les, possible changes to the symbol

    table (use ingprenv to view the current settings and ingsetenv to make changes). I your

    installation uses Ingres/NE then you will also need to rename les in the $II_SYSEM/ingres/

    les/name directory, again substituting the current hostname with the old hostname (this time or

    the le name, not the contents). I you are still encountering problems then review the $II_SYS-

    EM/ingres/les/errlog.log le to see what Ingres states the problem is.

    With some o the older versions o Ingres (OpenIngres 2.0 to some early patch levels o

    Advantage Ingres 2.5) may encounter problems with LicenseI, the license management system

    used many years ago. LicenseI uses the MAC address on a machine to uniquely identiy it

    or the purposes o authentication. Tis can cause problems with things like redundant network

    cards on the production machine, and will almost denitely cause problems on a dierent

    machine. Te very rst thing that should be done i a recovery scenario is anticipated is to

    contact Ingres echnical Support and request a temporary license le (a.k.a. lieboat string).

    I you are not using other CA products that use LicenseI then it is possible to just perorm a

    resh, generic install and then overlay the non-license directories rom a backup tape (on Unix

    the license directories are /ca_licand /usr/local/Calib). Tis will place your installation in the

    30-day grace period and provide an extra buer or getting an authorized license le.

    On even older Ingres 6.x or OpenIngres 1.x Installations you may need to contact Ingres

    echnical Support or a lieboat string or the license string (ound in the symbol table and

    displayed using the ingprenv command). Tese older versions are mentioned because many

    are still in production - a testament to the production quality and enterprise strength o Ingres.

    But, or the sake o support and recovery, it should be noted that it is always a best practice tohave production systems running currently support versions o products.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    7/12

    D r P i ::

    almost there!Once the procedures have been developed and unit tested, it is time or a comprehensive test

    o the entire plan. During plan execution it is important to take good notes on everything that

    is done or that occurs. Tis provides data or analysis in the event o problems (e.g., did a step

    ail because a predecessor step was not executed?), including specic error messages and

    inormation about the problem. It is important to collect this inormation as you go since it is

    very easy to orget what was done or exactly what happened. It is also good practice to document

    the amount o time required or each step. Tat will allow you to provide accurate estimates in

    the event o a real recovery situation.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    8/12

    D r P i ::

    Once the Ingres installation is restored how will you know that it works? Te validation processneeds to include some type o data validation, validation that your customers can access their

    applications, and that those applications can access the database. I any special equipment is

    required (barcode scanners, printers, wireless devices, etc.) then it is important to test those

    devices as well. Oen you will nd that little problems such as hostnames or IP addresses being

    dierent cause startup or connectivity problems, or routing and/or rewall problems cause

    remote access issues. Tese are the types o issues that the ull test o the plan is intended to

    uncover. Tere is usually, but not always, a work-around or these problems. For that reason it is

    important to develop a contingency plan or areas that are more prone to encountering problems.

    And nally, once everything is up and running, has been validated, and is ready to go, your rstinstinct will probably be to allow users into the system, but resist! We recommend that you

    checkpoint the database(s) and enable journaling. Tis provides the maximum amount o

    protection while running at the hotsite, and makes it easier to transport the new database back

    to the production machines once that environment has been restored. Aer the checkpoint

    completes then it is sae to proceed with business as usual. Please note that this is now your

    production environment and should therefore be treated as such. Te restored environment should

    include routine maintenance and checkpoints, just as would be done in ordinary production.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    9/12

    D r P i ::

    summaryWith the proper planning it is possible to restore mission-critical systems in a minimal amount

    o time. When care is taken to address the specic issues mentioned above, it is airly easy to

    restore a production Ingres installation. Te procedures developed can also be used or things

    such as system migrations, addressing minor ailures, and DBA training, so there is other value

    associated with their development. A good DRP is similar to a lie insurance policy. Premiums

    are paid on a regular basis with the hope o not having to take advantage o that policy. But, i

    and when needed, that protection can prove to be priceless.

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    10/12

    10D r P i ::

    notes

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    11/12

    11D r P i ::

    notes

  • 8/2/2019 DisasterRecoveryIngres-WP-CN

    12/12

    200 i cp. a d. Pd u.s.a. i dk i cp ud s d .

    INGRES CORPORATION : 00 arguello street : suite 200 : reDwooD city, california 0PHONE0..00 : FAX 0..0 : www.ingres.com : f , @.

    About Ingres Corporation

    i cp d pd p d . b 2 , i d

    d , pd p p d d x p . t p

    pp d p pd i pp. i j dp, d pp

    d, pp d ud s d .