how to survive a disaster with rman

Upload: gustavo-rene-antunez

Post on 05-Dec-2014

Backup and Recovery

HOW TO SURVIVE A DISASTER WITH RMAN

Gustavo René Antúnez, The Pythian Group

ABSTRACT

Oracle Recovery Manager (Oracle RMAN) has been a feature of the Oracle Database since version 8.0, and since then it has grown into a robust method to back up and recover your database. But with a DBA's day consumed by performance issues, user requests and even matters unrelated to the database, such as meetings, a backup is normally set up and then, more often than not, nobody tests what would happen, and what you would do, if a disaster struck. This paper illustrates how to set up a proper RMAN backup configuration and walks through real-life RMAN disaster recovery scenarios, so that if you are ever faced with this adversity you come out on top.

TARGET AUDIENCE

This paper will benefit anyone who is beginning to use Oracle’s Recovery Manager, as well as those who already use it but want to understand the intricacies of how a configuration setting can hurt the recovery of their database.

EXECUTIVE SUMMARY

The learner will be able to:

• Gain insight into Oracle RMAN setup choices that minimize the possibility of error during recovery

• Verify the integrity of a backup strategy

• See, through examples, how these skills can help solve a real-life disaster

HOW TO SURVIVE A DISASTER WITH RMAN

In July 2012 the IOUG released its “2012 IOUG DATABASE AVAILABILITY SURVEY”, in which it questioned 358 data managers and professionals about their data availability and how they have planned to keep it up and running. A couple of points stand out:

• 45% cite human errors as the leading cause of unplanned outages

• About 40% say they experienced, in total, one day or more of unplanned outage last year

• 44% have active efforts underway to mitigate their levels of unexpected downtime

• 18% report that their organizations have no disaster recovery plan (DRP), while 37% have a DRP for only some types or areas of data.

• 34% estimate that they would need more than a business day (exceeding eight hours) to have everything up and running in case of a disaster.

One point that stands out to me is that 34% of the people surveyed said they would need more than a business day to have their data available again. With the ever-increasing need for data to be available 24/7, this has a real impact on the profits and needs of your organization.

But the question that comes to mind is: what if you could reduce that time just by having the right understanding of how Oracle’s Recovery Manager (RMAN) works? That understanding can and will help you reduce your mean time to recover.

1 Session # 196

The first thing we have to do is understand what RMAN is and how it works. Recovery Manager, better known as RMAN, is an Oracle client utility that is installed with the Enterprise or Standard Edition; you can also find it under the Admin option when installing the Oracle Client.

In its most basic form, the RMAN client connects, as a user with the SYSDBA privilege, to the database that is being backed up, which is called the TARGET database. The client is an executable that is normally found in $ORACLE_HOME/bin.

Because it is a client-side utility, one RMAN executable version can back up current and previous versions of the Oracle Database. There are some restrictions, though, which can be verified in the MOS document RMAN Compatibility Matrix [ID 73431.1].

The RMAN executable uses a file called recover.bsq, located in $ORACLE_HOME/rdbms/admin. Basically, the executable interprets the commands you give it, directs server sessions to execute those commands, and records its activity in the control file of the TARGET database that is being backed up.

Two main SYS packages do the work of backup and recovery. DBMS_RCVMAN has the procedures that, to name a few, list your database incarnations, set the until time and recovery window, and list your backups. DBMS_BACKUP_RESTORE, as you might have guessed, performs the backup and recovery operations, such as creating the control file snapshot, backing up the datafiles and backing up the spfile.

As mentioned above, the RMAN client directs the server sessions to execute the commands through channels. A channel represents one stream of data to a device and corresponds to one database server session. The channel reads data into PGA memory, processes it, and writes it to the output device.

The work of each channel, whether of type disk or System Backup Tape (SBT), is subdivided into the following distinct phases:

• Read Phase

A channel reads blocks from disk into input I/O buffers. The allocation of these buffers depends on the number of datafiles being read simultaneously from disk and written to the same backup piece. One way to control the number of files is the backup parameter FILESPERSET.

• Copy Phase

A channel copies blocks from input buffers to output buffers and performs additional processing on them. This includes validating the data blocks, to verify that it is not backing up corrupt blocks; it is also the phase where binary compression and backup encryption take place.

• Write Phase

A channel writes the blocks from output buffers to storage media. The write phase can be either to SBT or to disk, and these are mutually exclusive, meaning you write to one or the other, not both.

As the phases above show, what distinguishes RMAN from any other method is that the backup works at the block level. Compared with user-managed backups this brings a great advantage, as RMAN does not have to back up empty blocks.
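The three phases can be pictured with a small toy model. This is only an illustrative sketch and not how RMAN is implemented: the `Block` class, the use of `zlib` for checksums and compression, and the convention that an empty payload stands for an empty block are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    number: int      # block number within the data file
    payload: bytes   # block contents; empty bytes stand for an empty block
    checksum: int    # checksum recorded when the block was written


def make_block(number, payload):
    """Build a block whose stored checksum matches its payload."""
    return Block(number, payload, zlib.crc32(payload))


def read_phase(datafile):
    """Read phase: pull blocks into the input buffer, skipping empty ones."""
    return [blk for blk in datafile if blk.payload]


def copy_phase(input_buffer):
    """Copy phase: validate each block, then compress it into the output buffer."""
    output_buffer = []
    for blk in input_buffer:
        if zlib.crc32(blk.payload) != blk.checksum:
            raise ValueError(f"corrupt block {blk.number}")
        output_buffer.append((blk.number, zlib.compress(blk.payload)))
    return output_buffer


def write_phase(output_buffer, backup_piece):
    """Write phase: append the processed blocks to the backup piece."""
    backup_piece.extend(output_buffer)
    return len(output_buffer)


# Hypothetical three-block data file; block 2 is empty.
datafile = [make_block(1, b"header"), make_block(2, b""), make_block(3, b"rows" * 100)]
backup_piece = []
written = write_phase(copy_phase(read_phase(datafile)), backup_piece)
print(written)  # 2 (the empty block 2 was skipped)
```

A file-level copy would have carried all three blocks; working block by block is what lets the model skip the empty one and reject a corrupt one before it reaches the backup piece.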

When you decide to write to SBT, Oracle uses its Media Management Layer (MML) to allow RMAN to communicate with third-party vendors and write the resulting backup piece to the sequential media device. Oracle uses a library located in $ORACLE_HOME/lib called libobk.so (orasbt.dll on Windows), which is really a symbolic link to the actual third-party media manager library. For example, to set up Oracle RMAN 11.2.0.3 on 64-bit RHEL 5 with Veritas NetBackup, you would do the following:

cd $ORACLE_HOME/lib

mv libobk.so libobk.so.orig

ln -s /<install_path>/netbackup/bin/libobk.so64 libobk.so

Establishing this link only enables RMAN to pass commands to the MML, which in turn interacts with the media manager. To complete a backup to SBT you also have to supply the necessary parameters in the channel allocation, through the PARMS parameter.

These parameters are vendor specific. For Veritas NetBackup, for example, you have parameters such as NB_ORA_SERV, NB_ORA_CLIENT and NB_ORA_POLICY, while for Tivoli Storage Manager you point the parameter TDPO_OPTFILE at a file called tdpo.opt, in which you set the appropriate settings for that vendor's media manager.

As you can see below, the PARMS parameter is specified when you either ALLOCATE CHANNEL or CONFIGURE CHANNEL.

RMAN> run

2> {

3> ALLOCATE CHANNEL CH1 DEVICE TYPE 'SBT_TAPE' PARMS 'SBT_LIBRARY=<NBU_install_path>/netbackup/bin/libobk.so64,ENV=(NB_ORA_SERV = MASTER_SERVER,

4> NB_ORA_POLICY=5week,NB_ORA_CLIENT=CLIENT_SERVER)' format '%d_TAPE_%I_%s_%T_%t';

5> backup backupset all;

6> }

using target database control file instead of recovery catalog

allocated channel: CH1

channel CH1: SID=58 instance=TESTDB1 device type=SBT_TAPE

channel CH1: Veritas NetBackup for Oracle - Release 6.5 (2010042404)

WHAT DO YOU NEED TO BACKUP?

When you consider using Oracle’s RMAN utility, there are files that are critical to back up so that you are able to restore and recover your database if a disaster occurs:

• Data files

• Control files

• Archived redo logs (if the database is running in ARCHIVELOG mode)

But let’s say a full-on disaster happens, in which you lose not only your database but also your site or your server. Some additional files are then needed so that you can have your data available as soon as possible:

• Parameter file (pfile¹ or spfile)

• Block change tracking file¹

• ORACLE_HOME/GRID_HOME¹

• tnsnames.ora / listener.ora / sqlnet.ora¹

These additional files are not critical for RMAN to back up, but they are very important in a full disaster: without a backup of them you would need to rebuild your ORACLE_HOME / GRID_HOME from scratch, taking more time to make the data available to your client.

¹ These files are not backed up by Oracle’s RMAN utility; you need to use a separate method to back them up.

BENEFITS OF THE RMAN CATALOG

When you use Oracle’s RMAN backup utility, all of the backup metadata about the database files and the backups that have been taken is stored in the RMAN repository. By default this repository is the control file of the target database, but the metadata can also be stored in an RMAN recovery catalog. The catalog is a schema, typically kept in a database of its own; although it has the inconvenience of another database to maintain, the benefits normally outweigh that disadvantage.

An RMAN recovery catalog adds redundancy for the RMAN repository that is stored in the control file, so if you were to lose all of your control files, the metadata of your backups would still exist in the catalog. A recovery catalog also centralizes this metadata for all the Oracle databases that you back up in your organization, making backup reporting and administration tasks easier to perform.

One of the greatest benefits of an RMAN recovery catalog is that it keeps backup history longer than the control file can. If you ever have to change the DBID of your database, all the backup information belonging to the current and previous parent incarnation² would be lost; the only way to keep a history of your backups is to have an RMAN recovery catalog. On the same topic, setting CONTROL_FILE_RECORD_KEEP_TIME to a low value means that control file records can be lost when Oracle needs the space in the control file, as the backup records are reusable records.

RMAN> list backup summary;

List of Backups

===============

Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag

------- -- -- - ----------- -------------------- ------- ------- ---------- ---

1 B F A DISK 16-FEB-2013 04:55:14 1 1 NO TAG20130216T045452

2 B F A DISK 16-FEB-2013 05:02:07 1 1 YES FULL_BACKUP

3 B A A DISK 16-FEB-2013 05:02:27 1 1 YES ARCH_BACKUP

4 B F A DISK 16-FEB-2013 05:02:44 1 1 NO CTL_BACKUP

5 B F A DISK 16-FEB-2013 05:02:56 1 1 NO TAG20130216T050247

RMAN> shutdown immediate

[email protected] [TESTDB1] /home/oracle/Desktop

oracle $ nid target=sys/oracle@TESTDB

Connected to database TESTDB (DBID=2590798818)

Connected to server version 11.2.0

Control Files in database:

+DB_DATA/testdb/controlfile/control01.ctl

+DB_DATA/testdb/controlfile/control02.ctl

Change database ID of database TESTDB? (Y/[N]) => y

² A separate version of a database. The incarnation of the database changes when you open it with the RESETLOGS option, but you can recover backups from a prior incarnation as long as the necessary redo is available.

Database ID for database TESTDB changed to 2590844408.

All previous backups and archived redo logs for this database are unusable.

Database is not aware of previous backups and archived logs in Recovery Area.

Database has been shutdown, open database with RESETLOGS option.

Succesfully changed database ID.

DBNEWID - Completed succesfully.

[email protected] [TESTDB1] /home/oracle/Desktop

oracle $ sqlplus

TESTDB1> startup mount

ORACLE instance started.

Database mounted.

TESTDB1> alter database open resetlogs;

Database altered.

[email protected] [TESTDB1] /home/oracle/Desktop

oracle $ rman target /

connected to target database: TESTDB (DBID=2590844408)

RMAN> list incarnation;

List of Database Incarnations

DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time

------- ------- -------- ---------------- --- ---------- ----------

1 1 TESTDB 2590844408 PARENT 735231 16-FEB-2013 04:53:05

2 2 TESTDB 2590844408 CURRENT 737037 16-FEB-2013 05:05:47

RMAN> list backup summary;

using target database control file instead of recovery catalog

List of Backups

===============

Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag

------- -- -- - ----------- -------------------- ------- ------- ---------- ---

1 B F A DISK 16-FEB-2013 05:08:00 1 1 NO TAG20130216T050739

CONSIDERATIONS BEFORE CONFIGURING YOUR RMAN BACKUPS

As discussed above, there are several considerations before you even start taking your first RMAN backup:

1. ARCHIVELOG vs. NOARCHIVELOG

When deciding whether you are going to take a cold (consistent) or hot (inconsistent) backup, you have to verify the LOG_MODE of your database, as you cannot run a hot backup on a database running in NOARCHIVELOG mode.

2. Use of an RMAN Recovery Catalog

Do you have enough databases to require a centralized reporting schema? Do you have a problem maintaining another high availability database? Do you require a secondary backup metadata repository?

TESTDB1> CREATE TABLESPACE catalog;

Tablespace created.

TESTDB1> CREATE USER cat_user IDENTIFIED BY cat_user_password

2 DEFAULT TABLESPACE catalog;

User created.

TESTDB1> GRANT connect, resource, recovery_catalog_owner TO cat_user;

Grant succeeded.

[email protected] [TESTDB1] /home/oracle/Desktop

oracle $ rman catalog=cat_user/cat_user_password

RMAN> CREATE CATALOG;

recovery catalog created

3. Full or Incremental backups

a. Full backup. - Includes every allocated block in the file being backed up

b. Incremental Backups

• Level 0 is the base for subsequent incremental backups

• Level 1 incremental backup can be either of the following types:

i. Differential incremental backup. - Backs up all blocks changed after the most recent incremental backup at level 1 or 0.

ii. Cumulative incremental backup. - Backs up all blocks changed after the most recent incremental backup at level 0.
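The difference between the two level 1 types comes down to which earlier backup is used as the baseline. The sketch below illustrates this with made-up block numbers and day offsets; the function name and data layout are hypothetical, and real RMAN compares block SCNs rather than days.

```python
def blocks_to_back_up(changes, history, kind):
    """Return the block numbers a level 1 backup would include.

    changes: dict mapping block number -> time the block last changed
    history: list of (time, level) for earlier incremental backups
    kind:    "differential" or "cumulative"
    """
    if kind == "differential":
        # Baseline is the most recent level 1 or level 0 backup.
        baseline = max(time for time, level in history)
    elif kind == "cumulative":
        # Baseline is the most recent level 0 backup.
        baseline = max(time for time, level in history if level == 0)
    else:
        raise ValueError(f"unknown backup kind: {kind}")
    return sorted(blk for blk, time in changes.items() if time > baseline)


# Level 0 taken at day 0, level 1 backups taken at day 1 and day 2.
history = [(0, 0), (1, 1), (2, 1)]
# Block 10 changed at day 0.5, block 20 at day 1.5, block 30 at day 2.5.
changes = {10: 0.5, 20: 1.5, 30: 2.5}

differential = blocks_to_back_up(changes, history, "differential")
cumulative = blocks_to_back_up(changes, history, "cumulative")
print(differential)  # [30]
print(cumulative)    # [10, 20, 30]
```

The cumulative backup is larger, but at recovery time only it and the level 0 need to be applied, whereas the differential strategy needs every level 1 taken since the level 0.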

4. Value of CONTROL_FILE_RECORD_KEEP_TIME

Consider the minimum time for which you need backup information kept in your RMAN repository; if you set this parameter to a lower value, you can lose the capacity to recover your database to the desired time.

5. Set NLS parameters accordingly

Make sure your environment variables are set according to the values of V$NLS_PARAMETERS in your database.

6. Image Copy or Backupset output

a. Image copy. - An exact copy of a data file, control file or archived redo log. With this type of backup you have to use an external method of compression if you want to reduce the size of the backup.

b. Backupset. - A backupset is the smallest unit of an RMAN backup and it is the only form in which RMAN can write backups to SBT. When using a backupset as your output method you can also configure it to be:

i. Compressed. - Every algorithm except BASIC requires the Advanced Compression Option

1. BASIC. - This does not require the Advanced Compression Option

2. LOW. - Least effect on backup throughput.

3. MEDIUM. - Recommended for most environments. Good combination of compression ratios and speed

4. HIGH. - Best suited for backups over slower networks where the limiting factor is network speed.

ii. Encrypted

1. Transparent. - Default mode if using encryption and uses Oracle Encryption Wallet

2. Password-protected. - This mode uses only password protection. You must provide a password when creating and restoring encrypted backups

3. Dual-mode. - This mode requires either the wallet or a password

You can also control them to be of a certain size with the configuration option of MAXSETSIZE and MAXPIECESIZE.

In addition, you can control the number of channels available for a device type when you run a command, either with the PARALLELISM configuration setting or by allocating a number of channels within a RUN block; this determines whether RMAN reads or writes in parallel.

RMAN> CONFIGURE ENCRYPTION FOR DATABASE ON;

new RMAN configuration parameters:

CONFIGURE ENCRYPTION FOR DATABASE ON;

new RMAN configuration parameters are successfully stored

RMAN> SET ENCRYPTION ON IDENTIFIED BY rene ONLY;

executing command: SET encryption

RMAN> CONFIGURE DEVICE TYPE disk PARALLELISM 3;

new RMAN configuration parameters:

CONFIGURE DEVICE TYPE DISK PARALLELISM 3 BACKUP TYPE TO BACKUPSET;

new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE MAXSETSIZE TO 100M;

new RMAN configuration parameters:

CONFIGURE MAXSETSIZE TO 100 M;

new RMAN configuration parameters are successfully stored

RMAN> CONFIGURE COMPRESSION ALGORITHM 'BASIC';

new RMAN configuration parameters:

CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE;

new RMAN configuration parameters are successfully stored

7. SBT or Disk

When deciding whether to use SBT or disk as the storage for your backup pieces or backup sets, take the cost into consideration, as backing up to a disk device is more expensive than backing up to a tape device. Also consider the portability of tape, which can help you move your backups to a disaster recovery site without many complications.

8. Fast Recovery Area (FRA)

An optional disk location that you can use to store recovery-related files such as control file and online redo log copies, archived redo log files, flashback logs, and RMAN backups. Setting the following parameters will set up the FRA:

a. DB_RECOVERY_FILE_DEST

b. DB_RECOVERY_FILE_DEST_SIZE

If you decide to set up the FRA, you need to consider the following when setting its size or quota:

a. Size of your Database Backup

b. Size of incremental backups, as used by your chosen backup strategy

c. Size of flashback logs

d. Size of (n+1) days of Archived Redo logs

e. Size of your control files

f. Size of your online redo logs * number of groups
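As a rough worked example, the components above can simply be added up. The function below is a sketch; every figure in the usage example is hypothetical and should be replaced with sizes measured on your own database, and multiplexed redo log members are not accounted for.

```python
def estimate_fra_size_gb(full_backup_gb, incremental_gb, incrementals_kept,
                         flashback_logs_gb, archived_redo_per_day_gb,
                         archive_days, control_files_gb,
                         online_redo_log_gb, redo_groups):
    """Sum the components listed above into one FRA quota estimate (GB)."""
    return (
        full_backup_gb                                    # a. database backup
        + incremental_gb * incrementals_kept              # b. incremental backups
        + flashback_logs_gb                               # c. flashback logs
        + archived_redo_per_day_gb * (archive_days + 1)   # d. (n+1) days of archived redo
        + control_files_gb                                # e. control files
        + online_redo_log_gb * redo_groups                # f. online redo logs * groups
    )


# Hypothetical sizes: a 200 GB full backup, six 10 GB incrementals,
# 20 GB of flashback logs, 5 GB of archived redo per day kept for 7 days,
# 1 GB of control files and four 0.5 GB online redo log groups.
fra_gb = estimate_fra_size_gb(200, 10, 6, 20, 5, 7, 1, 0.5, 4)
print(fra_gb)  # 323.0
```

The result would be the starting point for DB_RECOVERY_FILE_DEST_SIZE; leaving some headroom above it is a judgment call that depends on how bursty your redo generation is.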

In the end, you will have to decide on your backup settings and strategy based on your business and data availability needs, as most of the configurations above will impact the Mean Time To Recover (MTTR) in case of a disaster.

CONFIGURING RMAN DEFAULT SETTINGS

To facilitate the daily use of RMAN for your backup and recovery, it lets you set a number of persistent configuration settings for each target database. Several of these settings, if set incorrectly, can prevent you from meeting your recovery needs. You can view all your RMAN settings with the SHOW ALL command, or just your retention policy with the SHOW RETENTION POLICY command.

• CONFIGURE DEFAULT DEVICE TYPE TO DISK;

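When formatting your output, be sure to always include the DBID. The first format string below always includes it through %I; %U is the system-generated unique name, which embeds the DBID for image copies but not for backup pieces.

a. %d_%I_%s_%T_%t

b. %U

• CONFIGURE CONTROLFILE AUTOBACKUP ON;

This makes RMAN back up the control file and spfile automatically at the end of every backup, so you always have a secondary option to restore from.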

• Retention policy settings

Your retention policy can be based either on the redundancy of the backups you have taken (a) or on the number of days before a backup becomes obsolete (b); these options are mutually exclusive.

a. CONFIGURE RETENTION POLICY TO REDUNDANCY 3;

b. CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 8 DAYS;

There must always exist one backup of each data file that satisfies SYSDATE - BACKUP CHECKPOINT TIME >= 8

• CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 3 TIMES TO DISK;

This setting is critical. Keep it equal to or above your redundancy retention policy, as losing even one archived redo log will compromise your recoverability.

• CONTROL_FILE_RECORD_KEEP_TIME

Make sure that this database parameter is equal to or higher than your retention policy.
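The recovery window rule (SYSDATE - BACKUP CHECKPOINT TIME >= 8) can be illustrated with a small sketch. It is a simplification under stated assumptions: each entry is treated as one full backup of the whole database, archived redo logs and incremental backups are ignored, and the function name and dates are hypothetical.

```python
from datetime import datetime, timedelta


def backups_to_keep(checkpoint_times, now, window_days):
    """Return the checkpoint times a recovery window policy still needs.

    Keeps every backup taken inside the window, plus the newest backup
    whose checkpoint is at least window_days old, since that is the one
    that makes recovery to the start of the window possible.
    """
    window_start = now - timedelta(days=window_days)
    inside = [t for t in checkpoint_times if t > window_start]
    older = [t for t in checkpoint_times if t <= window_start]
    anchor = [max(older)] if older else []
    return sorted(anchor + inside)


# Hypothetical backup history checked on 17-FEB-2013 with an 8 day window.
backups = [datetime(2013, 1, 27), datetime(2013, 2, 3),
           datetime(2013, 2, 10), datetime(2013, 2, 16)]
kept = backups_to_keep(backups, now=datetime(2013, 2, 17), window_days=8)
print([t.date().isoformat() for t in kept])
# ['2013-02-03', '2013-02-10', '2013-02-16']
```

The 3 February backup falls outside the window yet is still required, which is why a backup can stay on disk longer than the window length suggests. Only the 27 January backup would be reported as obsolete in this model.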

RMAN> CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t';

new RMAN configuration parameters:

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t';

new RMAN configuration parameters are successfully stored

RMAN> SHOW ALL;

using target database control file instead of recovery catalog

RMAN configuration parameters for database with db_unique_name TESTDB are:

CONFIGURE RETENTION POLICY TO RECOVERY WINDOW OF 8 DAYS;

CONFIGURE BACKUP OPTIMIZATION ON;

CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default

CONFIGURE CONTROLFILE AUTOBACKUP ON;

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/mount/copy01/TESTDB/%F';

CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default

CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default

CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t';

CONFIGURE MAXSETSIZE TO UNLIMITED; # default

CONFIGURE ENCRYPTION FOR DATABASE OFF; # default

CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default

CONFIGURE COMPRESSION ALGORITHM 'BASIC' AS OF RELEASE 'DEFAULT' OPTIMIZE FOR LOAD TRUE ; #default

CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/mount/copy01/control_snaps/snapcf_TESTDB1.f';
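To see what the configured channel FORMAT string produces, the sketch below expands the five wildcards used in this paper. The helper is hypothetical and covers only this subset; the input values are taken from a piece handle in the backup output shown in this paper.

```python
import re
from datetime import date


def expand_format(template, db_name, dbid, set_number, backup_date, stamp):
    """Expand the subset of RMAN FORMAT wildcards used in this paper."""
    substitutions = {
        "%d": db_name,                         # database name
        "%I": str(dbid),                       # DBID
        "%s": str(set_number),                 # backup set number
        "%T": backup_date.strftime("%Y%m%d"),  # date as YYYYMMDD
        "%t": str(stamp),                      # backup set time stamp
    }
    # Replace all wildcards in a single pass so substituted values
    # are never scanned again.
    return re.sub(r"%[dIsTt]", lambda m: substitutions[m.group(0)], template)


handle = expand_format("/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t",
                       db_name="TESTDB", dbid=2590844408, set_number=38,
                       backup_date=date(2013, 2, 17), stamp=807583616)
print(handle)
# /mount/copy01/TESTDB/TESTDB_DB_BU_2590844408_38_20130217_807583616
```

Because the DBID is part of every file name, a piece found on disk or tape after a disaster can be matched to its database even when no repository is available.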

BACKING UP YOUR DATABASE

The main objectives of taking a physical backup of your database are data protection in case of media failure and data preservation in case you need to preserve a copy of the database as it existed before a major release. Oracle's RMAN utility uses the BACKUP command to accomplish these objectives.

You can run this command in one line which, as you can see, makes it easy even for beginners to back up a database:

RMAN> BACKUP DATABASE PLUS ARCHIVELOG;

Starting backup at 17-FEB-2013 00:46:18

current log archived

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

skipping archived logs of thread 1 from sequence 1 to 3; already backed up

channel ORA_DISK_1: starting archived log backup set

channel ORA_DISK_1: specifying archived log(s) in backup set

input archived log thread=1 sequence=4 RECID=4 STAMP=807583580

channel ORA_DISK_1: finished piece 1 at 17-FEB-2013 00:47:00

piece handle=/mount/copy01/TESTDB/TESTDB_DB_BU_2590844408_38_20130217_807583616 tag=TAG20130217T004624 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03

Finished backup at 17-FEB-2013 00:47:00

Starting backup at 17-FEB-2013 00:47:00

current log archived

using channel ORA_DISK_1

using channel ORA_DISK_2

using channel ORA_DISK_3

channel ORA_DISK_1: starting archived log backup set

channel ORA_DISK_1: specifying archived log(s) in backup set

input archived log thread=1 sequence=5 RECID=5 STAMP=807583622

channel ORA_DISK_1: starting piece 1 at 17-FEB-2013 00:47:04

channel ORA_DISK_1: finished piece 1 at 17-FEB-2013 00:47:05

piece handle=/mount/copy01/TESTDB/TESTDB_DB_BU_2590844408_39_20130217_807583624 tag=TAG20130217T004703 comment=NONE

channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01

Finished backup at 17-FEB-2013 00:47:05

Starting Control File and SPFILE Autobackup at 17-FEB-2013 00:47:06

piece handle=/mount/copy01/TESTDB/c-2590844408-20130217-00 comment=NONE

Finished Control File and SPFILE Autobackup at 17-FEB-2013 00:47:21

Or you can put it in a RUN block, which adds some complexity but also the manageability we have been talking about: tags and formats help you identify which backups hold which type of file, so that in a disaster you can recognize what is needed to have your data available as soon as possible. A RUN block also lets you override certain persistent configurations you have set in RMAN.

In the example below we use a parallelism of 2 (two allocated channels) instead of the PARALLELISM of 3 set in our configuration. Within the BACKUP command we define the destination and format of the backup pieces, and we delete the archived redo logs that we are backing up:

RMAN> RUN

2> {

3> SET COMMAND ID TO 'BACKUP_SESSION';

4> ALLOCATE CHANNEL CH1 DEVICE TYPE DISK;

5> ALLOCATE CHANNEL CH2 DEVICE TYPE DISK;

6> BACKUP AS COMPRESSED BACKUPSET DATABASE FORMAT '/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t' TAG 'FULL_BACKUP';

7> BACKUP AS COMPRESSED BACKUPSET ARCHIVELOG ALL FORMAT '/mount/copy01/TESTDB/%d_ARCH_BU_%I_%s_%T_%t' TAG 'ARCH_BACKUP' DELETE INPUT;

8> BACKUP CURRENT CONTROLFILE FORMAT '/mount/copy01/TESTDB/%d_CTL_BU_%I_%s_%T_%t' TAG 'CTL_BACKUP';

9> }

executing command: SET COMMAND ID

using target database control file instead of recovery catalog

allocated channel: CH1

channel CH1: SID=56 device type=DISK

allocated channel: CH2

channel CH2: SID=50 device type=DISK

Starting backup at 17-FEB-2013 01:00:26

channel CH1: starting compressed full datafile backup set

channel CH1: specifying datafile(s) in backup set

input datafile file number=00002 name=+DB_DATA/testdb/datafile/sysaux.257.798230837

input datafile file number=00003 name=+DB_DATA/testdb/datafile/undotbs1.256.798230841

input datafile file number=00005 name=+DB_DATA/testdb/datafile/undotbs2.268.798234245

channel CH1: starting piece 1 at 17-FEB-2013 01:00:28

channel CH2: starting compressed full datafile backup set

channel CH2: specifying datafile(s) in backup set

input datafile file number=00001 name=+DB_DATA/testdb/datafile/system.260.798230829

input datafile file number=00004 name=+DB_DATA/testdb/datafile/users.264.798230845

channel CH2: starting piece 1 at 17-FEB-2013 01:00:29

channel CH1: finished piece 1 at 17-FEB-2013 01:01:04

piece handle=/mount/copy01/TESTDB/TESTDB_DB_BU_2590844408_48_20130217_807584427 tag=FULL_BACKUP comment=NONE

current log archived

channel CH1: starting compressed archived log backup set

channel CH1: specifying archived log(s) in backup set

input archived log thread=1 sequence=7 RECID=7 STAMP=807584478

channel CH1: starting piece 1 at 17-FEB-2013 01:01:19

channel CH1: finished piece 1 at 17-FEB-2013 01:01:20

piece handle=/mount/copy01/TESTDB/TESTDB_ARCH_BU_2590844408_51_20130217_807584479 tag=ARCH_BACKUP comment=NONE

channel CH1: backup set complete, elapsed time: 00:00:01

channel CH1: deleting archived log(s)

archived log file name=/mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/archivelog/2013_02_17/o1_mf_1_7_8l0wkfst_.arc RECID=7 STAMP=807584478

Finished backup at 17-FEB-2013 01:01:21

Starting backup at 17-FEB-2013 01:01:28

channel CH1: starting full datafile backup set

channel CH1: specifying datafile(s) in backup set

including current control file in backup set

channel CH1: starting piece 1 at 17-FEB-2013 01:01:36

channel CH1: finished piece 1 at 17-FEB-2013 01:01:39

piece handle=/mount/copy01/TESTDB/TESTDB_CTL_BU_2590844408_52_20130217_807584489 tag=CTL_BACKUP comment=NONE

channel CH1: backup set complete, elapsed time: 00:00:03

Finished backup at 17-FEB-2013 01:01:39

Starting Control File and SPFILE Autobackup at 17-FEB-2013 01:01:39

piece handle=/mount/copy01/TESTDB/c-2590844408-20130217-02 comment=NONE

Finished Control File and SPFILE Autobackup at 17-FEB-2013 01:01:55

released channel: CH1

released channel: CH2

As mentioned in the considerations section of this document, backup sets are the form in which backups can be sent to SBT, and you can accomplish this with the BACKUP command.

In the example below we emulate an SBT device and send the backup sets we have on disk to this media:

RMAN> RUN {

2> ALLOCATE CHANNEL CH1 TYPE 'SBT_TAPE'

3> PARMS="SBT_LIBRARY=oracle.disksbt,

4> ENV=(BACKUP_DIR=/MOUNT/COPY01/SBT)";

5> BACKUP BACKUPSET ALL;

6> }

using target database control file instead of recovery catalog

allocated channel: ch1

channel ch1: SID=56 device type=SBT_TAPE

channel ch1: WARNING: Oracle Test Disk API

Starting backup at 17-FEB-2013 01:22:39

channel ch1: input backup set: count=32, stamp=807583495, piece=1

channel ch1: starting piece 1 at 17-FEB-2013 01:22:44

channel ch1: backup piece /mount/copy01/TESTDB/TESTDB_DB_BU_2590844408_32_20130217_807583495

piece handle=10o25fo7_1_2 comment=API Version 2.0,MMS Version 8.1.3.0

channel ch1: finished piece 1 at 17-FEB-2013 01:22:47

...

channel ch1: backup piece /mount/copy01/TESTDB/c-2590844408-20130217-02

piece handle=c-2590844408-20130217-02 comment=API Version 2.0,MMS Version 8.1.3.0

channel ch1: finished piece 1 at 17-FEB-2013 01:24:20

channel ch1: backup piece complete, elapsed time: 00:00:07

Finished backup at 17-FEB-2013 01:24:20

Starting Control File and SPFILE Autobackup at 17-FEB-2013 01:24:21

piece handle=c-2590844408-20130217-03 comment=API Version 2.0,MMS Version 8.1.3.0

Finished Control File and SPFILE Autobackup at 17-FEB-2013 01:24:36

released channel: ch1

MONITORING BACKUPS

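Once you have launched a backup, you can monitor its progress by querying the V$SESSION, V$PROCESS and V$SESSION_LONGOPS views in the target database. The first query in the script below finds the RMAN session through CLIENT_INFO; this assumes that the backup script tags its session with SET COMMAND ID TO 'Backup_Session'.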

[TESTDB1] /home/oracle/scripts

oracle $ cat monitor.sql

COLUMN CLIENT_INFO FORMAT a30

COLUMN SID FORMAT 999

COLUMN SPID FORMAT 9999

SELECT SID, SPID, CLIENT_INFO

FROM V$PROCESS p, V$SESSION s

WHERE p.ADDR = s.PADDR

AND CLIENT_INFO LIKE '%Backup_Session%';

SET PAGESIZE 9999 LINESIZE 200

COL OPNAME FORMAT A40

COL TARGET FORMAT A15

COL UNITS FORMAT A10

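COL TIME_REMAINING FORMAT 99990.99 HEADING "REMAINING[H]"
COL BPS FORMAT 99990.99 HEADING "[UNITS/S]"
COL FERTIG FORMAT 990.99 HEADING "COMPLETE[%]"
-- TIME_REMAINING is reported in seconds, so dividing by 3600 shows hours.
-- NULLIF prevents a divide-by-zero error when TIME_REMAINING or TOTALWORK is 0.
SELECT SID, OPNAME, TARGET, SOFAR, TOTALWORK, UNITS,
(TOTALWORK-SOFAR)/NULLIF(TIME_REMAINING,0) BPS, (TIME_REMAINING/3600) TIME_REMAINING,
SOFAR/NULLIF(TOTALWORK,0)*100 FERTIG
FROM V$SESSION_LONGOPS
WHERE SID IN (&SID_NUMBER);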

TESTDB1> @monitor.sql

SID SPID CLIENT_INFO

---- ------------------------ ------------------------------

69 20654 id=Backup_Session

Enter value for sid_number: 69

old 11: WHERE sid in (&sid_number)

new 11: WHERE sid in (69)

SID  OPNAME                 TARGET  SOFAR   TOTALWORK  UNITS   [Units/s]  Remaining[h]  complete[%]
---- ---------------------- ------- ------- ---------- ------- ---------- ------------- -----------
69   RMAN: aggregate input  33      658909  7666654    Blocks  34691.81   0.06          8.59
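The derived columns of monitor.sql can be checked by hand. The following is a minimal Python sketch of that arithmetic, not part of the original script: the function name longops_progress is hypothetical, and the TIME_REMAINING value of 202 seconds is an assumption chosen because it is consistent with the sample output above (the view reports TIME_REMAINING in seconds, so dividing by 3600 gives hours).

```python
def longops_progress(sofar, totalwork, time_remaining_s):
    """Derive the monitor.sql columns from raw V$SESSION_LONGOPS values.

    Returns (units_per_second, hours_remaining, percent_complete).
    The rate is None when the view reports 0 seconds remaining.
    """
    if totalwork <= 0:
        raise ValueError("TOTALWORK must be positive")
    rate = None
    if time_remaining_s > 0:
        # Same expression as (TOTALWORK-SOFAR)/TIME_REMAINING in the script.
        rate = (totalwork - sofar) / time_remaining_s
    hours_remaining = time_remaining_s / 3600
    percent_complete = sofar / totalwork * 100
    return rate, hours_remaining, percent_complete


# SOFAR and TOTALWORK come from the sample output; 202 s is an assumed value.
rate, hours, pct = longops_progress(658909, 7666654, 202)
print(f"{rate:.2f} units/s, {hours:.2f} h remaining, {pct:.2f}% complete")
```

Guarding the division explicitly mirrors what a NULLIF in the SQL would do: a session that reports zero seconds remaining yields no rate instead of an error.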

When backing up to SBT, you can monitor the sessions that are waiting on the media management layer (MML) with the following SQL query:

TESTDB1 >COLUMN EVENT FORMAT a17

TESTDB1 >COLUMN SECONDS_IN_WAIT FORMAT 999

TESTDB1 >COLUMN STATE FORMAT a15

TESTDB1 >COLUMN CLIENT_INFO FORMAT a30

TESTDB1 >

TESTDB1 >SELECT p.SPID, s.EVENT, sw.SECONDS_IN_WAIT AS SEC_WAIT,

2 sw.STATE, CLIENT_INFO

3 FROM V$SESSION_WAIT sw, V$SESSION s, V$PROCESS p

4 WHERE sw.EVENT LIKE '%MML%'

5 AND s.SID=sw.SID

6 AND s.PADDR=p.ADDR;

SPID EVENT SEC_WAIT STATE CLIENT_INFO

------------------------ ----------------- ---------- --------------- --------------

7037 Backup: MML query 147 WAITING rman channel=CH5

backup piece

BACKUP REPORTS

Now that you have taken a backup, how do you know which backups you have and which data files still need a backup under your retention policy? RMAN provides two commands for this: LIST and REPORT.

The LIST command displays backups and information about other objects recorded in the RMAN repository.

• LIST EXPIRED COPY;

• LIST EXPIRED BACKUP;

• LIST BACKUP SUMMARY;

• LIST BACKUP TAG TAG20130211T035650;

• LIST RECOVERABLE BACKUP OF DATABASE;

• LIST BACKUP OF DATAFILE 1 COMPLETED BETWEEN '03-FEB-2013' AND '11-FEB-2013';

• LIST INCARNATION;

In the example below, the backup with key 58 is a full backup that is present on both SBT and disk (device type *), the backup with key 60 is a backup set that contains archived redo logs, the backup with key 62 is a level 0 backup, and the backup with key 65 is a level 1 backup. The last two belong to an incremental backup strategy.

RMAN> LIST BACKUP SUMMARY;

List of Backups
===============
Key     TY LV S Device Type Completion Time      #Pieces #Copies Compressed Tag
------- -- -- - ----------- -------------------- ------- ------- ---------- ---
...
58      B  F  A *           11-FEB-2013 10:10:08 1       2       YES        TAG20130211T003457
59      B  F  A *           11-FEB-2013 10:10:08 1       2       YES        TAG20130211T003457
60      B  A  A *           11-FEB-2013 10:10:08 1       2       YES        TAG20130211T003527
61      B  A  A DISK        11-FEB-2013 12:04:55 1       1       NO         TAG20130211T035522
62      B  0  A DISK        11-FEB-2013 12:04:55 1       1       NO         TAG20130211T035524
63      B  0  A DISK        11-FEB-2013 12:04:55 1       1       NO         TAG20130211T035524
64      B  A  A DISK        11-FEB-2013 12:04:55 1       1       NO         TAG20130211T035554
65      B  1  A DISK        11-FEB-2013 12:04:55 1       1       NO         TAG20130211T035609
…
99      B  F  A DISK        13-FEB-2013 08:53:53 1       1       YES        FULL_BACKUP
101     B  F  A DISK        13-FEB-2013 08:53:59 1       1       YES        FULL_BACKUP
102     B  A  A DISK        13-FEB-2013 08:54:04 1       1       YES        ARCH_BACKUP
103     B  F  A DISK        13-FEB-2013 08:54:12 1       1       NO         CTL_BACKUP
104     B  F  A DISK        13-FEB-2013 08:54:17 1       1       NO         TAG20130213T085412
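When the summary is long, it can help to decode the rows programmatically. The following is a minimal Python sketch, assuming rows laid out like the ones listed above (eleven whitespace-separated columns); the function name describe_summary_row and the LEVELS mapping are hypothetical helpers that only encode the LV codes explained in the text.

```python
# LV codes as described in the text above: F = full, A = archived redo logs,
# 0 and 1 = incremental levels.
LEVELS = {
    "F": "full",
    "A": "archived redo logs",
    "0": "incremental level 0",
    "1": "incremental level 1",
}


def describe_summary_row(row):
    """Parse one data row of LIST BACKUP SUMMARY into a small dictionary."""
    fields = row.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}")
    key, _ty, lv, _status, device, day, clock, _pieces, copies, _comp, tag = fields
    return {
        "key": int(key),
        "content": LEVELS.get(lv, "unknown"),
        "device": device,  # "*" marks a backup present on more than one device type
        "completed": f"{day} {clock}",
        "copies": int(copies),
        "tag": tag,
    }


row = "58 B F A * 11-FEB-2013 10:10:08 1 2 YES TAG20130211T003457"
print(describe_summary_row(row))
```

The sketch rejects rows with an unexpected number of columns instead of guessing, because header and separator lines of the report should not be passed to it.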

The REPORT command answers questions about your database backups, such as: Which files need a backup? Which backups are obsolete and can be deleted? Which files have not been backed up recently?

• REPORT NEED BACKUP;

• REPORT SCHEMA;

• REPORT NEED BACKUP RECOVERY WINDOW OF 2 DAYS;

• REPORT OBSOLETE;

• REPORT SCHEMA AT TIME 'SYSDATE-7'; (Only with recovery catalog)

In the example below you can see which data files in your database require a backup based on your retention policy.

RMAN> REPORT NEED BACKUP;
RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
Report of files with less than 1 redundant backups
File #bkps Name
---- ----- -----------------------------------------------------
6    0     +DB_DATA/testdb/datafile/rene.269.807120841

RMAN> REPORT NEED BACKUP REDUNDANCY 3;
Report of files with less than 3 redundant backups
File #bkps Name
---- ----- -----------------------------------------------------
1    8     +DB_DATA/testdb/datafile/system.260.798230829
2    7     +DB_DATA/testdb/datafile/sysaux.257.798230837
3    7     +DB_DATA/testdb/datafile/undotbs1.256.798230841
4    7     +DB_DATA/testdb/datafile/users.264.798230845
5    7     +DB_DATA/testdb/datafile/undotbs2.268.798234245
6    0     +DB_DATA/testdb/datafile/rene.269.807120841

RESTORING AND RECOVERING FROM FAILURE

Restoring data files means retrieving them from a valid backup and putting them in a disk location, which can be the place where they resided before the failure or a different one. Media recovery is the application of changes from redo logs and incremental backups to a restored data file to bring the data file forward to a desired SCN or point in time. There are two types of media recovery, depending on how much redo is applied:

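• Complete recovery. - All changes in the archived and online redo logs generated since the backup are applied, which brings the database to the most recent point in time without losing committed transactions. This is possible from a consistent backup (Cold Backup) or an inconsistent backup (Hot Backup), as long as all the required redo is available.

• Incomplete recovery. - Only changes up to a specified point before the current time are applied, for example to undo a user error or because part of the redo is missing. The database must then be opened with the RESETLOGS option. There are three ways to specify the point at which media recovery stops:

o SCN

o Time

o Sequence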
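The three stopping points map to three forms of the RMAN SET UNTIL command. The following is a minimal Python sketch that builds the command text for a generated RMAN command file; the function name set_until_clause and its parameters are hypothetical, and the sketch only produces text, it does not talk to RMAN. Note that RMAN recovers up to, but not including, the SCN, time or log sequence that you specify.

```python
from datetime import datetime


def set_until_clause(scn=None, time=None, sequence=None, thread=1):
    """Return an RMAN SET UNTIL command for exactly one recovery target."""
    targets = [t for t in (scn, time, sequence) if t is not None]
    if len(targets) != 1:
        raise ValueError("give exactly one of scn, time or sequence")
    if scn is not None:
        return f"SET UNTIL SCN {int(scn)};"
    if sequence is not None:
        return f"SET UNTIL SEQUENCE {int(sequence)} THREAD {int(thread)};"
    # Format the datetime so that it matches the TO_DATE mask below.
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    return f"SET UNTIL TIME \"TO_DATE('{stamp}','YYYY-MM-DD HH24:MI:SS')\";"


print(set_until_clause(sequence=66))
print(set_until_clause(time=datetime(2013, 2, 11, 18, 15)))
```

Accepting a datetime object rather than a free-form string keeps the value and the TO_DATE format mask in agreement, which avoids depending on the session's NLS date settings.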

In case of a disaster, how would you know which backups are needed for your restore and recovery operations? The RESTORE command has a PREVIEW option that lists the backups it would need without actually restoring anything, which lets you check that you have the appropriate backups before you do the actual restore.

RMAN> RUN
2> {
3> SET UNTIL SEQUENCE 66;
4> ALLOCATE CHANNEL CH1 DEVICE TYPE DISK ;
5> RESTORE DATABASE PREVIEW SUMMARY;
6> }

executing command: SET until clause
using target database control file instead of recovery catalog
allocated channel: ch1
channel ch1: SID=78 instance=TESTDB1 device type=DISK

Starting restore at 11-FEB-2013 08:44:32
datafile 6 will be created automatically during restore operation

List of Datafile Copies
=======================
Key     File S Completion Time      Ckp SCN    Ckp Time
------- ---- - -------------------- ---------- --------------------
16      1    A 11-FEB-2013 09:54:07 918627     11-FEB-2013 09:54:07
        Name: /mount/copy01/TESTDB/TESTDB_2581526535_75_20130211_807098047
        Tag: TAG20130211T000728
…
17      2    A 11-FEB-2013 09:54:07 918633     11-FEB-2013 09:54:07
        Name: /mount/copy01/TESTDB/TESTDB_2581526535_76_20130211_807098047
        Tag: TAG20130211T000728
18      3    A 11-FEB-2013 09:54:07 918642     11-FEB-2013 09:54:07
        Name: /mount/copy01/TESTDB/TESTDB_2581526535_77_20130211_807098047
        Tag: TAG20130211T000728
…

List of Backups
===============
Key     TY LV S Device Type Completion Time      #Pieces #Copies Compressed Tag
------- -- -- - ----------- -------------------- ------- ------- ---------- ---
69      B  A  A DISK        11-FEB-2013 06:39:57 1       1       YES        TAG20130211T063810
Media recovery start SCN is 918621
Recovery must be done beyond SCN 1063385 to clear datafile fuzziness
Finished restore at 11-FEB-2013 08:44:36
released channel: ch1

The RESTORE command in RMAN has a VALIDATE option that helps you confirm that your backups are not corrupt. Running it periodically should be part of your backup strategy, to check that you have a complete set of backups that can meet your recoverability objectives. RMAN reads the selected backups in their entirety to confirm that they are not corrupt, but it does not produce output files. This will never replace a real restore and recovery test, but it is as close as you can get without one.

• RESTORE DATABASE VALIDATE;

• RESTORE CONTROLFILE VALIDATE;

• RESTORE ARCHIVELOG FROM TIME 'SYSDATE-nn' VALIDATE;

RMAN> RUN{
2> SET UNTIL SEQUENCE 87;
3> ALLOCATE CHANNEL CH1 DEVICE TYPE DISK ;
4> RESTORE DATABASE VALIDATE;
5> RESTORE CONTROLFILE VALIDATE;
6>}
executing command: SET until clause
…

Starting restore at 11-FEB-2013 21:50:47
channel ch1: starting validation of datafile backup set
channel ch1: reading from backup piece /mount/copy01/TESTDB/TESTDB_2581526535_115_20130211_807139642
channel ch1: piece handle=/mount/copy01/TESTDB/TESTDB_2581526535_115_20130211_807139642 tag=TAG20130211T212722
channel ch1: restored backup piece 1
…
channel ch1: piece handle=/mount/copy01/TESTDB/TESTDB_2581526535_116_20130211_807139687 tag=TAG20130211T212722
channel ch1: restored backup piece 1
channel ch1: validation complete, elapsed time: 00:00:01
Finished restore at 11-FEB-2013 21:51:24
released channel: ch1
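If the validation runs periodically from a scheduler, its log has to be checked by something other than a person. The following is a minimal Python sketch of such a check, assuming the log looks like the output shown in this document: it treats any line that starts with an RMAN- or ORA- error code as a failure and otherwise requires the text "validation complete". The function name validation_succeeded is hypothetical.

```python
import re

# Error lines in the RMAN output start with a code such as RMAN-03002: or ORA-01565:
ERROR_CODE = re.compile(r"^(RMAN|ORA)-\d{5}:", re.MULTILINE)


def validation_succeeded(log_text):
    """Return True if the log shows a completed validation and no error codes."""
    if ERROR_CODE.search(log_text):
        return False
    return "validation complete" in log_text


sample = (
    "channel ch1: restored backup piece 1\n"
    "channel ch1: validation complete, elapsed time: 00:00:01\n"
    "Finished restore at 11-FEB-2013 21:51:24\n"
)
print(validation_succeeded(sample))
```

A check like this only tells you that RMAN could read the backup pieces; it does not replace a real restore test.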

As discussed earlier, knowing your DBID matters most when you are faced with the total loss of your control files. In that case you need to set the DBID in RMAN so that it can restore a control file from backup; you can then mount, recover and open the database. Because you recovered with a restored control file, you must open the database with the RESETLOGS option, which creates a new incarnation of your database.

TESTDB1> startup

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size 2235208 bytes

Variable Size 696255672 bytes

Database Buffers 364904448 bytes

Redo Buffers 5541888 bytes

ORA-00205: error in identifying control file, check alert log for more info

[TESTDB1] /mount/oracle/dump01/TESTDB/diag/rdbms/testdb/TESTDB1/trace

oracle $ tail -15 alert_TESTDB1.log

SUCCESS: diskgroup DB_DATA was mounted

Sun Feb 17 17:09:32 2013

ORA-00210: cannot open the specified control file

ORA-00202: control file: '+DB_DATA/testdb/controlfile/control02.ctl'

ORA-17503: ksfdopn:2 Failed to open file +DB_DATA/testdb/controlfile/control02.ctl

ORA-15173: entry 'controlfile' does not exist in directory 'testdb'

ORA-00210: cannot open the specified control file

ORA-00202: control file: '+DB_DATA/testdb/controlfile/control01.ctl'

ORA-17503: ksfdopn:2 Failed to open file +DB_DATA/testdb/controlfile/control01.ctl

ORA-15173: entry 'controlfile' does not exist in directory 'testdb'

ORA-205 signalled during: ALTER DATABASE MOUNT...

Sun Feb 17 07:09:33 2013

Checker run found 2 new persistent data failures

SUCCESS: diskgroup DB_DATA was dismounted

[TESTDB1] /mount/oracle/dump01/TESTDB/diag/rdbms/testdb/TESTDB1/trace

oracle $ rman target /

connected to target database (not started)

RMAN> set DBID=2590844408;

executing command: SET DBID

RMAN> startup nomount;

Oracle instance started

Total System Global Area 1068937216 bytes

Fixed Size 2235208 bytes

Variable Size 708838584 bytes

Database Buffers 352321536 bytes

Redo Buffers 5541888 bytes

RMAN> restore controlfile from autobackup;

Starting restore at 17-FEB-2013 07:13:40

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=30 device type=DISK

recovery area destination: /mount/oracle/copy01/flash_recovery_area/TESTDB

database name (or database unique name) used for search: TESTDB

channel ORA_DISK_1: AUTOBACKUP /mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/autobackup/2013_02_16/o1_mf_s_807512860_8kypmryz_.bkp found in the recovery area

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20130217

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20130216

channel ORA_DISK_1: restoring control file from AUTOBACKUP /mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/autobackup/2013_02_16/o1_mf_s_807512860_8kypmryz_.bkp

channel ORA_DISK_1: control file restore from AUTOBACKUP complete

output file name=+DB_DATA/testdb/controlfile/control01.ctl

output file name=+DB_DATA/testdb/controlfile/control02.ctl

Finished restore at 17-FEB-2013 07:14:38

RMAN> alter database mount;

database mounted

released channel: ORA_DISK_1

RMAN> recover database;

Starting recover at 17-FEB-2013 07:16:10

Starting implicit crosscheck backup at 17-FEB-2013 07:16:11

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=35 device type=DISK

Finished implicit crosscheck backup at 17-FEB-2013 07:16:12

Starting implicit crosscheck copy at 17-FEB-2013 07:16:12

using channel ORA_DISK_1

Finished implicit crosscheck copy at 17-FEB-2013 07:16:13

searching for all files in the recovery area

cataloging files...

cataloging done

List of Cataloged Files

File Name: /mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/autobackup/2013_02_16/o1_mf_s_807512860_8kypmryz_.bkp

File Name: /mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/archivelog/2013_02_16/o1_mf_1_1_8kymdzom_.arc

File Name: /mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/archivelog/2013_02_16/o1_mf_1_2_8kymf16b_.arc

using channel ORA_DISK_1

starting media recovery

archived log for thread 1 with sequence 10 is already on disk as file +DB_DATA/testdb/onlinelog/group_2.261.807548767

archived log file name=+DB_DATA/testdb/onlinelog/group_2.261.807548767 thread=1 sequence=10

media recovery complete, elapsed time: 00:00:01

Finished recover at 16-FEB-13 07:17:27

RMAN> alter database open resetlogs;

database opened

RMAN> list incarnation;

List of Database Incarnations

DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time

------- ------- -------- ---------------- --- ---------- ----------

1 1 TESTDB 2590844408 PARENT 1 01-NOV-2012 20:11:31

2 2 TESTDB 2590844408 PARENT 735231 16-FEB-2013 04:53:05

3 3 TESTDB 2590844408 CURRENT 796984 17-FEB-2013 07:17:43

If you were to lose every file of your database and you are not using a recovery catalog, you would need to identify manually which backups were the last ones taken. If you have been following this document, you named your backup pieces with the format %d_DB_BU_%I_%s_%T_%t, which lets you quickly identify a database backup, and %d_ARCH_BU_%I_%s_%T_%t, which lets you identify your archived log backups. If you have control file autobackup turned on, you can also see the last control file that was backed up.

RMAN> show CONTROLFILE AUTOBACKUP;

RMAN configuration parameters for database with db_unique_name TESTDB are:

CONFIGURE CONTROLFILE AUTOBACKUP ON;

RMAN> show CONTROLFILE AUTOBACKUP format;

RMAN configuration parameters for database with db_unique_name TESTDB are:

CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/mount/copy01/TESTDB/%F';

RMAN> BACKUP AS COMPRESSED BACKUPSET DATABASE FORMAT '/mount/copy01/TESTDB/%d_DB_BU_%I_%s_%T_%t' TAG 'FULL_BACKUP';

RMAN> BACKUP AS COMPRESSED BACKUPSET ARCHIVELOG ALL FORMAT '/mount/copy01/TESTDB/%d_ARCH_BU_%I_%s_%T_%t' TAG 'ARCH_BACKUP' DELETE INPUT;

With the above configuration settings and backup format, we can now restore and recover our database:

[TESTDB1] /mount/copy01/TESTDB

oracle $ ls -ltr | tail -15

-rw-r----- 1 oracle asmadmin 34799616 Feb 17 07:06 TESTDB_CTL_BU_2590844408_60_20130217_807642372

-rw-r----- 1 oracle asmadmin 34832384 Feb 17 07:06 c-2590844408-20130217-04

-rw-r----- 1 oracle asmadmin 34832384 Feb 17 07:48 c-2590844408-20130217-05

-rw-r----- 1 oracle asmadmin 19398656 Feb 17 07:53 TESTDB_DB_BU_2590844408_64_20130217_807609191

-rw-r----- 1 oracle asmadmin 44097536 Feb 17 07:53 TESTDB_DB_BU_2590844408_65_20130217_807609191

-rw-r----- 1 oracle asmadmin 1089536 Feb 17 07:53 TESTDB_DB_BU_2590844408_66_20130217_807609218

-rw-r----- 1 oracle asmadmin 200704 Feb 17 07:53 TESTDB_ARCH_BU_2590844408_67_20130217_807609236

-rw-r----- 1 oracle asmadmin 61952 Feb 17 07:53 TESTDB_ARCH_BU_2590844408_68_20130217_807609237

-rw-r----- 1 oracle asmadmin 34799616 Feb 17 07:54 TESTDB_CTL_BU_2590844408_69_20130217_807609249

-rw-r----- 1 oracle asmadmin 34832384 Feb 17 07:54 c-2590844408-20130217-06

-rw-r----- 1 oracle asmadmin 1094656 Feb 17 07:59 TESTDB_ARCH_BU_2590844408_71_20130217_807609539

-rw-r----- 1 oracle asmadmin 34832384 Feb 17 07:59 c-2590844408-20130217-07

-rw-r----- 1 oracle asmadmin 34799616 Feb 17 07:59 TESTDB_CTL_BU_2590844408_73_20130217_807609579

-rw-r----- 1 oracle asmadmin 34832384 Feb 17 07:59 c-2590844408-20130217-08
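With a listing like the one above, the newest control file autobackup and the DBID can be read straight from the file names, because %F expands to c-IIIIIIIIII-YYYYMMDD-QQ (DBID, date, and a hexadecimal sequence). The following is a minimal Python sketch of that lookup; the function name latest_autobackup is hypothetical, and the sketch assumes the autobackup format is exactly %F with no prefix or suffix.

```python
import re

# %F expands to c-<DBID>-<YYYYMMDD>-<QQ>, where QQ is a hexadecimal sequence.
AUTOBACKUP = re.compile(r"^c-(\d+)-(\d{8})-([0-9A-Fa-f]{2})$")


def latest_autobackup(filenames):
    """Return (filename, dbid) of the newest control file autobackup, or None."""
    candidates = []
    for name in filenames:
        match = AUTOBACKUP.match(name)
        if match:
            dbid, day, seq = match.groups()
            # Sort by date first, then by the numeric value of the hex sequence.
            candidates.append((day, int(seq, 16), name, int(dbid)))
    if not candidates:
        return None
    _day, _seq, name, dbid = max(candidates)
    return name, dbid


listing = [
    "TESTDB_CTL_BU_2590844408_73_20130217_807609579",
    "c-2590844408-20130217-06",
    "c-2590844408-20130217-07",
    "c-2590844408-20130217-08",
]
print(latest_autobackup(listing))
```

The returned DBID is the value to pass to SET DBID, and the returned file name is the piece to name in RESTORE SPFILE FROM and RESTORE CONTROLFILE FROM.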

[TESTDB1] /home/oracle

oracle $ export ORA_RMAN_SGA_TARGET=1000

[TESTDB1] /home/oracle

oracle $ rman target /

Recovery Manager: Release 11.2.0.3.0 - Production on Sun Feb 17 08:27:29 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved.

connected to target database (not started)

RMAN> set dbid=2590844408;

executing command: SET DBID

RMAN> startup nomount;

startup failed: ORA-01078: failure in processing system parameters

ORA-01565: error in identifying file '+DB_DATA/TESTDB/spfileTESTDB.ora'

ORA-17503: ksfdopn:2 Failed to open file +DB_DATA/TESTDB/spfileTESTDB.ora

ORA-15056: additional error message

ORA-17503: ksfdopn:2 Failed to open file +DB_DATA/testdb/spfiletestdb.ora

ORA-15173: entry 'spfiletestdb.ora' does not exist in directory 'testdb'

ORA-06512: at line 4

starting Oracle instance without parameter file for retrieval of spfile

Oracle instance started

Total System Global Area 1043886080 bytes

Fixed Size 2234960 bytes

Variable Size 276825520 bytes

Database Buffers 759169024 bytes

Redo Buffers 5656576 bytes

RMAN> restore spfile from autobackup;

Starting restore at 17-FEB-2013 08:29:39

using target database control file instead of recovery catalog

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=25 device type=DISK

channel ORA_DISK_1: looking for AUTOBACKUP on day: 20130217

channel ORA_DISK_1: no AUTOBACKUP in 7 days found

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of restore command at 02/17/2013 08:29:40

RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece

RMAN> restore spfile from '/mount/copy01/TESTDB/c-2590844408-20130217-08';

Starting restore at 17-FEB-2013 08:30:04

using channel ORA_DISK_1

channel ORA_DISK_1: restoring spfile from AUTOBACKUP /mount/copy01/TESTDB/c-2590844408-20130217-08

channel ORA_DISK_1: SPFILE restore from AUTOBACKUP complete

Finished restore at 17-FEB-2013 08:30:07

RMAN> shutdown immediate;

Oracle instance shut down

RMAN> startup nomount;

connected to target database (not started)

Oracle instance started

Total System Global Area 1068937216 bytes

Fixed Size 2235208 bytes

Variable Size 708838584 bytes

Database Buffers 352321536 bytes

Redo Buffers 5541888 bytes

RMAN> restore controlfile from '/mount/copy01/TESTDB/c-2590844408-20130217-08';

Starting restore at 17-FEB-2013 08:31:25

allocated channel: ORA_DISK_1

channel ORA_DISK_1: SID=30 device type=DISK

channel ORA_DISK_1: restoring control file

channel ORA_DISK_1: restore complete, elapsed time: 00:00:55

output file name=+DB_DATA/testdb/controlfile/control01.ctl

output file name=+DB_DATA/testdb/controlfile/control02.ctl

Finished restore at 17-FEB-2013 08:32:20

RMAN> alter database mount;

database mounted

released channel: ORA_DISK_1

RMAN> restore database;

Starting restore at 17-FEB-2013 08:33:17

channel ORA_DISK_1: starting datafile backup set restore

channel ORA_DISK_1: specifying datafile(s) to restore from backup set

channel ORA_DISK_1: restoring datafile 00002 to +DB_DATA/testdb/datafile/sysaux.257.798230837

channel ORA_DISK_2: restore complete, elapsed time: 00:02:58

Finished restore at 17-FEB-2013 08:37:02

RMAN> recover database;

Starting recover at 17-FEB-2013 08:37:12

starting media recovery

channel ORA_DISK_1: starting archived log restore to default destination

channel ORA_DISK_1: restoring archived log

archived log thread=1 sequence=1

channel ORA_DISK_1: reading from backup piece /mount/copy01/TESTDB/TESTDB_ARCH_BU_2590844408_68_20130217_807609237

channel default: deleting archived log(s)

archived log file name=/mount/oracle/copy01/flash_recovery_area/TESTDB/TESTDB/archivelog/2013_02_17/o1_mf_1_2_8l1q8rmj_.arc RECID=19 STAMP=807611849

unable to find archived log

archived log thread=1 sequence=3

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03002: failure of recover command at 02/17/2013 08:37:57

RMAN-06054: media recovery requesting unknown archived log for thread 1 with sequence 3 and starting SCN of 797548

RMAN> alter database open resetlogs;

database opened

RMAN> list incarnation;

List of Database Incarnations

DB Key Inc Key DB Name DB ID STATUS Reset SCN Reset Time

------- ------- -------- ---------------- --- ---------- ----------

1 1 TESTDB 2590844408 PARENT 1 01-NOV-2012 20:11:31

2 2 TESTDB 2590844408 PARENT 735231 16-FEB-2013 04:53:05

3 3 TESTDB 2590844408 PARENT 796984 17-FEB-2013 07:17:43

4 4 TESTDB 2590844408 CURRENT 797549 17-FEB-2013 08:38:19

WHAT TO DO IF I DON’T KNOW WHAT TO DO?

RMAN includes the Data Recovery Advisor, a utility that is part of Oracle's Fault Diagnosability Infrastructure. It automatically diagnoses data failures, determines and presents appropriate repair options, and executes repairs at the user's request. It can guide you in diagnosing and repairing a media failure, thus helping you reduce your MTTR.

The workflow begins when you either suspect or discover a failure:

• LIST FAILURE. - Obtains information about failures if you suspect that one has occurred.

• VALIDATE DATABASE. - If you suspect that a failure has occurred but it has not been detected, this command checks for corrupt blocks and missing data files.

• ADVISE FAILURE. - Advises you on repair options for the open database failures.

• REPAIR FAILURE. - Automatically executes the repair suggested by the most recent ADVISE FAILURE. You should first try any manual repairs before you repair failures automatically.

• Return to the first step if other failures exist.

Database mounted.

ORA-01157: cannot identify/lock data file 6 - see DBWR trace file

ORA-01110: data file 6: '+DB_DATA/testdb/datafile/rene.269.807120841'

RMAN> validate database;

RMAN-03002: failure of validate command at 02/11/2013 23:31:38

RMAN-06056: could not access datafile 6

RMAN> list failure;

List of Database Failures

=========================

Failure ID Priority Status Time Detected Summary

---------- -------- --------- -------------------- -------

4222 HIGH OPEN 11-FEB-2013 23:31:02 One or more non-system datafiles are missing

RMAN> list failure 4222 detail;

Impact: See impact for individual child failures

List of child failures for parent failure ID 4222

Failure ID Priority Status Time Detected Summary

---------- -------- --------- -------------------- -------

4225 HIGH OPEN 11-FEB-2013 23:31:02 Datafile 6: '+DB_DATA/testdb/datafile/rene.269.807120841' is missing

Impact: Some objects in tablespace RENE might be unavailable

RMAN> advise failure 4222;

Optional Manual Actions

=======================

1. If file +DB_DATA/testdb/datafile/rene.269.807120841 was unintentionally renamed or moved, restore it

Automated Repair Options

========================

Option Repair Description

------ ------------------

1 Restore and recover datafile 6

Strategy: The repair includes complete media recovery with no data loss

Repair script: /mount/oracle/dump01/TESTDB/diag/rdbms/testdb/TESTDB1/hm/reco_3903715129.hm

RMAN> repair failure;

contents of repair script:

# restore and recover datafile

restore datafile 6;

recover datafile 6;

sql 'alter database datafile 6 online';

Starting restore at 11-FEB-2013 23:37:06

using channel ORA_DISK_1

Finished restore at 11-FEB-2013 23:37:14

Starting recover at 11-FEB-2013 23:37:14

using channel ORA_DISK_1

starting media recovery

Finished recover at 11-FEB-2013 23:37:16

sql statement: alter database datafile 6 online

repair failure complete

Do you want to open the database (enter YES or NO)? yes

database opened
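The last step of the workflow, returning to LIST FAILURE until nothing is left open, is easy to automate on saved output. The following is a minimal Python sketch that extracts the open failures from LIST FAILURE text like the one shown above; the function name open_failures is hypothetical, and the regular expression assumes the DD-MON-YYYY HH24:MI:SS date format used in this document's output.

```python
import re

# One row of the LIST FAILURE table: ID, priority, status, detection time, summary.
FAILURE_ROW = re.compile(
    r"^(\d+)\s+([A-Z]+)\s+([A-Z]+)\s+"
    r"(\d{2}-[A-Z]{3}-\d{4} \d{2}:\d{2}:\d{2})\s+(.+)$"
)


def open_failures(list_failure_output):
    """Return (failure_id, priority, summary) for every failure with status OPEN."""
    found = []
    for line in list_failure_output.splitlines():
        match = FAILURE_ROW.match(line.strip())
        if match and match.group(3) == "OPEN":
            found.append((int(match.group(1)), match.group(2), match.group(5)))
    return found


output = """Failure ID Priority Status Time Detected Summary
---------- -------- --------- -------------------- -------
4222 HIGH OPEN 11-FEB-2013 23:31:02 One or more non-system datafiles are missing
"""
print(open_failures(output))
```

An empty list means that no open failure was found in the text, which is the condition for leaving the loop.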

FLASHBACK TECHNOLOGY

Even though the flashback features are not part of the RMAN utility, they help you reduce your MTTR in case of human error, such as dropping a table, which is normally not a media failure and is not detected as a database error.

There are four main flashback features:

• Flashback Query

Provides the ability to view the data as it existed in the past by using undo segments to obtain metadata and historical data for transactions.

TESTDB1> select first_name,last_name from employees as of timestamp

2 to_timestamp('2013-02-11 18:15:00','YYYY-MM-DD HH24:MI:SS')

3 where EMPLOYEE_ID=100;

FIRST_NAME LAST_NAME

-------------------- -------------------------

Steven King

• Flashback Table

Recovers a table to a previous point in time. Both Flashback Query and Flashback Table use undo segments, and Flashback Table also requires that row movement be enabled on the table.

TESTDB1> SELECT current_scn FROM v$database;

CURRENT_SCN

-----------

800885

TESTDB1> select count(1) from test.rene;

COUNT(1)

----------

16

TESTDB1> delete from test.rene;

16 rows deleted.

TESTDB1> commit;

Commit complete.

TESTDB1> select count(1) from test.rene;

COUNT(1)

----------

0

TESTDB1> FLASHBACK TABLE test.rene TO SCN 800885;

Flashback complete.

TESTDB1> select count(1) from test.rene;

COUNT(1)

----------

16

• Flashback Drop

Recovers a dropped table from the Recycle Bin, a virtual container where dropped objects reside. When you drop a table, it is automatically placed into the Recycle Bin if the feature is turned on with the parameter RECYCLEBIN=ON.

TESTDB1> show parameter recyclebin

NAME TYPE VALUE

------------------------------------ ----------- ------------------------------

recyclebin string on

TESTDB1> drop table test.rene;

Table dropped.

TESTDB1> select count(1) from test.rene;

select count(1) from test.rene

ERROR at line 1:

ORA-00942: table or view does not exist

TESTDB1> SELECT OBJECT_NAME AS RECYCLE_NAME, ORIGINAL_NAME, TYPE,DROPTIME

2 FROM RECYCLEBIN WHERE ORIGINAL_NAME='RENE';

RECYCLE_NAME ORIGINAL_NAME TYPE DROPTIME

------------------------------ -------------------------------- ------------------------- -------

BIN$1e4zkkN+MvfgQ0c4qMAGZQ==$0 RENE TABLE 2013-02-17:10:24:15

BIN$1e5c/ckXM1rgQ0c4qMAdIA==$0 RENE TABLE 2013-02-17:10:35:51

TESTDB1> FLASHBACK TABLE "BIN$1e5c/ckXM1rgQ0c4qMAdIA==$0" TO BEFORE DROP;

Flashback complete.

TESTDB1> select count(1) from test.rene;

COUNT(1)

----------

16

• Flashback Database

Provides a more efficient alternative to database point-in-time recovery. This flashback feature is very useful when you are doing upgrades or need to return the database to a point in time without doing a media recovery. It uses flashback logs, so the FRA and DB_FLASHBACK_RETENTION_TARGET have to be set, and you have to turn the feature on with the database in ARCHIVELOG mode.

Flashback Database can be used only when the following conditions hold:

o No current data files are lost or damaged. You can only use FLASHBACK DATABASE to rewind changes to a data file made by an Oracle database, not to repair media failures.

o You are not trying to recover from accidental deletion of data files, undo a shrink data file operation, or undo a change to the database name.

o You are not trying to use FLASHBACK DATABASE to return to a point in time before the restore or re-creation of a control file.

o You are not trying to use FLASHBACK DATABASE to undo a compatibility change.

TESTDB1> select FLASHBACK_ON from v$database;

FLASHBACK_ON

------------------

NO

TESTDB1> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

TESTDB1> startup mount;

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size 2235208 bytes

Variable Size 708838584 bytes

Database Buffers 352321536 bytes

Redo Buffers 5541888 bytes

Database mounted.

TESTDB1> alter database flashback on;

Database altered.

TESTDB1> alter database open;

Database altered.

TESTDB1> select flashback_on,log_mode from v$database;

FLASHBACK_ON LOG_MODE

------------------ ------------

YES ARCHIVELOG

TESTDB1> CREATE RESTORE POINT before_damage;

Restore point created.

TESTDB1> drop user RENE cascade;

User dropped.

TESTDB1> select count(1) from dba_users where username='RENE';

COUNT(1)

----------

0

TESTDB1> shutdown immediate

Database closed.

Database dismounted.

ORACLE instance shut down.

TESTDB1> startup mount

ORACLE instance started.

Total System Global Area 1068937216 bytes

Fixed Size 2235208 bytes

Variable Size 708838584 bytes

Database Buffers 352321536 bytes

Redo Buffers 5541888 bytes

Database mounted.

TESTDB1> flashback database to restore point before_damage;

Flashback complete.

TESTDB1> alter database open resetlogs;

Database altered.

TESTDB1> select count(1) from dba_users where username='RENE';

COUNT(1)

----------

1

CONCLUSION

If the time ever comes when you need to recover from a disaster, it is critical to know the DBID of your database. You can always get the DBID from:

• Backup logs

• RMAN Catalog

• Backup Pieces if formatted to have it in the backup piece
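For the last option, the DBID can be recovered from a piece name only if the format string contains %I. The following is a minimal Python sketch for the naming convention used in this document (%d_DB_BU_%I_%s_%T_%t and its ARCH and CTL variants); the function name parse_piece_name and the returned field names are hypothetical.

```python
def parse_piece_name(name):
    """Split a piece named <db>_<KIND>_BU_<dbid>_<set>_<date>_<stamp> into fields.

    Returns None when the name does not follow that convention.
    """
    try:
        # Split from the right so that a database name containing "_" still works.
        prefix, dbid, set_no, day, stamp = name.rsplit("_", 4)
        db_name, kind, marker = prefix.rsplit("_", 2)
    except ValueError:
        return None
    numeric = (dbid, set_no, day, stamp)
    if marker != "BU" or not all(part.isdigit() for part in numeric):
        return None
    return {
        "db_name": db_name,
        "kind": kind,  # DB, ARCH or CTL in this document's convention
        "dbid": int(dbid),
        "backup_set": int(set_no),
        "day": day,
        "stamp": int(stamp),
    }


piece = parse_piece_name("TESTDB_ARCH_BU_2590844408_68_20130217_807609237")
print(piece["dbid"], piece["kind"], piece["backup_set"])
```

Names that do not match, such as control file autobackups, return None, so the function can be applied to every file in the backup directory.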

Get to know your RMAN configuration and where your backups reside, on SBT or on disk. Always keep a backup of your ORACLE_HOME/GRID_HOME at the latest patch level applied and, last but not least, keep as many Level 0 backups as your retention period and storage capacity allow.

Knowing and having these settings/configurations will allow you to reduce your MTTR and reduce the costs of having the data unavailable for your organization or client.

