
Page 1: Catastrophic Hardware Failure & Recovery with Exchange Server 2003 Eileen Brown (eileenb@microsoft.com) IT Evangelist Microsoft UK

Catastrophic Hardware Failure & Recovery with Exchange Server 2003

Eileen Brown (eileenb@microsoft.com)
IT Evangelist
Microsoft UK

http://blogs.msdn.com/eileen_brown


Topics

• What’s new in Exchange 2003 and Windows 2003
• Disaster Recovery Questionnaire
• Active Directory Overview and Disaster Recovery
• Exchange 2003 Overview and Disaster Recovery
• Database Disaster Recovery


What’s New In Exchange 2003

• Database snapshot through Volume Shadow Copy Services
• Recovery Storage Group
• RPC/HTTP support for Outlook 2003
• IPSec support between front-ends and back-end clusters
• IIS6 runs in Dedicated Mode
• Clustering


Active Directory Database

• Ntds.dit – the database
• Edbxxxxx.log – transaction logs
• Edb.chk – checkpoint file
• Res1.log and Res2.log – reserved log files
• Logs are of fixed size (10 MB for AD)
• Three categories of directory data are replicated between domain controllers:
    – Domain data (accounts…)
    – Configuration data (list of domains…)
    – Schema data (definition of all objects…)
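A gap in the transaction log sequence is one of the first things worth checking before replaying logs. The sketch below is illustrative only: it assumes the "xxxxx" in Edbxxxxx.log is a five-digit hexadecimal generation number, and the function name is hypothetical.

```python
import re

# Assumption: "xxxxx" in Edbxxxxx.log is a five-digit hexadecimal generation number.
LOG_NAME = re.compile(r"^edb([0-9a-f]{5})\.log$", re.IGNORECASE)


def missing_log_generations(filenames):
    """Return generation numbers absent between the lowest and highest log found."""
    found = sorted(
        int(match.group(1), 16)
        for match in (LOG_NAME.match(name) for name in filenames)
        if match
    )
    if not found:
        return []
    present = set(found)
    return [gen for gen in range(found[0], found[-1] + 1) if gen not in present]


if __name__ == "__main__":
    files = ["Ntds.dit", "Edb.chk", "Edb00009.log", "Edb0000A.log", "Edb0000C.log"]
    for gen in missing_log_generations(files):
        print(f"missing log: Edb{gen:05X}.log")
```

Files that do not match the numbered pattern (the database, the checkpoint file, the reserved logs) are simply ignored.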


Active Directory Backup

• System State Components:
    – System Start-up Files (boot files)
    – System registry
    – Class registration database of COM+
    – SYSVOL
• What Is A Good Backup?
    – System State, system disk contents, and the SYSVOL folder
    – Consider tombstone age set in Active Directory
        • Default is 60 days
    – If data older than the tombstone lifetime - restore disallowed
    – Backup data from a DC can only be used to restore that DC
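The tombstone rule can be checked before a restore is attempted. This is a minimal sketch that uses the 60-day default quoted above; the function name and the idea of passing the backup timestamp in by hand are assumptions for illustration.

```python
from datetime import datetime, timedelta

DEFAULT_TOMBSTONE_DAYS = 60  # default quoted on the slide; check your forest's actual setting


def backup_is_restorable(backup_time, now, tombstone_days=DEFAULT_TOMBSTONE_DAYS):
    """Return True if the backup is younger than the tombstone lifetime."""
    if backup_time > now:
        raise ValueError("backup_time is in the future")
    return now - backup_time < timedelta(days=tombstone_days)


if __name__ == "__main__":
    now = datetime(2004, 6, 1)
    print(backup_is_restorable(now - timedelta(days=30), now))  # recent backup
    print(backup_is_restorable(now - timedelta(days=61), now))  # older than the lifetime
```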


Types Of Disaster

• Determine the type of disaster
    – Database corruption
        • Damaged disks
        • DC hardware failure
        • Software failure – server cannot boot
    – Data corruption
        • Accidentally deleted object from directory
• Methods to restore Windows 2003 DC:
    – Re-installation
    – Backup


Restore Through Re-Installation

• New DC receives the same name as failed DC:
    – Remove the ntdsDSA object of the failed DC using ntdsutil
• Use ntdsutil “metadata cleanup” command
    – connect to the remote DC
    – remove orphaned DC
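The metadata cleanup steps above map onto a fixed sequence of ntdsutil sub-commands. The sketch below only assembles that sequence as text so it can be reviewed or pasted into a runbook; it does not run ntdsutil. The function name and the server name in the usage example are hypothetical, and the numeric indexes are whatever the "list" commands print on your system, so in practice the session is usually driven interactively.

```python
def metadata_cleanup_commands(healthy_dc, domain_index, site_index, server_index):
    """Return the ntdsutil sub-commands, in order, that remove one orphaned DC.

    healthy_dc is a surviving DC to connect to; the indexes are the numbers
    shown by the preceding "list ..." command in an interactive session.
    """
    for index in (domain_index, site_index, server_index):
        if index < 0:
            raise ValueError("indexes must be non-negative")
    return [
        "metadata cleanup",
        "connections",
        f"connect to server {healthy_dc}",
        "quit",
        "select operation target",
        "list domains",
        f"select domain {domain_index}",
        "list sites",
        f"select site {site_index}",
        "list servers in site",
        f"select server {server_index}",
        "quit",
        "remove selected server",
        "quit",
        "quit",
    ]


if __name__ == "__main__":
    # "dc02" is a hypothetical surviving domain controller.
    for line in metadata_cleanup_commands("dc02", 0, 0, 1):
        print(line)
```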


Restore From Backup

• Non-Authoritative Restore
    – Default method for the restoration of Active Directory
    – DC is then updated using normal replication techniques
• Authoritative Restore
    – ntdsutil


Authoritative Restore

• Perform a non-authoritative restore first
• Version number of the object attributes is incremented for:
    – entire directory
    – subtree
    – individual object
• Used when human error is involved
    – Accidentally deleted a number of objects which cannot be recreated easily
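Why the version bump matters can be shown with a toy model of replication conflict resolution. This is a simplified sketch, not the directory's actual code: it assumes conflicts are settled by comparing version number, then timestamp, then originating DC, and the per-day bump of 100,000 is an illustrative default rather than a figure from the slides. All class and function names are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AttributeStamp:
    version: int          # incremented on every originating write
    timestamp: int        # time of the originating write, in seconds
    originating_dsa: str  # identifier of the DC that made the change


def winner(a, b):
    """Pick the stamp that wins a conflict: version, then timestamp, then DSA."""
    def key(stamp):
        return (stamp.version, stamp.timestamp, stamp.originating_dsa)
    return a if key(a) >= key(b) else b


def mark_authoritative(stamp, days_since_backup, bump_per_day=100_000):
    """Raise the version so the restored value outranks newer replicas.

    bump_per_day is an assumed value for illustration.
    """
    days = max(1, days_since_backup)
    return replace(stamp, version=stamp.version + bump_per_day * days)


if __name__ == "__main__":
    restored = AttributeStamp(version=3, timestamp=1000, originating_dsa="dsa-a")
    live = AttributeStamp(version=5, timestamp=2000, originating_dsa="dsa-b")
    print(winner(restored, live) == live)                          # plain restore loses
    print(winner(mark_authoritative(restored, 2), live) == live)   # authoritative restore wins
```

After a plain non-authoritative restore the newer replica (including a deletion) overwrites the restored object; after the bump the restored object replicates outward instead.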


Recovering A Global Catalog Server

• Restore from backup or:
• Add additional GC
• Create branch office replica from media - dcpromo /adv
• Restore GC onto different hardware - issues
    – Different HALs
    – Incompatible Boot.ini file
    – Different network or video cards


AD Forest Recovery - High Level Steps

• Identify single DC for restore
• Shut down ALL DCs
• Recover first DC in root domain
    – 1. Primary SYSVOL restore, disable GC flag
    – 2. Configure DNS
    – 3. Raise value of RID pool by 100,000
        • cn=RID Manager$,cn=System,dc=<domain name>
    – 4. Seize all (FSMO) roles (ntdsutil)
    – 5. Clean metadata of ALL DCs in the root (ntdsutil)


AD Forest Recovery - High Level Steps

• Recover FIRST DC in the root domain (cont.)
    – 6. Delete server and computer objects of all other DCs
    – 7. Reset the computer account of the DC twice (netdom)
    – 8. Reset the krbtgt password twice (ADUC)
    – 9. Reset the trust password twice (netdom)
• Restore FIRST DC in each of the remaining domains
    – Primary SYSVOL restore for domain
    – Same steps as previously (domain wide)
    – Enable GC flag
    – DO FRESH BACKUP
    – Install other DCs using dcpromo
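Step 3 (raising the RID pool by 100,000) is easy to get wrong because the value on the RID Manager$ object is a single 64-bit number. The sketch below assumes the attribute in question is rIDAvailablePool (the slide names only the object), with the ceiling in the upper 32 bits and the allocated count in the lower 32 bits. The function names are hypothetical, and the code only does the arithmetic; it does not write to the directory.

```python
LOW_MASK = 0xFFFFFFFF


def split_rid_pool(value):
    """Split the 64-bit value into (upper ceiling, lower allocated count)."""
    return value >> 32, value & LOW_MASK


def raise_rid_pool(value, increment=100_000):
    """Return a new 64-bit value with the lower part raised by increment."""
    high, low = split_rid_pool(value)
    new_low = low + increment
    if new_low > high:
        raise ValueError("increment would exceed the domain's RID ceiling")
    return (high << 32) | new_low


if __name__ == "__main__":
    current = (0x3FFFFFFF << 32) | 2600  # example value, not taken from a real domain
    print(split_rid_pool(current))
    print(split_rid_pool(raise_rid_pool(current)))
```

Only the lower half changes; the upper half must be carried over untouched when the new value is written back.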


AD Forest Recovery - High Level Steps

• White paper http://download.microsoft.com/download/win2000srv/Utility/1.001/NT5/EN-US/forestrecovery.exe
• AD Fast recovery (VSS) – white paper available


Where Is Exchange Information Stored?

• Registry settings and metabase
    – System state backup
• AD Directory Objects store “Recipient” information
    – Users, Groups, and Contacts
    – Replicated to GCs
    – Most Exchange information placed on existing objects is replicated between Global Catalogs
• AD Configuration
    – Exchange System Objects
    – Public Folder Directory entries
    – Active Directory Connector (ADC) settings


Levels Of Disaster Recovery

• Restoring mailboxes
    – Recovery Storage Group / Separate server / 3rd party backup utility
• Restoring one or more Exchange databases
    – Backup software
• Restoring multiple databases - single storage group
    – Backup software
• Complete disaster - full server recoveries


Move Exchange To New Hardware (Exchange 2003 = GC)

• If server is a domain controller:
    – Deletion of computer account / NTDS Settings Object
    – DCPROMO /FORCEREMOVAL – “NEW”
• Keeping the same server name
    – Take existing Exchange 2003 computer offline
    – Reset existing Exchange 2003 computer account
    – Bring the new computer online using same name
    – Log on using Exchange 2003 Full Administrator account
    – Exchange 2003 Setup /disasterrecovery
    – Mount stores - check client connectivity and mail flow


Using Exchange 2003 Stand-By Recovery Server

• What you need
    – System State backup
    – C:\Windows folder backup
    – Exchange 2003 database backups
• Steps to recover
    – Start stand-by server
    – Restore %SystemRoot% folder and System State
    – Run Exchange 2003 setup in disaster recovery mode
    – Restore databases
• Recovery Using Images
    – Drive signature issue prevents logon after recovery
        • Fix using Q249321 and Q223188


Recovery Storage Group

• One RSG per server / Information Store
• Restore mailbox DBs from same SG
• Restore SG/DBs from same AG
• User mailboxes remain disconnected
• Only MAPI protocol supported
• Restores go into the RSG by default
• Active/Passive: one recovery storage group per EVS
• ONE recovery storage group per cluster supported


Recovery Of Other Exchange 2003 Services

• Connectors
    – Lotus Notes
    – Novell GroupWise
    – Exchange Calendar Connector
• Custom OWA
• Clusters
    – Volume Mount Points
    – Majority Node Set (MNS) Clusters
    – Resource Kit clusdiag tool


Exchange 2003 Clustering

• What to back up
    – Cluster Administrative software
    – Quorum
    – System State
• Exchange 2003 Server Cluster Disaster Recovery types
    – Recover shared disk resource (Clusdb – Chkxxx.tmp Q224999)
    – Restore Quorum Resource
    – Replace a damaged node
    – Restore an entire Exchange 2003 cluster
    – Majority Node Set (MNS) Cluster, ASR for cluster
    – Windows 2000 to Windows Server 2003 rolling upgrades supported
    – Support for Mount Points


ASR For Clusters

• Automated System Recovery – ASR can completely restore a cluster in a variety of scenarios, including
    – damaged or missing system files
    – complete OS reinstallation due to hardware failure
    – a damaged Cluster database, and
    – changed disk signatures (including shared)


Removing orphaned Exchange Server

• Active Directory Sites and Services snap-in
• Services: Microsoft Exchange: organisation_name: Administrative Groups: Servers
• Delete same named server object
• If cluster is gone you cannot delete Exchange Virtual Server resources from AD
• Bind to DC using LDP:
    – Configuration\Services\Microsoft Exchange\Organization\Administrative Group\Servers
• Right click: Delete orphan EVS entries
• No option of Disaster Recovery Setup for EVS
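When browsing with LDP it helps to know the full distinguished name of the Servers container in advance. The sketch below builds that name from the path shown above; the organisation, administrative group, server, and domain names in the usage example are hypothetical, and the code does no escaping of special characters in names.

```python
def exchange_servers_container_dn(organization, admin_group, forest_root_dn):
    """Build the DN of the Servers container for one administrative group."""
    return ",".join([
        "CN=Servers",
        f"CN={admin_group}",
        "CN=Administrative Groups",
        f"CN={organization}",
        "CN=Microsoft Exchange",
        "CN=Services",
        "CN=Configuration",
        forest_root_dn,
    ])


def orphaned_server_dn(server_name, organization, admin_group, forest_root_dn):
    """Build the DN of a single (orphaned) server object under that container."""
    container = exchange_servers_container_dn(organization, admin_group, forest_root_dn)
    return f"CN={server_name},{container}"


if __name__ == "__main__":
    # All names below are made up for illustration.
    print(orphaned_server_dn("EVS01", "Contoso", "First Administrative Group",
                             "DC=contoso,DC=com"))
```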


Logical Versus Physical Corruption

• Three layers of corruption that can occur
    – Page level
    – ESE level
    – Store level
• To remove corruption
    – Restore an uncorrupted backup of the database
    – Repair the database
    – Expunge the corrupted pages from the database
    – Salvage data and generate a new database


Errors 1018 and 1019

• Error 1018: JET_errReadVerifyFailure
    – Bad checksum / Wrong page number
• Hardware / Firmware
• File system corruption
• How serious are 1018 Errors?
    – During normal operation (somewhat serious)
    – During startup (likely fatal)
    – During backup (may be minor)
• Error 1019: JET_errPageNotInitialized
• What causes Error 1019?
    – Special case of error 1018 (page is replaced with zeroes)
    – Bad page links
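The two page-level checks behind these errors (checksum and page number, with an all-zero page as the special case) can be modelled in a few lines. This is a teaching sketch only: the 8-byte header layout and the XOR checksum are stand-ins invented for the example and are not the real ESE page format or checksum algorithm. The 4 KB page size matches Exchange 2003.

```python
import struct

PAGE_SIZE = 4096                # Exchange 2003 databases use 4 KB pages
HEADER = struct.Struct("<II")   # hypothetical layout: checksum, page number

JET_errReadVerifyFailure = -1018
JET_errPageNotInitialized = -1019


def xor_checksum(data):
    """XOR all little-endian 32-bit words (a stand-in, not the real ESE checksum)."""
    total = 0
    for (word,) in struct.iter_unpack("<I", data):
        total ^= word
    return total


def make_page(page_number, payload):
    """Build a page whose header records its own checksum and page number."""
    body_size = PAGE_SIZE - HEADER.size
    if len(payload) > body_size:
        raise ValueError("payload too large for one page")
    numbered = struct.pack("<I", page_number) + payload.ljust(body_size, b"\x00")
    return struct.pack("<I", xor_checksum(numbered)) + numbered


def verify_page(page, expected_page_number):
    """Return 0 if the page is sound, otherwise the matching error code."""
    if len(page) != PAGE_SIZE:
        raise ValueError("not a full page")
    if page == bytes(PAGE_SIZE):
        return JET_errPageNotInitialized       # page replaced with zeroes
    stored_checksum, stored_number = HEADER.unpack_from(page)
    if stored_checksum != xor_checksum(page[4:]):
        return JET_errReadVerifyFailure        # bad checksum
    if stored_number != expected_page_number:
        return JET_errReadVerifyFailure        # wrong page number
    return 0
```

A page written to the wrong offset passes the checksum test but fails the page-number test, which is why both are checked.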


Errors 1022 and 1216

• Error 1022: JET_errDiskIO
    – Disk I/O failure
    – File damage or truncation
    – File locked by another process
    – Anti-virus software
• Error 1216 (Q296843) files in the database's running set are missing or have been replaced
    – When storage group starts system analyses header information
• If logs are missing:
    – Restore the database from backup
    – Repair the database by using
        • ESEUTIL /P followed by
        • ESEUTIL /D and
        • ISINTEG -fix
        • Q296843 – more details
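The repair sequence is order-sensitive, so it is worth writing down as data before anyone types it under pressure. The sketch below only builds the three command lines in the order given above; it does not execute them. The function name, database path, and server name are hypothetical, and the "-test alltests" argument to ISINTEG is a commonly used form that is assumed here rather than taken from the slide.

```python
def repair_commands(database_path, server_name):
    """Return the repair commands in the order they must be run.

    1. eseutil /p  - hard repair (can discard damaged pages)
    2. eseutil /d  - offline defragmentation to rebuild the database
    3. isinteg     - fix store-level inconsistencies left by the repair
    """
    return [
        ["eseutil", "/p", database_path],
        ["eseutil", "/d", database_path],
        ["isinteg", "-s", server_name, "-fix", "-test", "alltests"],
    ]


if __name__ == "__main__":
    # Path and server name are made up for illustration.
    for argv in repair_commands(r"D:\Exchsrvr\mdbdata\priv1.edb", "EXCH01"):
        print(" ".join(argv))
```

Because a hard repair can lose data, restoring from backup remains the first choice and this sequence the fallback.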


Conclusion

• Review your disaster recovery plan when upgrading / deploying Exchange 2000/2003
• Back up all data needed for full recovery
• Verify disaster recovery and restore plans through drills
• Read Exchange 2003 mailbox and disaster recovery whitepapers regularly
• Audit your Best Practices
• Request Microsoft PSS Operations Assessment


© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.