ECE590-03 Enterprise Storage Architecture Fall 2017 Business Continuity: Disaster Recovery Tyler Bletsch Duke University Includes material adapted from the course “Information Storage and Management v2” (modules 9-12), published by EMC corporation.


Page 1: ECE590-03 Enterprise Storage Architecture Fall 2016people.duke.edu/~tkb13/courses/ece590-2017fa/slides/11-dr.pdf · Designing and Developing Implementing Training, Testing, Assessing,

ECE590-03 Enterprise Storage Architecture

Fall 2017

Business Continuity: Disaster Recovery Tyler Bletsch

Duke University

Includes material adapted from the course “Information Storage and Management v2” (modules 9-12), published by EMC corporation.

EMC Proven Professional. Copyright © 2012 EMC Corporation. All Rights Reserved.

BC Terminologies – 1

• Disaster recovery

Coordinated process of restoring systems, data, and infrastructure required to support business operations after a disaster occurs

Restoring previous copy of data and applying logs to that copy to bring it to a known point of consistency

Generally implies use of backup technology

• Disaster restart

Process of restarting business operations with mirrored consistent copies of data and applications

Generally implies use of replication technologies

Module 9: Introduction to Business Continuity 2

BC Terminologies – 2

Recovery-Point Objective (RPO)

• Point-in-time to which systems and data must be recovered after an outage

• Amount of data loss that a business can endure

Recovery-Time Objective (RTO)

• Time within which systems and applications must be recovered after an outage

• Amount of downtime that a business can endure and survive
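As a rough illustration of how the two objectives are checked, the sketch below treats RPO as a bound on the gap between recovery points and RTO as a bound on restore time. The function names and the sample numbers are hypothetical and are not taken from the course material.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With periodic copies, a failure just before the next copy loses one full interval."""
    return backup_interval


def meets_objectives(backup_interval, restore_time, rpo, rto):
    """Return (rpo_ok, rto_ok) for a protection scheme against the stated objectives."""
    return worst_case_data_loss(backup_interval) <= rpo, restore_time <= rto


# Hypothetical example: nightly backup with a 6-hour restore,
# checked against a requirement of RPO = 1 hour and RTO = 8 hours.
rpo_ok, rto_ok = meets_objectives(
    backup_interval=timedelta(hours=24),
    restore_time=timedelta(hours=6),
    rpo=timedelta(hours=1),
    rto=timedelta(hours=8),
)
print(f"RPO met: {rpo_ok}, RTO met: {rto_ok}")
```

In this made-up case the restore is fast enough, but a nightly schedule can lose up to a day of data, so the RPO is missed and a replication-style technology would be needed.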

[Figure: technologies placed on two scales that each run from seconds to weeks. Recovery-point objective, largest to smallest: Tape Backup, Periodic Replication, Asynchronous Replication, Synchronous Replication. Recovery-time objective, longest to shortest: Tape Restore, Disk Restore, Manual Migration, Global Cluster.]

RPO vs RTO

BC Planning Lifecycle

Establishing Objectives

Analyzing

Designing and Developing

Implementing

Training, Testing, Assessing, and Maintaining

Business Impact Analysis

• Identifies which business units and processes are essential to the survival of the business

• Estimates the cost of failure for each business process

• Calculates the maximum tolerable outage and defines RTO for each business process

• Businesses can prioritize and implement countermeasures to mitigate the likelihood of such disruptions
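A business impact analysis can be sketched as a small ranking exercise. The example below is only an illustration: the process names, cost figures, and the rule that sets RTO to a fraction of the maximum tolerable outage are all assumptions, not part of the EMC material.

```python
from dataclasses import dataclass


@dataclass
class BusinessProcess:
    name: str
    cost_per_hour: float           # estimated loss per hour of outage (hypothetical)
    max_tolerable_outage_h: float  # longest outage the business can survive, in hours


def prioritize(processes):
    """Order processes so the most expensive outages are addressed first."""
    return sorted(processes, key=lambda p: p.cost_per_hour, reverse=True)


def rto_hours(process, safety_factor=0.5):
    """Assumed policy: set RTO to a fraction of the maximum tolerable outage."""
    return process.max_tolerable_outage_h * safety_factor


processes = [
    BusinessProcess("order entry", 50_000.0, 4.0),
    BusinessProcess("payroll", 2_000.0, 72.0),
    BusinessProcess("email", 10_000.0, 24.0),
]
ranked = prioritize(processes)
for p in ranked:
    print(f"{p.name}: cost/hour={p.cost_per_hour:.0f}, RTO={rto_hours(p)} h")
```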

BC Technology Solutions

• Solutions that enable BC are:

Resolving single points of failure

Multipathing software

Backup and replication

Backup

Local replication

Remote replication

Single Points of Failure

A single point of failure refers to the failure of a component of a system that can terminate the availability of the entire system or IT service.

[Figure: a configuration in which each component is a single point of failure. Labels: Client, IP Switch, Server, FC Switch, Storage Array, Array port]

Resolving Single Points of Failure

[Figure: a configuration with redundancy at every layer. Labels: Client, IP, Redundant Network, NIC Teaming (NICs), Clustered Servers, HBAs, Redundant Paths, Redundant FC Switches, Redundant Ports, Redundant Arrays (Production Storage Array, Remote Storage Array)]

Multipathing Software

• Recognizes and utilizes alternate I/O paths to data

• Provides load balancing by distributing I/Os to all available, active paths:

Improves I/O performance and data path utilization

• Intelligently manages the paths to a device by sending I/O down the optimal path:

Based on the load balancing and failover policy setting for the device
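The behavior described above can be sketched as a tiny path selector. This is a minimal illustration assuming a round-robin load balancing policy with failover to the surviving paths; the class name, method names, and path labels are hypothetical and do not correspond to any particular multipathing product.

```python
from itertools import count


class MultipathDevice:
    """Round-robin over the active paths to one device, skipping failed paths."""

    def __init__(self, paths):
        self.state = {p: True for p in paths}  # path name -> is the path alive?
        self._counter = count()

    def mark_failed(self, path):
        self.state[path] = False

    def mark_restored(self, path):
        self.state[path] = True

    def next_path(self):
        """Pick the path for the next I/O; raise if no path is left."""
        active = [p for p, alive in self.state.items() if alive]
        if not active:
            raise IOError("all paths to the device are down")
        return active[next(self._counter) % len(active)]


dev = MultipathDevice(["hba0:portA", "hba1:portB"])
balanced = [dev.next_path() for _ in range(2)]   # I/O alternates across both paths
dev.mark_failed("hba0:portA")                    # simulate an HBA or cable failure
after_failure = [dev.next_path() for _ in range(3)]
print(balanced, after_failure)
```

With both paths alive the I/Os are spread across them; once one path fails, every I/O is sent down the remaining path without the application noticing.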

Backup and Archive

Tyler’s Immutable Rules Of Backup

A BACKUP SOLUTION MUST:

1. Record changes to data over time
• If I just have the most recent copy, then I just have the most recently corrupted copy.
RESULT: MIRRORING ISN’T BACKUP!!!!

2. Have a copy at a separate physical location
• If all copies are in one place, then a simple fire or lightning event can destroy all copies

3. Be automatic
• When you get busy, you’ll forget, and busy people make the most important data

4. Require separate credentials to access
• If one compromised account can wipe primary and secondary, then that account is a single point of failure

5. Be unwritable by anyone except the backup software (which ideally should live in the restricted backup environment)
• If I can cd to a directory and change backups, then the same mistake/attack that killed the primary can kill the backup

6. Reliably report on progress and alert on failure
• I need to know if it stopped working or is about to stop working

7. Have periodic recovery tests to ensure the right data is being captured
• Prevent “well it apparently hasn’t been backing up properly all along, so we’re screwed”

If you encounter backups that don’t meet these rules, explain the potential dangers until they do!

What is Backup?

Backup is an additional copy of production data that is created and retained for the sole purpose of recovering lost or corrupted data.

• Organizations also take backups to comply with regulatory requirements

• Backups are performed to serve three purposes:
Disaster recovery
Operational recovery
Archive

Module 10: Backup and Archive 13

Backup Granularity

[Figure: amount of data backed up per day over several weeks, for three schemes. Full Backup: a full backup every Sunday. Incremental Backup: a full backup on Sunday, then incremental backups Monday through Saturday. Cumulative (Differential) Backup: a full backup on Sunday, then cumulative backups Monday through Saturday.]

Restoring from Incremental Backup

[Figure: restore timeline]
Monday: Full Backup (Files 1, 2, 3)
Tuesday: Incremental (File 4)
Wednesday: Incremental (Updated File 3)
Thursday: Incremental (File 5)
Friday: Production (Files 1, 2, 3, 4, 5)

• Fewer files to back up, so the backup takes less time and requires less storage space

• Longer restore because the last full and all subsequent incremental backups must be applied
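The restore chain for incremental backups can be sketched as follows. The file names follow the Monday to Friday example on this slide; the version strings and the dictionary representation are assumptions made purely for illustration.

```python
def restore_incremental(full, incrementals):
    """Apply the full backup, then every incremental in chronological order."""
    restored = dict(full)
    for inc in incrementals:
        restored.update(inc)  # a newer copy of a file overwrites the older one
    return restored


monday_full = {"File 1": "v1", "File 2": "v1", "File 3": "v1"}
tuesday = {"File 4": "v1"}
wednesday = {"File 3": "v2"}  # the updated File 3
thursday = {"File 5": "v1"}

production = restore_incremental(monday_full, [tuesday, wednesday, thursday])
print(sorted(production))
print(production["File 3"])
```

Skipping or reordering any incremental would leave production missing a file or holding a stale version, which is why every backup in the chain has to be readable.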

Restoring from Cumulative Backup

[Figure: restore timeline]
Monday: Full Backup (Files 1, 2, 3)
Tuesday: Cumulative (File 4)
Wednesday: Cumulative (Files 4, 5)
Thursday: Cumulative (Files 4, 5, 6)
Friday: Production (Files 1, 2, 3, 4, 5, 6)

• More files to back up, so the backup takes more time and requires more storage space

• Faster restore because only the last full and the last cumulative backup must be applied
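A matching sketch for cumulative backups, again using the file names from this slide with made-up version strings. It also counts how many file copies the cumulative backups store over the week, to make the storage trade-off concrete.

```python
def restore_cumulative(full, cumulatives):
    """Apply the full backup plus only the most recent cumulative backup."""
    restored = dict(full)
    if cumulatives:
        restored.update(cumulatives[-1])
    return restored


monday_full = {"File 1": "v1", "File 2": "v1", "File 3": "v1"}
tuesday = {"File 4": "v1"}
wednesday = {"File 4": "v1", "File 5": "v1"}
thursday = {"File 4": "v1", "File 5": "v1", "File 6": "v1"}
week = [tuesday, wednesday, thursday]

production = restore_cumulative(monday_full, week)
copies_stored = sum(len(backup) for backup in week)  # 1 + 2 + 3 file copies
print(sorted(production), copies_stored)
```

Only two backups are read during the restore, but File 4 is stored three times and File 5 twice, whereas an incremental scheme would store each new file once.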

Backup Architecture

• Backup client
Gathers the data that is to be backed up and sends it to the storage node

• Backup server
Manages backup operations and maintains the backup catalog

• Storage node
Responsible for writing data to the backup device
Manages the backup device

[Figure labels: Backup Client (Application Server), Backup Data, Storage Node, Backup Device, Tracking Information, Backup Server, Backup Catalog]

Understanding this traditional model

[Figure: the backup architecture diagram again (Backup Client (Application Server), Storage Node, Backup Device, Backup Server, Backup Catalog), with these annotations:]

• Your storage server could be both of these (assuming the onboard disks are your backup media)

• A storage server could even be all three if index data is just kept with backup data

• This could be disk or tape

Backup Operation

[Figure: numbered message flow among Application Servers (Backup Clients), Backup Server, Storage Node, and Backup Device]

1. Backup server initiates scheduled backup process.
2. Backup server retrieves backup-related information from the backup catalog.
3a. Backup server instructs storage node to load backup media in backup device.
3b. Backup server instructs backup clients to send data to be backed up to storage node.
4. Backup clients send data to storage node and update the backup catalog on the backup server.
5. Storage node sends data to backup device.
6. Storage node sends metadata and media information to backup server.
7. Backup server updates the backup catalog.
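The sequence of steps in a backup operation can be walked through with plain in-memory stand-ins. This is a simplified sketch, not a model of any real backup product: the function name, the dictionary-based catalog and device, and the media label are all hypothetical, and the client's own catalog update in step 4 is folded into step 7.

```python
def run_backup(client_data, catalog, device):
    """Walk through the backup steps, returning a log line per step."""
    log = ["1: backup server initiates scheduled backup"]
    last = catalog.get("last_backup")
    log.append(f"2: backup server reads catalog, last backup = {last}")
    log.append("3a: storage node loads backup media in backup device")
    log.append("3b: backup clients are told to send data to storage node")
    staged = dict(client_data)  # data now held by the storage node
    log.append(f"4: clients send {len(staged)} files to storage node")
    device.update(staged)       # storage node writes to the backup device
    log.append("5: storage node sends data to backup device")
    metadata = {"files": sorted(staged), "media": "media-0001"}  # hypothetical label
    log.append("6: storage node sends metadata and media info to backup server")
    catalog["last_backup"] = metadata
    log.append("7: backup server updates the backup catalog")
    return log


catalog, device = {}, {}
steps = run_backup({"a.txt": b"alpha", "b.txt": b"beta"}, catalog, device)
print("\n".join(steps))
```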

Recovery Operation

[Figure: numbered message flow among Application Servers (Backup Clients), Backup Server, Storage Node, and Backup Device]

1. Backup client requests a data restore from the backup server.
2. Backup server scans backup catalog to identify data to be restored and the client that will receive data.
3. Backup server instructs storage node to load backup media in backup device.
4. Data is then read and sent to backup client.
5. Storage node sends restore metadata to backup server.
6. Backup server updates the backup catalog.

Backup Methods

• Two methods of backup, based on the state of the application when the backup is performed

Hot or Online

Application is up and running, with users accessing their data during backup

Open file agent can be used to back up open files

Cold or Offline

Requires application to be shut down during the backup process

• Bare-metal recovery

OS, hardware, and application configurations are appropriately backed up for a full system recovery

Server configuration backup (SCB) can also recover a server onto dissimilar hardware

Server Configuration Backup

• Creates and backs up server configuration profiles, based on user-defined schedules

Profiles are used to configure the recovery server in case of production server failure

Profiles include OS configurations, network configurations, security configurations, registry settings, application configurations

• Two types of profiles used

Base profile

Contains the key elements of the OS required to recover the server

Extended profile

Typically larger than base profile and contains all necessary information to rebuild application environment

Modern virtual environment note

• In a modern cluster of hypervisors, you don’t worry so much about server configuration

• All servers are similar: they’re just dumb hosts for the hypervisor

• Virtual machines are the true unit of backup in this case

Figure from VMware docs here.

Key Backup/Restore Considerations

• Customer business needs determine:

What are the restore requirements – RPO & RTO?

Which data needs to be backed up?

How frequently should data be backed up?

How long will it take to back up?

How many copies to create?

How long to retain backup copies?

Location, size, and number of files?

Lesson 2: Backup Topologies and Backup in NAS Environment

During this lesson the following topics are covered:
• Common backup topologies
• Backup in NAS environment

Direct-Attached Backup

[Figure labels: Application Server/Backup Client/Storage Node, Backup Device, Backup Data, Backup Server, Metadata, LAN]

LAN-based Backup

[Figure labels: Application Server/Backup Client, Backup Server, Storage Node, Backup Device, Backup Data, Metadata, LAN]

SAN-based Backup

[Figure labels: Application Server/Backup Client, Backup Server, Storage Node, Backup Device, Backup Data, Metadata, LAN, FC SAN]

Mixed Backup Topology

[Figure labels: Application Server-1/Backup Client, Application Server-2/Backup Client, Backup Server, Storage Node, Backup Device, Backup Data, Metadata, LAN, FC SAN]

Backup in NAS Environment

• Common backup implementations in a NAS environment are:

Server-based backup

Serverless backup

NDMP 2-way backup

NDMP 3-way backup

Server-based backup

[Figure labels: Application Server/Backup Client, NAS Head, Storage Array, Backup Server/Storage Node, Backup Device, Backup Data, Metadata, FC SAN, LAN]

Serverless Backup

[Figure labels: Application Server, NAS Head, Storage Array, Backup Server/Storage Node/Backup Client, Backup Device, Backup Data, FC SAN, LAN]

NDMP 2-way Backup

[Figure labels: Application Server/Backup Client, NAS Head, Storage Array, Backup Device, Backup Server, Backup Data, Metadata, FC SAN, LAN]

NDMP 3-way Backup

[Figure labels: Application Server/Backup Client, Backup Server, two NAS Heads, Storage Array, Backup Device, Backup Data, Metadata, two FC SANs, Private LAN, LAN]

Backup consistency

• Assume live (“hot”) backup

• Is data crash-consistent, or can we do better?

• Quiesce: To make consistent at this time (quiescent).

• Tell the OS that you’re about to take a snapshot, request quiescence

• OS flushes all buffers and commits the journal, pauses all IO, says OK

• Take snapshot

• Allow OS to resume

• Base the backup (which takes longer) off this snapshot

• Resulting backup is OS consistent

• Can also be application-aware

• Same as above, but you tell the application to quiesce

• Requires backup-aware applications (e.g. Microsoft SQL Server, Oracle database, etc.)

• Resulting backups are application consistent
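The quiesce, snapshot, resume, then back up ordering can be sketched with a context manager. Everything here is a stand-in: the function names, the event strings, and the volume name are hypothetical, and a real implementation would call the OS's or application's own freeze and snapshot facilities.

```python
from contextlib import contextmanager

events = []  # records the order of operations for inspection


@contextmanager
def quiesced(target):
    """Hold `target` (an OS or an application) quiescent, then always resume it."""
    events.append(f"{target}: flush buffers, commit journal, pause I/O")
    try:
        yield
    finally:
        events.append(f"{target}: resume I/O")


def take_snapshot(volume):
    events.append(f"snapshot {volume}")
    return f"{volume}@snap"


def backup_from(snapshot):
    events.append(f"backup from {snapshot}")  # the slow part, done after resume


with quiesced("os"):
    snap = take_snapshot("vol0")
backup_from(snap)
print("\n".join(events))
```

The design point is that I/O is paused only for the brief snapshot; the long-running backup reads from the snapshot after the system has resumed.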

Lesson 3: Backup Targets

During this lesson the following topics are covered:
• Backup to Tape
• Backup to Disk
• Backup to Virtual Tape

Backup to Tape

• Traditionally a low-cost solution

• Tape drives are used to read/write data from/to a tape

• Sequential/linear access

• Multiple streaming to improve media performance

Writes data from multiple streams on a single tape

• Limitations of tape

Backup and recovery operations are slow due to sequential access

Wear and tear of tape

Shipping/handling challenges

Controlled environment is required for tape storage

Causes “shoe shining effect” or “backhitching”

Backup to Disk

• Enhanced overall backup and recovery performance

Random access

• More reliable

• Can be accessed by multiple hosts simultaneously

[Figure: bar chart of recovery time in minutes, on a scale from 0 to 120]
Typical scenario: 800 users, 75 MB mailbox, 60 GB database
Tape Backup/Restore: 108 Minutes
Disk Backup/Restore: 24 Minutes
Source: EMC Engineering and EMC IT
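The chart's numbers can be turned into a rough effective restore rate. This is back-of-the-envelope arithmetic, assuming decimal units (1 GB = 1000 MB) and assuming the whole recovery time is spent moving the 60 GB database, which a real restore would not strictly satisfy.

```python
def throughput_mb_per_s(size_gb, minutes):
    """Effective rate if `size_gb` is moved in `minutes` (decimal units assumed)."""
    return size_gb * 1000 / (minutes * 60)


users, mailbox_mb = 800, 75
database_gb = users * mailbox_mb / 1000  # 800 mailboxes of 75 MB each

tape_rate = throughput_mb_per_s(database_gb, 108)
disk_rate = throughput_mb_per_s(database_gb, 24)
speedup = 108 / 24

print(f"database: {database_gb} GB")
print(f"tape: {tape_rate:.2f} MB/s, disk: {disk_rate:.2f} MB/s, speedup: {speedup}x")
```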

Backup to Virtual Tape

• Disks are emulated and presented as tapes to backup software

• Does not require any additional modules or changes in the legacy backup software

• Provides better single stream performance and reliability over physical tape

• Online and random disk access

Provides faster backup and recovery

Virtual Tape Library

[Figure labels: Backup Clients, LAN, Backup Server/Storage Node, FC SAN, Virtual Tape Library Appliance (Emulation Engine, Storage (LUNs))]

Backup Target Comparison

                                   Tape                             Disk                               Virtual Tape
Offsite Replication Capabilities   No                               Yes                                Yes
Reliability                        No inherent protection methods   RAID, spare                        RAID, spare
Performance                        Low                              High                               High
Use                                Backup only                      Multiple (backup and production)   Backup only

In defense of tape

• These slides omit a key feature of tape that’s the reason it’s still not dead.

• You can stick a tape in a vault for 20 years and probably still read it. A tape can’t have a head crash, bad bearing, or flaky controller board.

• Tape is crazy expensive compared to most other backup techniques, but if you need extreme archival capability, it’s not wrong to use tape.

Lesson 4: Data Deduplication

During this lesson the following topics are covered:
• Deduplication overview
• Deduplication methods
• Deduplication implementations
• Key benefits of deduplication

Lesson 5: Backup in Virtualized Environment

During this lesson the following topics are covered:
• Traditional backup approach
• Image-based backup

Backup in Virtualized Environment Overview

• Backup options

Traditional backup approach

Image-based backup approach

Traditional Backup Approaches

• Backup agent on VM
Requires installing a backup agent on each VM running on a hypervisor
Can only back up virtual disk data
Does not capture VM files such as VM swap file, configuration file
Challenge in VM restore

• Backup agent on Hypervisor
Requires installing backup agent only on hypervisor
Backs up all the VM files

[Figure: two hosts side by side. In one, the backup agent runs on each VM; in the other, the backup agent runs on the hypervisor.]

Image-based Backup

• Creates a copy of the guest OS, its data, VM state, and configurations

The backup is saved as a single file – “image”

Mounts image on a proxy server

Offloads backup processing from the hypervisor

• Enables quick restoration of VM

[Figure labels: Application Server, Storage, Snapshots, Mount, Proxy Server, Backup Device]

Lesson 6: Data Archive

During this lesson the following topics are covered:
• Fixed content
• Data archive
• Archive solution architecture

Fixed Content

• Fixed content is growing at more than 90% annually

Significant amount of newly created information falls into this category

New regulations require retention and data protection

Examples of Fixed Content

Electronic Documents
• Contracts and claims
• Email attachments
• Financial spreadsheets
• CAD/CAM designs
• Presentations

Digital Records
• Documents: checks, securities trades; historical preservation
• Photographs: personal/professional
• Surveys: seismic, astronomic, geographic

Rich Media
• Medical: X-rays, MRIs, CT scans
• Video: news/media, movies; security surveillance
• Audio: voicemail; radio

Data Archive

• A repository where fixed content is stored

• Enables organizations to retain their data for an extended period of time in order to

Meet regulatory compliance

Plan new revenue strategies

• Archive can be implemented as

Online

Nearline

Offline

Challenges of Traditional Archiving Solutions

• Both tape and optical media are susceptible to wear and tear

Involve operational, management, and maintenance overhead

• Have no intelligence to identify duplicate data

Same content could be archived many times

• Inadequate for long-term preservation (years to decades)

• Unable to provide online and fast access to fixed content

Content Addressed Storage – An Archival Solution

• Disk-based storage that has emerged as an alternative to traditional archiving solutions

• Provides online accessibility to archive data

• Enables organizations to meet their required SLAs

• Provides features that are required for storing archive data

Content authenticity and content integrity

Location independence

Single-instance storage

Retention enforcement

Data protection
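A minimal sketch of how content addressing can give single-instance storage and an integrity check, assuming the address is a SHA-256 digest of the object (the slides do not name a hash function; the class and method names here are hypothetical):

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store where an object's address is derived from its content."""

    def __init__(self):
        self._objects = {}  # address (hex digest) -> content bytes

    def put(self, content: bytes) -> str:
        # Assumption: SHA-256 is used to derive the content address.
        address = hashlib.sha256(content).hexdigest()
        # Single-instance storage: identical content maps to the same address,
        # so a second put of the same bytes stores nothing new.
        self._objects.setdefault(address, content)
        return address

    def get(self, address: str) -> bytes:
        content = self._objects[address]
        # Content integrity: recompute the digest and compare it to the address.
        if hashlib.sha256(content).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return content

    def object_count(self) -> int:
        return len(self._objects)
```

Because the caller holds only the address, not a path or device name, the object can be moved between disks or nodes without changing how it is retrieved, which is one way to read "location independence".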

Archiving Solution Architecture

[Figure: archiving solution components: an archiving agent on the email server and on the file server, an archiving server, and an archiving storage device]

Use Case: Email Archiving

• Moves the emails from primary to archive storage, based on policy

• Saves space on primary storage

• Enables emails to be retained in the archive for a longer period to meet regulatory requirements

• Gives end users virtually unlimited mailbox space

• File archiving is another use case that benefits from an archival solution
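A minimal sketch of a policy-based move from primary to archive storage, assuming the policy is a simple age threshold. The 90-day default, the `Email` fields, and the function name are hypothetical; real archiving products support richer policies.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Email:
    subject: str
    received: date
    size_kb: int


def apply_archive_policy(primary: list, archive: list, today: date,
                         max_age_days: int = 90) -> int:
    """Move emails older than max_age_days to the archive.

    Returns the space freed on primary storage, in KB.
    """
    cutoff = today - timedelta(days=max_age_days)
    keep = []
    freed_kb = 0
    for email in primary:
        if email.received < cutoff:
            archive.append(email)      # retained in the archive
            freed_kb += email.size_kb  # no longer occupies primary storage
        else:
            keep.append(email)
    primary[:] = keep  # update the primary mailbox in place
    return freed_kb
```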

Local replication

Module 11: Local Replication

What is Replication?

• Replication is the process of creating an exact copy (replica) of data

• Replication can be classified as:

Local replication: replicating data within the same array or data center

Remote replication: replicating data at a remote site

[Figure: data is replicated from a source to a replica (target)]

Uses of Local Replica

• Alternate source for backup

• Fast recovery

• Decision support activities

• Testing platform

• Data migration

Why local replication?

• Remember my rules?

• Local replication is useful:

• Can have lower RPO/RTO

• Can be cheaper

• May be sufficient for non-critical workloads where data loss is survivable

• Local replication is useful but not sufficient for business-critical workloads!

Replica Characteristics

• Recoverability/Restartability

Replica should be able to restore data on the source device

Restart business operation from replica

• Consistency

Replica must be consistent with the source

• Choice of replica type ties back to RPO

Point-in-Time (PIT): non-zero RPO

Continuous: near-zero RPO
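To make the non-zero RPO of a PIT replica concrete, here is a small sketch that computes how much recent data would be lost if recovery used the latest PIT replica taken before a failure. Times are plain minute counts and the function name is hypothetical.

```python
def data_loss_window(pit_times: list, failure_time: int) -> int:
    """Minutes of writes lost when restoring from the latest usable PIT replica."""
    usable = [t for t in pit_times if t <= failure_time]
    if not usable:
        raise ValueError("no replica exists before the failure")
    return failure_time - max(usable)


# Hypothetical schedule: PIT replicas at minute 0, 60, and 120; failure at minute 170.
lost_minutes = data_loss_window([0, 60, 120], failure_time=170)
```

With hourly replicas the loss can approach a full hour, so the replica interval effectively bounds the achievable RPO.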

Understanding Consistency

• Consistency ensures the usability of replica

• Consistency can be achieved in various ways for file systems and databases:

File system

Offline: unmount the file system

Online: flush host buffers

Database

Offline: shut down the database

Online: (a) use the dependent write I/O principle, or (b) hold I/Os to the source before creating the replica
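A toy sketch of why holding I/Os matters for an online database replica. It assumes a database with a log volume and a data volume, where each data write depends on its log record having been written first. All names are hypothetical, and real arrays hold I/O at the device level rather than in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Toy database spread over two volumes: a log volume and a data volume."""
    log: list = field(default_factory=list)
    data: list = field(default_factory=list)

    def commit(self, txn_id: int) -> None:
        # Dependent write: the data write is issued only after the log write.
        self.log.append(txn_id)
        self.data.append(txn_id)


def is_consistent(replica_log: list, replica_data: list) -> bool:
    """A replica is usable only if every data write has its log record."""
    return set(replica_data) <= set(replica_log)


def replicate_without_hold(src: Source, writes_during_copy: list):
    """Copy the log volume, let new writes land, then copy the data volume."""
    replica_log = list(src.log)
    for txn_id in writes_during_copy:
        src.commit(txn_id)
    replica_data = list(src.data)
    return replica_log, replica_data


def replicate_with_hold(src: Source, writes_during_copy: list):
    """Hold incoming I/O, copy both volumes, then release the held writes."""
    replica_log, replica_data = list(src.log), list(src.data)
    for txn_id in writes_during_copy:  # held writes are applied after the copy
        src.commit(txn_id)
    return replica_log, replica_data
```

Without the hold, the replica can contain data writes whose log records were never captured, which violates the dependency the database relies on during restart.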

Remote replication

What is Remote Replication?

• Process of creating replicas at remote sites

Addresses risk associated with regionally driven outages

• Modes of remote replication

Synchronous

Asynchronous

[Figure: replication from the storage array at the source site to the storage array at the remote site]

Module 12: Remote Replication

Synchronous Replication – 1

• A write is committed to both source and remote replica before it is acknowledged to the host

• Ensures source and replica have identical data at all times

Maintains write ordering

• Provides near-zero RPO

[Figure: numbered data-write and data-acknowledgment steps (1-4) among the host, the source, and the target at the remote site]
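A minimal sketch of the synchronous write path, assuming the usual ordering implied by the first bullet: the host is acknowledged only after both the source and the remote target have committed the write. The class and function names are hypothetical, and the step wording is an interpretation, since the figure's numbering did not survive extraction.

```python
class Array:
    """Toy storage array that records committed blocks."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = {}

    def commit(self, lba: int, data: bytes) -> bool:
        self.blocks[lba] = data
        return True  # acknowledgment


def synchronous_write(source: Array, target: Array, lba: int, data: bytes) -> list:
    """Return the ordered steps taken; the host acknowledgment is always last."""
    steps = []
    source.commit(lba, data)
    steps.append("host write received by source")
    target_acked = target.commit(lba, data)
    steps.append("write sent to target at remote site")
    if not target_acked:
        raise IOError("target did not acknowledge; host write cannot complete")
    steps.append("target acknowledges source")
    steps.append("source acknowledges host")
    return steps
```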

Synchronous Replication – 2

• Response time depends on bandwidth and distance

• Requires bandwidth greater than the maximum write workload

• Typically deployed for distance less than 200 km (125 miles) between two sites

[Figure: write workload (MB/s) over time; the required bandwidth line sits at the maximum of the typical workload]
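A rough sketch of why distance limits synchronous replication. It assumes light travels about 200 km per millisecond in optical fiber and ignores switching, protocol, and serialization overhead, so real numbers will be higher. The function names and the 0.5 ms array service time are hypothetical.

```python
# Assumption: signal speed in fiber is roughly 200,000 km/s, i.e. 200 km per ms.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float) -> float:
    """Propagation delay for the write to reach the target and the ack to return."""
    return 2 * distance_km / FIBER_KM_PER_MS


def sync_write_latency_ms(distance_km: float, array_service_ms: float) -> float:
    """Every synchronous write pays the array service time plus one round trip."""
    return array_service_ms + round_trip_ms(distance_km)


# Hypothetical example: 0.5 ms array service time at the 200 km limit on the slide.
latency_at_200_km = sync_write_latency_ms(200, 0.5)
```

At 200 km the round trip alone adds about 2 ms to every write; at 2000 km it would add about 20 ms, which most write-heavy applications cannot absorb.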

Asynchronous Replication – 1

• A write is committed to the source and immediately acknowledged to the host

• Data is buffered at the source and transmitted to the remote site later

• Finite RPO

Replica will be behind the source by a finite amount

[Figure: numbered data-write and data-acknowledgment steps (1-4) among the host, the source, and the target]
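A minimal sketch of asynchronous replication, assuming a simple FIFO buffer at the source that is drained to the target later. The class and method names are hypothetical.

```python
from collections import deque


class AsyncReplicator:
    """Toy source/target pair with a write buffer at the source."""

    def __init__(self):
        self.source = {}
        self.target = {}
        self.buffer = deque()  # writes waiting to be sent, in arrival order

    def write(self, lba: int, data: bytes) -> str:
        self.source[lba] = data
        self.buffer.append((lba, data))
        return "ack"  # host is acknowledged before the target has the data

    def drain(self, max_writes: int) -> int:
        """Transmit up to max_writes buffered writes to the target, oldest first."""
        sent = 0
        while self.buffer and sent < max_writes:
            lba, data = self.buffer.popleft()
            self.target[lba] = data
            sent += 1
        return sent

    def lag(self) -> int:
        """Number of writes the replica is behind the source."""
        return len(self.buffer)
```

Whatever is still in the buffer when the source site fails is lost, which is why the RPO is finite rather than near zero.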

Asynchronous Replication – 2

• RPO depends on size of buffer and available network bandwidth

• Requires bandwidth equal to or greater than average write workload

• Sufficient buffer capacity should be provisioned

• Can be deployed over long distances

[Figure: write workload (MB/s) over time; the required bandwidth line sits at the average of the typical workload]
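A small sketch of buffer sizing for asynchronous replication. It assumes one workload sample per second and a link that sends at a constant rate. The workload numbers and the function name are hypothetical.

```python
def peak_backlog_mb(writes_mb_per_s: list, link_mb_per_s: float) -> float:
    """Largest amount of unsent data (MB) the source buffer must hold."""
    backlog = 0.0
    peak = 0.0
    for written in writes_mb_per_s:
        # Each second: new writes arrive, the link drains what it can.
        backlog = max(0.0, backlog + written - link_mb_per_s)
        peak = max(peak, backlog)
    return peak


# Hypothetical bursty workload with an average of 40 MB/s and a peak of 80 MB/s.
workload = [20, 20, 80, 80, 20, 20]
average = sum(workload) / len(workload)

buffer_needed = peak_backlog_mb(workload, link_mb_per_s=average)
# Rough worst-case replica lag: time to drain the peak backlog over the link.
lag_seconds = buffer_needed / average
```

Sizing the link at the average instead of the peak saves bandwidth, but the buffer must then absorb the bursts, and the backlog at any moment is the data at risk.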

Host-based Remote Replication

• Replication is performed by host-based software

• LVM-based replication

All writes to the source volume group are replicated to the target volume group by the LVM

Can be synchronous or asynchronous

• Log shipping

Commonly used in a database environment

All relevant components of source and target databases are synchronized prior to the start of replication

Transactions to source database are captured in logs and periodically transferred to remote host
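A minimal sketch of log shipping, assuming a key-value "database" and a log of (key, value) transactions. The class and method names are hypothetical; real databases ship their own log formats.

```python
class LogShippingPair:
    """Toy source and target databases kept in step by periodic log transfer."""

    def __init__(self, initial: dict):
        # Source and target are synchronized before replication starts.
        self.source = dict(initial)
        self.target = dict(initial)
        self.pending_log = []  # transactions captured since the last shipment

    def execute(self, key: str, value: str) -> None:
        self.source[key] = value
        self.pending_log.append((key, value))

    def ship_log(self) -> int:
        """Transfer the captured log to the remote host and apply it in order."""
        shipped = len(self.pending_log)
        for key, value in self.pending_log:
            self.target[key] = value
        self.pending_log = []
        return shipped
```

Between shipments the target lags the source by whatever is in the pending log, so the shipping interval sets the RPO.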

Storage Array-based Remote Replication – 1

• Replication is performed by the array operating environment

• Three replication methods: synchronous, asynchronous, and disk buffered

• Synchronous

A write is committed to both source and replica before it is acknowledged to the host

• Asynchronous

Writes are committed to source and immediately acknowledged to host

Data is buffered at source and transmitted to remote site later

Storage Array-based Remote Replication – 2

• Disk-buffered

1. Production host writes data to source device.

2. A consistent PIT local replica of the source device is created.

3. Data from local replica is transmitted to the remote replica at target.

4. Optionally, a PIT local replica of the remote replica on the target is created.

[Figure: production host and source device with a local replica on the source array; remote replica with its own local replica on the target array]
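A toy sketch of one disk-buffered replication cycle, using dictionaries as stand-ins for devices. The function name and the `keep_remote_pit` flag are hypothetical.

```python
def disk_buffered_cycle(source_device: dict, remote_replica: dict,
                        keep_remote_pit: bool = False):
    """Run one cycle: PIT copy at the source, then transmit that copy to the target."""
    # Create a consistent PIT local replica of the source device.
    local_replica = dict(source_device)
    # Transmit the local replica (not the live source) to the remote replica.
    remote_replica.clear()
    remote_replica.update(local_replica)
    # Optionally keep a PIT local replica of the remote replica on the target.
    remote_pit = dict(remote_replica) if keep_remote_pit else None
    return local_replica, remote_pit
```

Writes that reach the source after the PIT copy is taken stay out of the remote replica until the next cycle, so the cycle interval drives the RPO.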

Three-site Replication

• Data from source site is replicated to two remote sites

Replication is synchronous to one of the remote sites and asynchronous or disk buffered to the other remote site

• Mitigates the risk in two-site replication

With only two sites, there is no DR protection after a source or remote site failure

• Implemented in two ways:

Cascade/multihop

Triangle/multitarget

Three-site Replication: Cascade/Multihop

• Synchronous + Disk Buffered

[Figure: the source device at the source site replicates synchronously to a remote replica at the bunker site; a local replica at the bunker site is then replicated, disk buffered, to a remote replica at the remote site]

• Synchronous + Asynchronous

[Figure: the source device at the source site replicates synchronously to a remote replica at the bunker site, which replicates asynchronously to a remote replica at the remote site]

Three-site Replication: Triangle/Multitarget

[Figure: the source device at the source site with remote replicas at the bunker site and the remote site; one link is labeled "asynchronous with differential resynchronization"]
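A toy model for comparing the topologies, assuming each topology is just a set of replication links and that "still protected" means at least one link survives between the remaining sites. This is a simplification: it treats the triangle's resynchronization path between bunker and remote as a usable link, and it ignores link failures. All names are hypothetical.

```python
def surviving_links(links: set, failed_site: str) -> set:
    """Replication links that do not touch the failed site."""
    return {link for link in links if failed_site not in link}


def still_protected(links: set, failed_site: str) -> bool:
    """True if the remaining sites can still replicate to each other."""
    return len(surviving_links(links, failed_site)) > 0


TWO_SITE = {("source", "remote")}
CASCADE = {("source", "bunker"), ("bunker", "remote")}
TRIANGLE = {("source", "bunker"), ("source", "remote"), ("bunker", "remote")}
```

In this model a two-site setup has no protection after either site fails, the cascade survives the loss of the source or the remote site but not the bunker, and the triangle survives the loss of any single site.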

Summary

• Disaster Recovery (DR) exists to handle cases where High Availability (HA) redundancy is overwhelmed

• For data, the key is backups; for compute, it’s secondary compute servers

• Backup isn’t just mirroring! Rules:

1. Record changes to data over time

2. Have a copy at a separate physical location

3. Must be automatic

4. Require separate credentials to access

5. Be unwritable by anyone except the backup software (which ideally should live in the restricted backup environment)

6. Reliably report on progress and alert on failure

7. Have periodic recovery tests to ensure the right data is being captured

• Can do replication locally (for low cost, low RTO/RPO) and/or remotely (true DR, RTO/RPO proportional to cost)