Data De-Duplication VMUG Dallas

March 26, 2008. Kyle Green, Director, South Central U.S. 972-768-4896 [email protected]

Page 1: Data De-Duplication VMUG Dallas

Confidential 1

Data De-Duplication
VMUG Dallas

March 26, 2008

Kyle Green
Director, South Central U.S.
972-768-4896
[email protected]

Page 2: Data De-Duplication VMUG Dallas

Where are we?

Page 3: Data De-Duplication VMUG Dallas

Where are we?

Page 4: Data De-Duplication VMUG Dallas

Hierarchy of Data Reduction Types

• Storage Array: 1:1
• LZ Compression: ~2x (white space reduction)
• Single Instance Storage: ~5x (file level)
• Fixed Block: ~8x (fixed blocks, snapshots)
• Data Deduplication: ~20x

Data Deduplication Significantly Reduces
• Power
• Heat
• Cooling
• Management
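The gap between file-level and block-level reduction in this hierarchy can be illustrated with a minimal Python sketch. The toy data, the 4-byte block size, the SHA-256 fingerprint, and the function names are all assumptions made for illustration; this is not any vendor's implementation, and the ratios it prints are not the slide's ratios.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Content fingerprint; SHA-256 is an assumption chosen for this sketch."""
    return hashlib.sha256(data).hexdigest()

def file_level_stored_bytes(files: list[bytes]) -> int:
    """Single instance storage: keep one copy of each byte-identical file."""
    unique = {fingerprint(f): len(f) for f in files}
    return sum(unique.values())

def fixed_block_stored_bytes(files: list[bytes], block_size: int) -> int:
    """Fixed-block reduction: keep one copy of each distinct block."""
    unique = {}
    for f in files:
        for i in range(0, len(f), block_size):
            block = f[i:i + block_size]
            unique[fingerprint(block)] = len(block)
    return sum(unique.values())

# Hypothetical toy data: two identical files and one with a single changed block.
original = b"AAAABBBBCCCCDDDD"
copy = b"AAAABBBBCCCCDDDD"
edited = b"AAAABBBBXXXXDDDD"
files = [original, copy, edited]

logical = sum(len(f) for f in files)
file_level = file_level_stored_bytes(files)
block_level = fixed_block_stored_bytes(files, block_size=4)
print(f"file level:  {logical / file_level:.1f}x")
print(f"fixed block: {logical / block_level:.1f}x")
```

File-level single instancing only helps when whole files match, so the edited file is stored in full. The block-level pass stores just the one changed block.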

Page 5: Data De-Duplication VMUG Dallas

Data Domain: Leadership and Innovation

Deduplication Storage Systems
• > 3,700 systems installed
• > 1,500 customers
• > 325 petabytes under Data Domain protection worldwide

A History of Industry Firsts (timeline: 2003 2004 2005 2006 2007)
• First Dedupe NAS
• First Dedupe Volume Replication
• First Dedupe Gateway
• Largest Dedupe Array
• First Dedupe Directory Replication
• First Dedupe VTL
• First Dedupe Nearline Storage

Page 6: Data De-Duplication VMUG Dallas

Storage 3.0 - The Long Term Play

Storage 1.0: PRIMARY, TAPE

Storage 2.0: PRIMARY, SATA & RAID, TAPE

Storage 3.0: PRIMARY, Deduplicated Storage, TAPE

Page 7: Data De-Duplication VMUG Dallas

Key Attributes of Data Domain Technology

Easily Integrates with Existing Infrastructure

Retention: Deduplication

Recovery: Data Invulnerability Architecture

Replication: WAN Efficient

Data Domain Deduplication Storage for Nearline Applications

Page 8: Data De-Duplication VMUG Dallas

Today’s Data Protection Challenges

Challenges
• Massive data growth
• Economic pressures
• Regulatory compliance
• Challenges with tape
  • Questionable reliability
  • Mechanical failures
  • DR via trucks
  • Longer recovery times

The Solution

Page 9: Data De-Duplication VMUG Dallas

Easily Integrates with Existing Infrastructure

• 3U
• (15) 500 GB SATA drives
• RAID-6
• NVRAM
• N+1 Fan
• 1 - 4 Ports
• 5.4 to 21.6 TB with Shelves

File System
(Gateway to: EMC, HDS, Nexsan, Pillar, NetApp, 3PAR)

• Ethernet: CIFS, NFS, NDMP
• FC = VTL
• Replication

No rip and replace.

plus other nearline applications

Page 10: Data De-Duplication VMUG Dallas

Data Deduplication: Under the Hood

Backup streams:
Friday Full Backup          A B C D A E F G
Mon Incr                    A B H
Tues Incr                   C B I
Weds Incr                   E G J
Thurs Incr                  A C K
Second Friday Full Backup   B C D E F L G H

Unique data stored:         A B C D E F G H I J K L

BACKUP DATA       LOGICAL   ESTIMATED REDUCTION   PHYSICAL
FRIDAY FULL       1 TB      2-4x                  250 GB
Monday Incr       100 GB    7-10x                 10 GB
Tuesday Incr      100 GB    7-10x                 10 GB
Wednesday Incr    100 GB    7-10x                 10 GB
Thursday Incr     100 GB    7-10x                 10 GB
2nd FRIDAY FULL   1 TB      50-60x                18 GB
TOTAL             2.4 TB    7.8x                  308 GB

Store more backups in a smaller footprint.
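The letter diagram on this slide can be replayed in a few lines of Python. This is a minimal sketch that treats each letter as one chunk of data and keeps only the first copy of each; the function name and the use of a plain set as the store are illustrative assumptions, not a description of the product's internals.

```python
# Backup streams as drawn on the slide; each letter stands for one chunk of data.
backups = [
    ("Friday Full",        "ABCDAEFG"),
    ("Mon Incr",           "ABH"),
    ("Tues Incr",          "CBI"),
    ("Weds Incr",          "EGJ"),
    ("Thurs Incr",         "ACK"),
    ("Second Friday Full", "BCDEFLGH"),
]

def dedupe(streams):
    """Return (per-backup report, store), keeping only first occurrences."""
    store = set()
    report = []
    for name, stream in streams:
        # dict.fromkeys drops repeats inside one stream while preserving order.
        new = [c for c in dict.fromkeys(stream) if c not in store]
        store.update(new)
        report.append((name, len(stream), len(new)))
    return report, store

report, store = dedupe(backups)
for name, logical, new in report:
    print(f"{name:<20} logical={logical} newly stored={new}")

total_logical = sum(logical for _, logical, _ in report)
print("stored:", "".join(sorted(store)),
      f"({total_logical} logical chunks -> {len(store)} stored)")
```

The second Friday full carries eight chunks but adds only one new chunk (L) to the store, which is the effect behind the much higher reduction estimate for repeat fulls.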

Page 11: Data De-Duplication VMUG Dallas

Longer Retention: Store More with Less

          BACKUP DATA   LOGICAL   ESTIMATED REDUCTION   PHYSICAL
Week 1    April 7       2.4 TB    8x                    308 GB
Week 2    April 14      3.8 TB    10x                   366 GB
Week 3    April 21      5.2 TB    12x                   424 GB
Month 1   April 28      6.6 TB    14x                   482 GB
Month 2   May 31        12.2 TB   17x                   714 GB
Month 3   June 30       17.8 TB   19x                   946 GB
Month 4   July 31       23.4 TB   20x                   1178 GB
TOTAL                   23.4 TB   20x                   1178 GB

Over 1 year of retention in 3U of Data Domain protection storage.
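The cumulative figures in this table follow from the weekly figures on the previous slide, and the arithmetic can be checked with a short Python sketch. It assumes every week after the first adds one full (1 TB logical, 18 GB physical) and four incrementals (100 GB logical, 10 GB physical each). The week counts per row (4, 8, 12, 16) are inferred because they reproduce the slide's numbers; the slide does not state them.

```python
# Figures taken from the slides; treating every later week as identical is an assumption.
FIRST_WEEK_LOGICAL_GB = 2400
FIRST_WEEK_PHYSICAL_GB = 308
WEEKLY_LOGICAL_GB = 1000 + 4 * 100   # one full plus four incrementals
WEEKLY_PHYSICAL_GB = 18 + 4 * 10     # physical space those backups consume

def cumulative(weeks: int) -> tuple[int, int, float]:
    """Cumulative logical GB, physical GB, and reduction ratio after `weeks` weeks."""
    logical = FIRST_WEEK_LOGICAL_GB + (weeks - 1) * WEEKLY_LOGICAL_GB
    physical = FIRST_WEEK_PHYSICAL_GB + (weeks - 1) * WEEKLY_PHYSICAL_GB
    return logical, physical, logical / physical

# Week counts inferred so that the results line up with the slide's rows.
rows = [("Week 1", 1), ("Week 2", 2), ("Week 3", 3), ("Month 1", 4),
        ("Month 2", 8), ("Month 3", 12), ("Month 4", 16)]
for label, weeks in rows:
    logical, physical, ratio = cumulative(weeks)
    print(f"{label:<8} {logical / 1000:5.1f} TB  {ratio:4.1f}x  {physical} GB")
```

The ratio keeps climbing because each added week contributes 1.4 TB of logical data but only 58 GB of physical data.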

Page 12: Data De-Duplication VMUG Dallas

Inline Deduplication for Optimized Time-to-DR

Post-process DR restore point is usually obsolete

• Data Domain Inline Dedupe/Replication: Replicate During Backup -> DR-Ready
• Post-Process Dedupe: Backup to Cache -> Dedupe & Replicate -> DR Ready
• VTL/Tape/Truck: Backup to VTL -> Copy to Tape -> Truck to DR Site -> DR-Ready

Backup Window
Additional 2-3x backup time to get to DR Ready

Page 13: Data De-Duplication VMUG Dallas

In Line vs Post Process

Both systems: 5 TB addressable

5 TB Initial Full Backup @ 2:1
• Deduplicated inline @ 60 MB/s – 2.5 TB written. 2.5 TB remaining.
• Deduplicated post process @ 30 MB/s – 2.5 TB cached to disk while 2.5 TB deduped to 1.25 TB. 1.25 TB remaining.

500 GB Incremental Backup @ 7:1
• Deduplicated inline @ 60 MB/s – 71 GB written daily. 426 GB total (6 days of incrementals). 2.926 TB total written to the system. 2.074 TB remaining.
• Deduplicated post process @ 30 MB/s – 250 GB cached to disk while 250 GB deduplicated to 36 GB. Remainder deduped after backup. 2.926 TB total written. 2.074 TB remaining.

5 TB Subsequent Full Backup @ 50:1
• Deduplicated inline @ 60 MB/s – 100 GB written. 3.026 TB total written to the system. 1.974 TB remaining.
• Deduplicated post process @ 30 MB/s – 2.5 TB cached to disk while 2.5 TB deduped to 50 GB – OUT OF SPACE. 2.5 TB needed, 2.0 TB available.

After 1 week retention a 5 TB post processing system is out of space for caching. All backups must slow to accommodate incoming data without caching.
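The capacity arithmetic on this slide can be replayed with a small Python sketch. It follows the slide's scenario: a 5 TB system, one initial full, six daily incrementals, and one subsequent full. The post-process model (half of each backup is held in an undeduplicated cache while the other half is deduplicated) is an assumption read off the slide's numbers, and the function and constant names are hypothetical.

```python
CAPACITY_GB = 5000  # 5 TB addressable, as on the slide

# (label, logical GB, dedup ratio, repetitions) for one week, per the slide.
WEEK = [
    ("initial full", 5000, 2, 1),
    ("daily incremental", 500, 7, 6),
    ("subsequent full", 5000, 50, 1),
]

def simulate(post_process: bool):
    """Return ([(label, free GB before step, fits)], GB used); stop at the first step that does not fit."""
    used = 0
    steps = []
    for label, logical, ratio, count in WEEK:
        stored = logical // ratio  # the slide rounds 500/7 down to 71 GB
        if post_process:
            # Assumed model: half the backup is cached raw, half is deduplicated.
            peak = logical // 2 + (logical // 2) // ratio
        else:
            peak = stored
        free = CAPACITY_GB - used
        fits = peak <= free
        steps.append((label, free, fits))
        if not fits:
            break
        used += stored * count
    return steps, used

inline_steps, inline_used = simulate(post_process=False)
post_steps, post_used = simulate(post_process=True)
print("inline:      ", inline_steps, f"used={inline_used} GB")
print("post process:", post_steps, f"used={post_used} GB")
```

Under these assumptions the inline system finishes the week with 3,026 GB used and 1,974 GB free. The post-process system has 2,074 GB free when the subsequent full needs about 2.5 TB of cache, so that step does not fit.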

Page 14: Data De-Duplication VMUG Dallas

Recovery: Data Invulnerability Architecture

Other
• RAID-6
• NVRAM
• Snapshots

Data Verification
• CheckSum
• Dedupe, write to disk
• Verify

Self-healing file system
• Cleaning
• Expired data
• Defrag
• Verify

Trust but verify – hope is not a strategy.
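The checksum, write, verify sequence listed on this slide can be sketched generically in Python. This is only an illustration of the write-then-read-back idea using the standard library; the function name, the file name, and the choice of SHA-256 are assumptions, and nothing here describes how Data Domain implements it.

```python
import hashlib
import os
import tempfile

def write_verified(path: str, data: bytes) -> str:
    """Write data, read it back, and confirm the checksum matches."""
    expected = hashlib.sha256(data).hexdigest()   # checksum taken before writing
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())                      # ask the OS to push the write to storage
    with open(path, "rb") as f:
        actual = hashlib.sha256(f.read()).hexdigest()
    if actual != expected:
        raise IOError(f"verification failed for {path}")
    return expected

with tempfile.TemporaryDirectory() as tmp:
    digest = write_verified(os.path.join(tmp, "segment.bin"), b"backup segment")
    print("verified", digest[:12])
```

On a general-purpose operating system the read-back may be served from the page cache rather than from the disk, so this sketch shows the shape of the check rather than a guarantee about the media.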

Page 15: Data De-Duplication VMUG Dallas

Replication: WAN Efficient

[Diagram labels: Backup Data and Archive Data at each remote site; home; DIR A; WAN; 1-5% on each WAN link]

Source: Remote Sites

Destination: Data Center Hub

95-99% Bandwidth Reduction

True DR; lowers WAN costs; improves SLAs.
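One way to picture where a bandwidth reduction of this kind comes from is a replication loop that sends only the data the destination does not already hold. The Python sketch below is an assumption-laden illustration: the segment size, the data, and the 3% result are hypothetical (chosen to fall inside the slide's 1-5% range), and the cost of exchanging fingerprints is ignored.

```python
import hashlib

def fp(segment: bytes) -> str:
    """Segment fingerprint; SHA-256 is an assumption chosen for this sketch."""
    return hashlib.sha256(segment).hexdigest()

def replicate(source_segments: list[bytes], destination: dict[str, bytes]) -> int:
    """Send only segments the destination lacks; return payload bytes sent."""
    sent = 0
    for seg in source_segments:
        key = fp(seg)
        if key not in destination:   # stands in for a fingerprint exchange over the WAN
            destination[key] = seg
            sent += len(seg)
    return sent

# Hypothetical data: 100 segments of 1 KB, 97 of which the hub already holds.
hub: dict[str, bytes] = {}
old = [bytes([i]) * 1024 for i in range(97)]
replicate(old, hub)                  # earlier backups already replicated

tonight = old + [bytes([200 + i]) * 1024 for i in range(3)]
sent = replicate(tonight, hub)
logical = sum(len(s) for s in tonight)
print(f"sent {sent} of {logical} bytes ({100 * sent / logical:.0f}%)")
```

Only the three new segments cross the link, so the payload is a small fraction of the logical backup size.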

Page 16: Data De-Duplication VMUG Dallas

So … How does this work with VMware?

Page 17: Data De-Duplication VMUG Dallas

Backing Up VMware to Data Domain

Page 18: Data De-Duplication VMUG Dallas

“…is he still talking..?” - Summary Concepts

Data Domain enables NAS (CIFS, NFS), NDMP & VTL backup targets for all virtualized applications
• Drops into existing enterprise backup architectures
• Works with Virtualized and Non-Virtualized environments
• In 80/20 data centers, centralized capacity optimization provides single instance store across all applications and systems, virtual or actual

Back-up VMs to DDR with agent, or service console level
• Choose to place an agent on critical VMs for file level restore
• Choose to place an agent on the service console as well
• Back-up all to same DDRs and watch compression happen

Consolidated back-ups sent from proxy to DDR
• If you prefer an agent free virtual machine…

Global Rule: all data is compared to all other data in the DDR

Replicate all or some to anywhere, whenever, and back
• DR, test, development, virtual application migration

Page 19: Data De-Duplication VMUG Dallas

Data Domain: Dedupe Simplified

[Diagram labels: Clients; Server; Primary storage; Backup/media server; Onsite Retention Storage; Offsite Disaster Recovery Storage; WAN; Backup; Retention/Restore; Replication; DR; Archive to tape as required; Archive; Archive Application Server; ‘Drag&Drop’ Archiving; Data Domain]

• High-speed, inline deduplication storage; disk target for nearline applications
• Any leading backup software, archive apps, or custom nearline use
• All data types: structured and file
• Any fabric: NFS / CIFS / NDMP via Ethernet, or VTL via Fibre Channel
• Disk storage: Internal, or gateway to SAN array
• One dedupe infrastructure: remote office, datacenter with inline replication

Page 20: Data De-Duplication VMUG Dallas

Summary: Key Attributes

Easily Integrates with Existing Infrastructure
• No rip/replace

Retention: Deduplication for Nearline Applications
• Store more backups and archived data in smaller footprint

Recovery: Data Invulnerability Architecture
• Trust but verify – hope is not a strategy

Replication: WAN efficient
• True DR
• Lowers cost of WAN
• Improves SLAs

Page 21: Data De-Duplication VMUG Dallas

Summary: Simplifying Deduplication Storage

Lower TCO
• Much lower cost for disk-based retention
• Lower operational costs, smaller footprint
• Neutral to price of tape automation
• Low bandwidth for replication, DR

Faster
• Handles variable streams smoothly, unlike tape
• Better SLAs: Random access to restores and archives

Secure
• Designed as store of last resort
• No tapes on a truck

Simple
• Set it and forget it
• Any backup or archive software, any storage fabric, all data types

Page 22: Data De-Duplication VMUG Dallas

Thank You