Key considerations for deduplication in the enterprise
DESCRIPTION
Quantum's Mark Galpin discusses deduplication solutions in the Enterprise space.
TRANSCRIPT
Integrating Deduplication into the Enterprise
Is your Deduplication ready for the Enterprise?
Mark Galpin
Product Marketing Manager – Disk systems
Agenda – The 6 Key Requirements
Deduplication level-set • What’s the method for data reduction?
Why dedupe effectiveness is important • Using the correct type?
Questions to ask about performance • More than just ingest!
The impact of dedupe on cost • Have you considered traditional?
Advantages for Disaster Recovery • Have you looked at the entire data footprint?
Management and Reporting • Proactive or reactive?
What Dedupe Is, and Is Not
Deduplication is:
– A data reduction system for finding redundant data at a sub-file level
– It recognizes patterns that repeat in different locations
– It replaces repeated patterns with a reference to existing elements
– It creates a pool of data elements used multiple times
Deduplication is not:
– File level single-instance storage
– Snapshots
– Compression, although it uses compression
It can be implemented in:
– Dedicated appliances
– Hardware/software for integration
– Existing software applications
Dedupe Impact on Capacity Needs
[Chart: Total GB Stored (0 to 1000) against Number of Backup Jobs (1 to 10). Series: capacity used by DXi-Series appliance; capacity used by conventional disk backup product. The gap between the two is labeled Savings. Other figure labels: Dataset 1, Dataset 2, Dataset 3.]
The more data you retain, the greater the savings!
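The shape of the capacity chart can be sketched numerically. The figures used here (100 GB per full backup, 2% of the data changing between backups, no compression) are illustrative assumptions, not values taken from the slide or from any product.

```python
def capacity_after_backups(full_gb, change_rate, jobs):
    """Return (conventional_gb, dedupe_gb) lists, one entry per backup job."""
    conventional, dedupe = [], []
    for n in range(1, jobs + 1):
        # Conventional disk keeps every full backup in its entirety.
        conventional.append(full_gb * n)
        # A dedupe store keeps the first full once, then only the changed data.
        dedupe.append(full_gb + full_gb * change_rate * (n - 1))
    return conventional, dedupe


# Hypothetical inputs: 100 GB fulls, 2% change between jobs, 10 jobs retained.
conv, dd = capacity_after_backups(full_gb=100, change_rate=0.02, jobs=10)
for n, (c, d) in enumerate(zip(conv, dd), start=1):
    print(f"job {n:2d}: conventional {c:5.0f} GB, dedupe {d:5.0f} GB, saved {c - d:5.0f} GB")
```

Under these assumptions the saving grows with every retained job, which is the trend the chart illustrates.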
Deduplication Effectiveness Matters
Byte level or block level dedupe? Both can provide similar levels of reduction, depending on implementation.
Byte-level Dedupe
– Backup 1: saved for comparison
– Backup 2: compared, differences kept
– Requires dedicating space
– Usually deployed as post process
Block-level Dedupe
– Backup 1: blocks, signatures, index created
– Backup 2: blocks compared, new ones stored
– No additional space required
– Suitable for in-line or post processing
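The block-level flow (blocks, signatures, index created; then blocks compared, new ones stored) can be sketched as a toy store. This is a minimal illustration under stated assumptions: `BlockStore`, the 4-byte block size, and the use of SHA-256 as the signature are all hypothetical choices for the example, not a description of any vendor's implementation.

```python
import hashlib


class BlockStore:
    """Toy block-level dedupe store: fixed-size blocks, SHA-256 signatures."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}  # signature -> block bytes (the shared block pool)

    def ingest(self, data):
        """Store a backup and return its recipe (ordered list of signatures)."""
        recipe = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            sig = hashlib.sha256(block).hexdigest()
            # Only blocks whose signature is not yet in the index are stored.
            self.blocks.setdefault(sig, block)
            recipe.append(sig)
        return recipe

    def restore(self, recipe):
        """Rebuild a backup from its recipe."""
        return b"".join(self.blocks[sig] for sig in recipe)


store = BlockStore()
backup1 = store.ingest(b"AAAABBBBCCCCDDDD")
stored_after_1 = len(store.blocks)
backup2 = store.ingest(b"AAAABBBBXXXXDDDD")  # one block changed
stored_after_2 = len(store.blocks)
print(stored_after_1, stored_after_2)  # 4 5
```

The second backup adds a single block to the pool, and both backups remain fully restorable from their recipes.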
Blocks Can Be Fixed- or Variable-Length
What’s the difference?
Fixed-length blocks (common in backup software approaches):
A B C D
E F G H
Change one, change all: 8 unique blocks
Variable-length blocks (common in dedicated appliances):
A B C D
E B C D
Change one, change ONE: 5 unique blocks
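The "change one, change all" versus "change one, change ONE" effect can be reproduced with a small sketch. The variable-length splitter here uses a deliberately simple content-defined rule (end a chunk after each space byte) as a stand-in; production systems typically derive boundaries from a rolling hash over the data, and nothing in the slides specifies the rule Quantum uses. The function names and sample data are hypothetical.

```python
def fixed_chunks(data, size):
    """Split at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data, delimiter=b" "):
    """Toy content-defined split: end a chunk after each delimiter byte."""
    chunks, start = [], 0
    for i in range(len(data)):
        if data[i:i + 1] == delimiter:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes with no delimiter
    return chunks


def unique_blocks(*backups):
    """Count distinct chunks across all backups."""
    return len({chunk for backup in backups for chunk in backup})


backup1 = b"alpha bravo delta gamma "
backup2 = b"X" + backup1  # one byte inserted at the front

fixed_total = unique_blocks(fixed_chunks(backup1, 6), fixed_chunks(backup2, 6))
variable_total = unique_blocks(variable_chunks(backup1), variable_chunks(backup2))
print(fixed_total, variable_total)  # 9 5
```

With fixed offsets the inserted byte shifts every later block, so nothing from the first backup is reused. With content-defined boundaries only the chunk containing the change is new.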
Fixed vs Variable Blocks for User Files
Fixed block: dedupe rate slows with each successive backup
Variable block: significantly better results, with a continuing strong upward trend
BU#1 126% difference in reduction
BU#2 110% difference in reduction
BU#3 119% difference in reduction
BU#4 135% difference in reduction
BU#5 137% difference in reduction
BU#6 153% difference in reduction
Fixed vs Variable Blocks for Exchange
Fixed block: steady rates around 2%, trending negative
Variable block: steady growth in dedupe rate as backups scale
BU#1 17% difference in reduction
BU#2 46% difference in reduction
BU#3 107% difference in reduction
BU#4 88% difference in reduction
BU#5 145% difference in reduction
BU#6 191% difference in reduction
What Are Implications of the Difference?
Difference is cumulative
Effects increase over time
Replication timing and bandwidth impacted
Data protected by blockpool will differ
Space reclamation will differ
Backup may take longer
[Chart: Backup Time, Variable/Fixed, labeled 37% Faster]
Caveat!! Always look at total deduplication effect over multiple jobs. Some systems only report the last event.
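The caveat about last-event reporting is easy to see with arithmetic. The sketch below compares the ratio for the most recent job with the ratio across all retained jobs; the job history is invented for illustration.

```python
def dedupe_ratios(jobs):
    """jobs: list of (protected_gb, stored_gb) tuples, one per backup job.

    Returns (last_job_ratio, cumulative_ratio).
    """
    last_protected, last_stored = jobs[-1]
    total_protected = sum(p for p, _ in jobs)
    total_stored = sum(s for _, s in jobs)
    return last_protected / last_stored, total_protected / total_stored


# Hypothetical history: the first full stores half its data, later jobs very little.
history = [(100, 50), (100, 5), (100, 5), (100, 5)]
last, cumulative = dedupe_ratios(history)
print(f"last job {last:.1f}:1, cumulative {cumulative:.1f}:1")
# last job 20.0:1, cumulative 6.2:1
```

A system that reports only the last job would show 20:1 here, while the capacity actually consumed reflects roughly 6:1.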
Questions to Ask about Performance
When you say “performance”, what do you mean?
– Ingest, dedupe, compression, replication, read, space reclaim?
– ALL may be important
– Ask your vendor to show how all the parts come together
What do we all cite as a single number?
How much can I get done in a day?
Terms
Ingest: data comes in
Deduplication: find duplicate data and store new
Compression: dedupe almost always compresses
– In some systems, it is a separate process with performance implications
Replication: send new blocks over network
Read: reading data to create tape or restore files
Reclaim space: everybody has to do it
For any Enterprise dedupe system, ask the vendor to show you how ALL of these will stack up.
What Gets Done in a Day
[Chart: three timelines over 24 hrs, each divided into stages labeled I; D,C; Rep; Read; Rec (ingest; dedupe, compression; replication; read; reclaim). Flows shown: Typical Inline Style Flow; Typical Full Post Process Flow; Typical Full Post Process Flow, Dedupe Half as Effective]
– Even if ingest and dedupe are the same, less can be protected
– Effects are compounded for multi-purpose hardware
– You can’t compensate just by adding more disk
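A rough way to reason about "what gets done in a day" is to add up the time each stage needs per TB and see how much data fits in 24 hours. The model below is a simplified sketch: it assumes the stages run strictly one after another, that inline dedupe adds no separate pass, and that space reclamation takes a fixed number of hours. The function name and every rate are hypothetical.

```python
def tb_protected_per_day(ingest, read, replicate, dedupe_ratio,
                         post_process_dedupe=None, reclaim_hours=2.0):
    """Largest daily backup (TB) whose stages all fit in 24 hours.

    Rates are in TB/hour. Stages are assumed to run strictly in sequence,
    which is a simplification of how real systems schedule work.
    """
    hours_per_tb = 1 / ingest + 1 / read
    if post_process_dedupe is not None:
        hours_per_tb += 1 / post_process_dedupe  # separate dedupe pass
    # Only unique data (1 / dedupe_ratio of the total) is replicated.
    hours_per_tb += (1 / dedupe_ratio) / replicate
    return (24 - reclaim_hours) / hours_per_tb


# Hypothetical rates: 4 TB/h ingest and read, 0.5 TB/h WAN replication.
inline_tb = tb_protected_per_day(4, 4, 0.5, dedupe_ratio=10)
post_tb = tb_protected_per_day(4, 4, 0.5, dedupe_ratio=10, post_process_dedupe=4)
post_half_tb = tb_protected_per_day(4, 4, 0.5, dedupe_ratio=5, post_process_dedupe=4)
print(f"inline {inline_tb:.1f} TB, post-process {post_tb:.1f} TB, "
      f"post-process with half the dedupe {post_half_tb:.1f} TB")
```

Under these assumptions, halving dedupe effectiveness doubles the replication time per TB, so less data fits in the day even though the ingest rate is unchanged.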
What about Cost?
How do the costs really compare for traditional backup and deduplication systems?
Have you considered a mixed solution?
[Diagram labels: Media Server, Tape Library, DXi 6700]
Planning Tools & Services Can Help
Amount of Primary Data to Protect
Backup Application
Backup Method (Fulls, Full-Incrementals, etc.)
Backup Window Available
Annual Growth Rate
Change Rate between Backups
Restore Pattern (SLAs, restore window, age of files restored, etc.)
DR method (off site tape, backup replication, etc.)
[Worksheet: Site 1 to Site 4, each with six "Application:" fields]
[Chart: TB Stored and Deduplication Ratio across Backup 1 to Backup 16. Series: Cumulative Protected TB, Calculated Cumulative Unique TB, Calculated De-dup, Real De-dup]
Dedupe Impact on DR Preparedness
Makes backup sets suitable for DR
Copies backup datasets by transmitting only net-new blocks
Allows standard WANs to act as DR network
Data deduplication reduces replication WAN bandwidth demand by a factor of 50 or more by eliminating the need to send redundant blocks
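Replication that transmits only net-new blocks can be sketched as a fingerprint check against the target's block pool. This is a minimal illustration, assuming SHA-256 fingerprints and an in-memory dictionary standing in for the remote appliance; the function names are hypothetical and no real replication protocol is implied.

```python
import hashlib


def fingerprint(block):
    """Signature used to decide whether the target already holds a block."""
    return hashlib.sha256(block).hexdigest()


def replicate(source_blocks, target_pool):
    """Send only blocks the target does not already hold.

    source_blocks: list of bytes; target_pool: dict of fingerprint -> bytes.
    Returns the number of bytes sent over the (simulated) WAN.
    """
    sent = 0
    for block in source_blocks:
        fp = fingerprint(block)
        if fp not in target_pool:
            target_pool[fp] = block
            sent += len(block)
    return sent


target = {}
night1 = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
night2 = [b"AAAA", b"BBBB", b"XXXX", b"DDDD"]  # one block changed
sent1 = replicate(night1, target)
sent2 = replicate(night2, target)
print(sent1, sent2)  # 16 4
```

The first night sends the whole backup; the second sends only the changed block, which is why WAN demand falls as redundancy between backups rises.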
Impact on Using Deduplication for DR Protection
How is DR protection being handled now?
– Replicating primary data
– Hot sites
– Removable media
Data deduplication adds an important new option: replicating backup data over IP networks
– Can supply first tier DR protection, replacing transfer of media
Simplifies, reduces admin time, reduces ongoing costs
– Comparison needs to look at several elements
Example of deduplication disk and replication
– Use disk/dedupe for backup/restore
– Replication between dedupe appliances for tier-1 DR
– Tape for long term retention
Centralized Management
Quantum Vision 4.0: Central console for global management of disk and tape
Single, at-a-glance view of global backup resources
Single point of management for disk, tape, replication
Proactive monitoring, alerting, and reporting in graphical form
Consolidates analysis with custom reports
– Capacity utilization
– Media pool analysis
[Diagram labels: Division DR Facility, Primary Data Center, Remote]
© 2010 Quantum Corporation
Quantum DXi Advanced Reporting (included)
Provides Detailed DXi Views & Trend Analysis
Monitor for Steady State
– Quickly identify anomalies
Measure dedupe & compression
Measure input rate & TB ingested
– Drill down for individual BU rate-of-change & capacity consumption
– Confirm sizing estimates
Observe capacity consumption trends & determine upgrade timing
– Deduplication pool TB
… and much more to optimize performance and usage
– CPU load, disk activity, replication, network latency, space reclamation, etc.
– Retains up to 6 years of history
Quantum is Uniquely Positioned to Help
Professional Services & Global Support
Shared File System VTL, NAS
Replication
Tiered Storage Policies
Data deduplication
Disk Tape Key Management
Proactive Diagnostics
Encryption
Centralized Management
StorNext Software Disk Backup & Dedupe
Tape Automation Management Tools
The DXi Appliance: Protection Across the Organization with Deduplication
[Chart axis: Customer Primary Data to Protect (TB), 5 to 100+]
DXi4500: Optimized for SMB & ROBO
DXi6000 Models: Optimized for midrange IP or FC environments
DXi8500: For demanding data centers - NAS, OST & VTL
DXi6500: Deduplication Product of the Year for 2010
New Nov 2010
Deduplication For the Enterprise:
DXi8500 Enterprise: High performance backup for the Enterprise
Integrates disk with tape and replication for consolidated backup, DR, and long term retention
Industry-leading performance
– Up to 6.4 TB/hour ingest, up to 5.4 TB/hour read
– Extensible platform leveraging latest technology
– 8 Gb FC, 10 GbE, 6-core Nehalem processors
Enterprise scale
– 20 to 200 TB usable capacity
– Scales easily on site
Anchors a multi-site, multi-tier strategy
– VTL, NAS, OST interface
– Replication compatible with all DXi systems
– Integrated tape creation under leading applications, including OST
Superior total system
– Simplified licensing for easy deployment, growth
– All software licenses included in base price
– Advanced management for more control, less overhead
Questions?