Andrew Miller, Rebecca Fitzhugh
MGT3342BES
#VMworld #MGT3342BES
Architecting Data Protection with Rubrik
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
Rebecca Fitzhugh
• Tweet: @rebeccafitzhugh
• Blogger: technicloud.com
• Co-Host: vbrownbag.com
• I have a job!: Rubrik.com
• Author: vSphere Virtual Machine Management; Learning VMware vSphere
• VMware: VCDX #243
Andrew Miller
• Tweet: @andriven
• Blogger: thinkmeta.net
• TMM: Rubrik.com
• Background: 7 years customer, 8 years partner.
• Certs: Lots of Random Ones
• VMware: vExpert (6x)
Agenda? Nah…
Share Data Protection Architecture Knowledge
(more than half)
Show Where Rubrik Fits Technically + Demo
(less than half)
Fair?
(Q&A Too)
Why bother? One big reason…
Business Expectations of Disaster Recovery / Data Protection
!=
IT Capabilities for Disaster Recovery / Data Protection
What Are You Really Protecting Yourself Against?
• Lost or postponed sales and income
• Regulatory fines
• Delay of new business plans
• Loss of contractual bonuses
• Customer dissatisfaction
• Timing and duration of disruption
• Increased expenses such as overtime labor and outsourcing
• Employee Burnout
What is a Disaster?
Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.
• But what does that look like IN OUR ENVIRONMENT?
• What disaster and recovery scenarios should we plan for?
Sabotage!
Natural Disaster
Power Loss
What is the most common scenario for disaster?
What is a Disaster?
Disaster: An event that affects a service or system such that significant effort is required to restore the original performance level.
• But what does that look like IN OUR ENVIRONMENT?
• What disaster and recovery scenarios should we plan for?
• Where do we begin?
• How do we do it?
What is a Business Impact Analysis (BIA)?
• A process to understand:
– What is the monetary impact of a disaster or failure?
– What are the most time-critical and information-critical business processes?
– How does the business REALLY rely upon IT Service and Application availability?
– What availability or recoverability capabilities are justifiable based on these requirements, potential impact, and costs?
• Composed of two components
– Technical Discovery – Data Gathering
– Human Conversation – Talk to People!
Example Output – Priority Tiers
Priority Tier: Description
• Priority 1 (High Availability / Immediate Recovery): Services whose unavailability for more than a brief period can have a severe impact on customers or time-critical business operations.
• Priority 2 (1-2 day recovery): Services whose unavailability significantly impacts customers or business operations.
• Priority 3 (3-5 day recovery): Services which can tolerate up to five days of disruption in a disaster.
• Priority 4 (6-10 day recovery): Services which can tolerate up to ten days of disruption in a disaster. Priority 3 and 4 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first.
• Priority 5 (“Best effort” recovery): Non-critical services which can tolerate two weeks or more of disruption in a disaster. These systems will be restored on a best-effort basis, after other more critical systems have been restored and ongoing operations have resumed. Priority 5 systems may be restored in less time, depending on the situation. However, higher priority functions will be restored first. In some cases, systems deemed to not be required for continued operations may not be restored.
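A BIA output like the tiers above can be turned into a restore order. The following is a minimal sketch, not part of the original deck: the tier numbers come from the slide, while the cut-off for Priority 1 (half a day), the handling of the gaps between tiers, and the `Service` type with its example names are all assumptions made for illustration.

```python
from dataclasses import dataclass

# (tier, longest tolerable disruption in days) for tiers 1 to 4.
# Tiers 2 to 4 use the upper bounds from the slide; the 0.5-day limit
# for Priority 1 is an assumption ("a brief period" is not quantified).
TIER_LIMITS_DAYS = [(1, 0.5), (2, 2), (3, 5), (4, 10)]


def priority_tier(tolerable_days: float) -> int:
    """Map the disruption a service can tolerate to a priority tier."""
    for tier, limit in TIER_LIMITS_DAYS:
        if tolerable_days <= limit:
            return tier
    return 5  # everything else is treated as "best effort" (assumption)


@dataclass
class Service:
    """Hypothetical record produced by a BIA interview."""
    name: str
    tolerable_days: float


def restore_order(services):
    """Higher priority functions are restored first."""
    return sorted(
        services,
        key=lambda s: (priority_tier(s.tolerable_days), s.tolerable_days),
    )


if __name__ == "__main__":
    inventory = [
        Service("intranet wiki", 30),
        Service("order entry", 0.1),
        Service("payroll", 4),
    ]
    for svc in restore_order(inventory):
        print(priority_tier(svc.tolerable_days), svc.name)
```

The thresholds belong in the BIA conversation with the business, not in code; the sketch only shows that once they are agreed, the ordering is mechanical.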
What is an SLA?
• A contract between an external service provider and its customers or between an IT department and the internal business units it serves.
What is an SLA?
• Two 9’s – 99% = 3.65 days of downtime per year (easy to achieve, less expensive)
• Three 9’s – 99.9% = 8.76 hours of downtime per year
• Four 9’s – 99.99% = 52.6 minutes of downtime per year
• Five 9’s – 99.999% = 5.26 minutes of downtime per year (difficult to achieve, expensive!)
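The downtime figures above follow directly from the availability percentage. As a quick check, here is a small sketch (the function names are illustrative and a 365-day year is assumed) that reproduces the arithmetic:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime allowed by an availability percentage."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


def humanize(minutes: float) -> str:
    """Express a duration in days, hours, or minutes, whichever fits."""
    if minutes >= 24 * 60:
        return f"{minutes / (24 * 60):.2f} days"
    if minutes >= 60:
        return f"{minutes / 60:.2f} hours"
    return f"{minutes:.2f} minutes"


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99, 99.999):
        allowed = downtime_minutes_per_year(pct)
        print(f"{pct}% -> {humanize(allowed)} of downtime per year")
```

Each extra nine divides the allowed downtime by ten, which is why the cost of achieving it climbs so quickly.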
Disaster Recovery: Key Measures
• Recovery Point Objective (RPO): Amount of data lost from failure, measured as the amount of time from a disaster event
• Recovery Time Objective (RTO): Targeted amount of time to restart a business service after a disaster event
[Figure: timeline from 5 a.m. to 7 p.m. in one-hour steps, with “DECLARE DISASTER” marked at 10 a.m.]
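To make the two measures concrete, the sketch below computes the achieved data loss and downtime for one incident and compares them with the objectives. Only the 10 a.m. declaration comes from the slide; the 6 a.m. recovery point, the 3 p.m. restore time, the objective values, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def recovery_measures(last_recovery_point, disaster_declared, service_restored):
    """Return (data loss window, downtime) for a single incident."""
    data_loss = disaster_declared - last_recovery_point  # compare with RPO
    downtime = service_restored - disaster_declared      # compare with RTO
    return data_loss, downtime


def meets_objectives(data_loss, downtime, rpo, rto):
    """True when both the recovery point and recovery time targets hold."""
    return data_loss <= rpo and downtime <= rto


if __name__ == "__main__":
    day = datetime(2017, 8, 28)            # arbitrary example date
    last_backup = day.replace(hour=6)      # hypothetical last recovery point
    declared = day.replace(hour=10)        # disaster declared at 10 a.m.
    restored = day.replace(hour=15)        # hypothetical service restart

    loss, down = recovery_measures(last_backup, declared, restored)
    print("data loss window:", loss)
    print("downtime:", down)
    print("meets RPO 4h / RTO 8h:",
          meets_objectives(loss, down, timedelta(hours=4), timedelta(hours=8)))
```

Tightening either objective means more frequent recovery points or faster restores, which is the cost trade-off the deck turns to next.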
Disaster Recovery: Key Measures
[Figure: Cost versus Recovery Point and Recovery Time; each axis is scaled in Weeks, Days, Hours, Minutes, Seconds, with “Real Time” marked]
BC vs DR vs OR – Say What?
• Business Continuity
– All goes on as normal despite an incident
– Could lose a site and have no impact on business operations (active/active sites)
• Disaster Recovery
– To cope with & recover from an IT crisis that moves work to an alternative system in a non-routine way.
– A real “disaster” is large in scope and impact
– DR typically implies failure of the primary data center and recovery to an alternate site
• Operational Recovery
– Addresses more “routine” types of failures (server, network, storage, etc.)
– Events are smaller in scope and impact than a full disaster
– Typically implies recovering to alternate equipment within the primary data center
• Each should have its own clearly defined objectives – at minimum know the difference.
Where Rubrik Helps
Let’s keep it architecture focused.
Complexity is the Enemy
Whatever you do. Whatever you buy.
Simplify your Architecture & Expect More
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
2. Speed of Data Recovery
Data Management: 1990s to Present
Data Management: 2000s to Present
[Diagram: Backup & Replication Software, Backup Storage, Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Dedupe, Metadata]
In Two Words
Sad Panda
Meet Rubrik Cloud Data Management
[Diagram: Backup Software, Backup Servers, Backup Proxies, Replication, Catalog Database, Tape, Off-site Archive, Backup Storage, Dedupe, Metadata; Private, Public]
Software fabric for orchestrating apps and data across clouds. No forklift upgrades.
How It Works
• Quick Start: Rack and go. Auto-discovery.
• Rapid Ingest: Flash-optimized, parallel ingest accelerates snapshots and eliminates stun. Content-aware dedupe. One global namespace.
• Automate: Intelligent SLA policy engine for effortless management.
• Instant Recovery: Live Mount VMs & SQL. Instant search and file restore.
• Secure: End-to-end encryption. Immutability to fight Ransomware.
• Cloud: “CloudOut” instantly accessible with global search. Launch apps with “CloudOn” for DR or test/dev. Run apps in cloud.
[Diagram: Primary Environment (VMware, AHV, Hyper-V, NAS), SLA Policy Engine, Log Management, Private, Public]
Your Data Center Today
[Diagram: Production Servers, SAN, Backup Proxy, Backup Server, Search Server, Disk-Based Backup, Tape Archive, Offsite Tape Vault]
Rubrik Simplifies Your Data Center
[Diagram: Production Servers, SAN, Scale Out Rubrik, Replication + Long-Term Retention + Search, Private]
Data Management in the Cloud
[Diagram: On-Premises Applications & Data, Storage, Azure Instance, Blob Storage, Backup, Replication, Archival, Analytics, Rubrik, Cloud-Native Applications & Data, EC2 Instance, Rubrik]
SLA
• Recovery Point Objective (RPO)
• Availability Duration (Retention)
• When to Archive (RTO)
• Replication Schedule (DR)
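The four settings above can be pictured as one declarative policy object that a scheduler evaluates. The sketch below is a generic illustration only: `SlaPolicy`, its field names, and the helper functions are hypothetical and do not represent Rubrik's API or data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class SlaPolicy:
    """Hypothetical policy holding the four settings from the slide."""
    name: str
    snapshot_every: timedelta                    # Recovery Point Objective
    retain_for: timedelta                        # Availability Duration
    archive_after: Optional[timedelta] = None    # When to Archive
    replicate_every: Optional[timedelta] = None  # Replication Schedule


def snapshot_due(policy: SlaPolicy, last_snapshot: Optional[datetime],
                 now: datetime) -> bool:
    """A new snapshot is due once the RPO interval has elapsed."""
    return last_snapshot is None or now - last_snapshot >= policy.snapshot_every


def placement(policy: SlaPolicy, taken_at: datetime, now: datetime) -> str:
    """Decide where a snapshot of a given age belongs."""
    age = now - taken_at
    if age > policy.retain_for:
        return "expired"
    if policy.archive_after is not None and age >= policy.archive_after:
        return "archive"
    return "local"


if __name__ == "__main__":
    gold = SlaPolicy("gold", timedelta(hours=4), timedelta(days=30),
                     archive_after=timedelta(days=7))
    now = datetime(2017, 8, 28, 12, 0)
    print(snapshot_due(gold, now - timedelta(hours=5), now))
    print(placement(gold, now - timedelta(days=10), now))
```

The point of the declarative form is that an administrator states the outcome (how often, how long, where) and leaves job scheduling to the policy engine.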
Let’s Demo!
What does it look like?
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
Under the Hood
Layers: “The Interface”, “The Logic”, “The Core”
• Distributed Task Framework
• Callisto: Distributed Metadata Service
• Cluster Management
• Global Search
• Cerebro: Data Management
• Crystal: UI / API
• Infinity: Ecosystem Integration
• Thor: Cloud Connect
• Atlas: Cloud-Scale File System
• NFS
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
2. Speed of Data Recovery
a. Search + Live Mount
Let’s Demo!
What does it look like?
Rubrik Backup / Recovery + DR
[Diagram: Production Servers, SAN, Rubrik (Backup S/W + Dedupe Storage), Rubrik (Replication & DR), DR Servers, Replication + Long-Term Retention + Search, Private]
Key Evaluation Criteria
What we’ve seen that makes a difference…
1. Reliability of Data Recovery
a. Simplicity of Setup and Day 2 Operations – SLA Policies!
b. Immutability – is your data there when you need it?
2. Speed of Data Recovery
a. Search + Live Mount
b. API Usage / Automation to enhance restore capabilities
Oh… By the Way
Your App
Use an API-first platform to create powerful automation workflows that can be integrated with any service that supports outbound REST
Now OpenAPI
One More Demo!
Wait a minute…we’ve been doing them already.
What did you see?
• Easy Integration with vSphere
• Working with an SLA Policy
• Real-time Data Search
Don’t Backup. Go Forward.
Andrew Miller | [email protected] | @andriven
Rebecca Fitzhugh | [email protected] | @rebeccafitzhugh