sto2063bu architecting site recovery manager to or distribution · 2019-06-27 · gs khalsa, vmware...
TRANSCRIPT
GS Khalsa, VMware @gurusimran
STO2063BU
#VMworld #STO2063BU
Architecting Site Recovery Manager to Meet Your Recovery Goals
VMworld 2017 Content: Not for publication or distribution
Disclaimer
• This presentation may contain product features that are currently under development.
• This overview of new technology represents no commitment from VMware to deliver these features in any generally available product.
• Features are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
• Technical feasibility and market demand will affect final delivery.
• Pricing and packaging for any new technologies or features discussed or presented have not been determined.
#STO2063BU CONFIDENTIAL 2
Agenda
1 Protection Groups and Recovery Plans
2 Topologies
3 Impacts to RTO
4 Recommendations
5 Demos
#TAM4542U CONFIDENTIAL
Terminology
• RPO (Recovery Point Objective): the interval between the Last Viable Restore Point and the moment Disaster Strikes
• RTO (Recovery Time Objective): the interval between the moment Disaster Strikes and All Functionality Recovered
Protection Groups and Recovery Plans
What Is a Protection Group?
• Group of VMs that will be recovered together
– Application
– Department
– System type
– ?
• Different depending on replication type
• A VM can only belong to one Protection Group
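The one-group-per-VM rule can be checked mechanically before recovery plans are built. The sketch below is illustrative only: the group and VM names are hypothetical, and it works on a plain dictionary rather than on any SRM API.

```python
from collections import defaultdict


def find_multi_group_vms(protection_groups):
    """Return {vm: sorted group names} for every VM listed in more than one group."""
    membership = defaultdict(set)
    for group, vms in protection_groups.items():
        for vm in vms:
            membership[vm].add(group)
    return {vm: sorted(groups) for vm, groups in membership.items() if len(groups) > 1}


# Hypothetical inventory: db01 is (incorrectly) listed in two groups.
protection_groups = {
    "PG1-WebApp": ["web01", "web02", "db01"],
    "PG2-Email": ["mail01", "db01"],
    "PG3-SharePoint": ["sp01"],
}

conflicts = find_multi_group_vms(protection_groups)
for vm, groups in conflicts.items():
    print(f"{vm} appears in more than one Protection Group: {', '.join(groups)}")
```

An empty result means every VM belongs to at most one group.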
How Do Protection Groups Fit into Recovery Plans?
• Recovery Plan 1 – Web App
– Protection Group 1 – Web App
• Recovery Plan 2 – Email
– Protection Group 2 – Email
• Recovery Plan 3 – Whole Site
– Protection Group 1 – Web App
– Protection Group 2 – Email
– Protection Group 3 – SharePoint
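The relationship on this slide can be sketched as plain data: a recovery plan references one or more protection groups, and the same protection group can appear in several plans. The plan and group names come from the slide; the VM names and helper functions are hypothetical, and none of this is an SRM API.

```python
# VM names are hypothetical placeholders.
protection_groups = {
    "Protection Group 1 - Web App": ["web01", "app01"],
    "Protection Group 2 - Email": ["mail01"],
    "Protection Group 3 - SharePoint": ["sp01", "spdb01"],
}

recovery_plans = {
    "Recovery Plan 1 - Web App": ["Protection Group 1 - Web App"],
    "Recovery Plan 2 - Email": ["Protection Group 2 - Email"],
    "Recovery Plan 3 - Whole Site": [
        "Protection Group 1 - Web App",
        "Protection Group 2 - Email",
        "Protection Group 3 - SharePoint",
    ],
}


def vms_in_plan(plan_name, plans, groups):
    """List every VM a plan would recover, in protection-group order."""
    return [vm for group in plans[plan_name] for vm in groups[group]]


def plans_containing(group_name, plans):
    """List every plan that references the given protection group."""
    return [plan for plan, members in plans.items() if group_name in members]


print(vms_in_plan("Recovery Plan 3 - Whole Site", recovery_plans, protection_groups))
print(plans_containing("Protection Group 2 - Email", recovery_plans))
```

Running a single-application plan recovers only that application's VMs, while the whole-site plan reuses the same groups to recover everything.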
vSphere Replication Protection Groups
• Group VMs as desired into Protection Groups
• The storage the VMs are located on doesn’t matter
[Figure: Protection Group 1 – Web App, Protection Group 2 – Email, Protection Group 3 – SharePoint]
Array Based Protection Groups
[Figure: a Consistency Group and LUNs 1–5 (Datastores A, B, C, D, F) mapped to Protection Group 1 – Web App, Protection Group 2 – Email, and Protection Group 3 – SharePoint]
Policy Driven Protection
[Figure: Profile Driven Protection Group associated with a Storage Policy]
• New Style Protection Group leveraging storage profiles
• High level of automation compared to traditional protection groups
• Policy based approach reduces OpEx
• Simpler integration of VM provisioning, migration, and decommissioning
How Should You Organize Your Protection Groups?
• More Protection Groups
– Higher RTO
– Easier testing
– Only what is needed
– More granular & complex
• Fewer Protection Groups
– Lower RTO
– Less granular, less complex & less flexible
[Figure: a spectrum from fewer LUNs/PGs (less complexity, less flexibility, lower RTO) to more LUNs/PGs (more complexity, more flexibility, higher RTO). The right combination of complexity and flexibility varies by customer.]
Topologies
SRM Supports Multiple DR Topologies
• Active-Passive Failover
– Dedicated resources for recovery
• Active-Active Failover
– Run low-priority apps on recovery infrastructure
• Bi-directional Failover
– Production applications at both sites
– Each site acts as the recovery site for the other
• Multi-site
– Many-to-one failover
– Useful for Remote Office / Branch Office
Enhanced Topology Support
• Shared recovery site and shared protected site support
[Figure: Remote Offices A, B, and C and the Main Data Center, each shown with SRM and VC]
Enhanced Topology Support
[Figure: Remote Office A and Remote Office B paired with a Shared DR Site, shown with SRM and VC]
Enhanced Topology Support
[Figure: Site A, Site B, and Site C, each shown with SRM and VC]
SRM & Stretched Storage
[Figure: Stretched Storage spanning New York and New Jersey, each site with vSphere, SRM, and vCenter]
• Best of both
• Zero downtime with Orchestrated cross-VC vMotion
• Non-disruptive testing
SRM & vSAN Stretched Cluster
• Today
[Figure: vSAN Stretched Cluster with Witness Appliance (vESXi), SRM, and vCenter; vSphere Replication (any distance, 5 min RPO) to a second vSAN site with SRM and vCenter]
Enhanced Linked Mode
[Figure: Site A and Site B, each with VC, SRM, and PSC, in a single SSO domain]
Impacts to RTO
Decision Time
• How long does it take to decide to fail over?
[Figure: Recovery Time Objective timeline from Disaster Strikes to All Functionality Recovered, annotated “How long does it take to decide to initiate your DR plan?”]
IP Customization
• Workflow without IP customization
– Power On VM and wait for VMware Tools heartbeats
• Workflow with IP customization
– Power On VM with network disconnected
– Customize IP using VMware Tools
– Power Off VM
– Power On VM and wait for VMware Tools heartbeats
• Alternatives
– Stretched Layer-2
– Move VLAN/Subnet
Priorities and Dependencies
• Priority groups: Group 1 through Group 5
• VM recovery times
– Database 1: 5 mins
– Database 2: 5 mins
– AppServer 1: 3 mins
– AppServer 2: 3 mins
– WebServer1: 2 mins
– WebServer2: 2 mins
• Time per priority group
– Group 1: 10 mins
– Group 2: 10 mins
• Total Time: 20 mins
Priorities Only
• Priority groups: Group 1 through Group 5
• VM recovery times
– Database 1: 5 mins
– Database 2: 5 mins
– AppServer 1: 3 mins
– AppServer 2: 3 mins
– WebServer1: 2 mins
– WebServer2: 2 mins
• Time per priority group
– Group 1: 5 mins
– Group 2: 3 mins
– Group 3: 2 mins
• Total Time: 10 mins
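The totals on the two preceding slides (20 mins with priorities and dependencies, 10 mins with priorities only) can be reproduced with a small model. This is a sketch under stated assumptions, not how SRM schedules work internally: priority groups run one after another, VMs inside a group start in parallel unless a dependency forces an order, and the placement of VMs into groups is inferred from the slide totals rather than stated on the slides.

```python
# Per-VM recovery times taken from the slides (minutes).
RECOVERY_MINS = {
    "Database 1": 5, "Database 2": 5,
    "AppServer 1": 3, "AppServer 2": 3,
    "WebServer1": 2, "WebServer2": 2,
}


def group_time(vms, depends_on):
    """Time for one priority group: its longest dependency chain (assumes no cycles)."""
    def finish(vm):
        prereqs = [p for p in depends_on.get(vm, []) if p in vms]
        return RECOVERY_MINS[vm] + max((finish(p) for p in prereqs), default=0)
    return max(finish(vm) for vm in vms)


def plan_time(groups, depends_on=None):
    """Priority groups run sequentially, so their times add up."""
    depends_on = depends_on or {}
    return sum(group_time(set(group), depends_on) for group in groups)


# Assumed layout for "Priorities and Dependencies": one full stack per group,
# chained Database -> AppServer -> WebServer.
with_dependencies = plan_time(
    groups=[
        ["Database 1", "AppServer 1", "WebServer1"],
        ["Database 2", "AppServer 2", "WebServer2"],
    ],
    depends_on={
        "AppServer 1": ["Database 1"], "WebServer1": ["AppServer 1"],
        "AppServer 2": ["Database 2"], "WebServer2": ["AppServer 2"],
    },
)

# Assumed layout for "Priorities Only": one tier per group, no dependencies.
priorities_only = plan_time(
    groups=[
        ["Database 1", "Database 2"],
        ["AppServer 1", "AppServer 2"],
        ["WebServer1", "WebServer2"],
    ],
)

print(with_dependencies, priorities_only)  # 20 10
```

Under these assumptions the dependency chains serialize each stack (5 + 3 + 2 = 10 mins per group), while tiering by priority lets both VMs of a tier recover at the same time.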
Organization for Lower RTO
• Fewer/larger NFS Datastores/LUNs
• Fewer Protection Groups
• Don’t replicate VM swap files
• Fewer Recovery Plans
VM Configuration
• VMware Tools installed in all VMs
• Suspend VMs on Recovery vs. PowerOff VMs
• Array-Based Replication vs. vSphere Replication
Recovery Site
• vCenter sizing – it works harder than you think
• Number of hosts – more is better
• Enable DRS – why wouldn’t you?
• Different recovery plans target different clusters
Recommendations
Be Clear with the Business
• What is/are their:
– RPOs?
– RTOs?
– Cost of downtime?
– Application priorities?
– Units of failover?
– Externalities?
• Do you have:
– Executive buy-in?
[Figure: RPO/RTO timeline showing Last Viable Restore Point, Disaster Strikes, and All Functionality Recovered]
Risk with Infrequent DR Plan Testing
– Parallel and cutover tests provide the best verification, but are very resource-intensive and time-consuming.
– Cutover tests are disruptive, may take days to complete, and leave the business at risk.
[Figure: Recovery Risk over Time in an IT Environment without Virtualization & DR Automation; Unproven Recoverability grows in the TESTING GAP between DR Tests]
Frequent DR Testing Reduces Risk
– Increased confidence that the plan will work
– Recovery can be tested at any time without impact to production
Virtualization & DR Automation Greatly Reduce Recovery Risk
[Figure: Recovery Risk over Time with Virtualization + DR Automation and Frequent DR Testing]
Test Network
– Use VLAN or isolated network for test environment
• Default “Auto” setting does not allow VM communication between hosts
– Different PortGroup can be specified in SRM for test vs actual run
• Specified in Network Mapping and/or Recovery Plan
Test Network – Multiple Options
[Figure: vCenter A / SRM A (Primary) and vCenter B / SRM B (Secondary) connected by a Universal Logical Switch. Networks Prod_Web_V110, Prod_Web_V120, and Prod_Web_V130 appear at both sites with Implicit Mapping; the test networks are Test_Web_V210, Test_Web_V220, and Test_Web_V230.]
• Two Options
– Disconnect NSX Uplink (this can be easily scripted)
– Use NSX to create duplicate “Test” networks
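One way to picture the second option (duplicate “Test” networks) is as a lookup that picks a different port group for a test than for an actual run. The network names come from the slide, but the pairing of each Prod network with a Test network is an assumption, and the function is a hypothetical illustration rather than an SRM or NSX API.

```python
# Assumed pairing (not stated on the slide): V110 -> V210, V120 -> V220, V130 -> V230.
TEST_NETWORKS = {
    "Prod_Web_V110": "Test_Web_V210",
    "Prod_Web_V120": "Test_Web_V220",
    "Prod_Web_V130": "Test_Web_V230",
}


def target_network(source_network, is_test):
    """Pick the network a recovered VM attaches to at the recovery site."""
    if is_test:
        # Tests land on the isolated duplicate network.
        return TEST_NETWORKS[source_network]
    # Actual run: implicit mapping, the same network name exists at both sites.
    return source_network


print(target_network("Prod_Web_V120", is_test=True))
print(target_network("Prod_Web_V120", is_test=False))
```

Keeping the test mapping separate from the actual-run mapping is what lets a recovery plan be tested without touching production traffic.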
Demos
Recap
1 Protection Groups and Recovery Plans
2 Topologies
3 Impacts to RTO
4 Recommendations
5 Demos
Questions