TRANSCRIPT
Peter Laz – Managing Consultant – Forsythe
Dylon Mills – Sr. Product Specialist – Symantec
PROTECT YOUR BUSINESS
2
AGENDA
• Introductions
• Target Capability and 5 risks to success
• Closing the Gap – 6 steps to higher availability
• The Solution
• Q & A
3
AUDIENCE POLL
4
• Operational Resiliency (OR)
o Production H/A type solution
o Application level failover to Alternate Site or within primary site
o Can be used in response to an operational incident and/or application outage event
• Disaster Recovery (DR)
o Used in response to a full site-based DR event
o Assumes use of both OR & DR solution to facilitate a full site recovery
TARGET CAPABILITY
Tier | RTO / RPO* | Description
0 | Active/Active (0 RTO/RPO) | Core IT Infrastructure, Network, Security, and Strategic Critical Applications
1 | 15min-4hr RTO / 0-15min RPO | Strategic Critical Applications unable to support Active/Active
2 | 4-12hrs / 15min-1hr | Apps tiered by business criticality
3 | 24 hrs | Apps tiered by business criticality
4 | 72 hrs | Apps tiered by business criticality
5 | 14 days | Apps tiered by business criticality
* RPO = 24hrs unless stated otherwise
** Not intended to communicate Application Availability (SLA)
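The tier targets above can be turned into a simple pass/fail check for a recovery exercise. The following is a minimal sketch, not part of the presentation: it assumes the upper bound of each range is the target, applies the 24-hour default RPO where the table states none, and uses a hypothetical function name, `meets_tier`.

```python
from datetime import timedelta

# Upper-bound RTO per tier, read from the table (assumption: the top of
# each range is the target). Tier 0 is Active/Active, so its target is zero.
TIER_MAX_RTO = {
    0: timedelta(0),
    1: timedelta(hours=4),
    2: timedelta(hours=12),
    3: timedelta(hours=24),
    4: timedelta(hours=72),
    5: timedelta(days=14),
}

# Upper-bound RPO per tier; tiers without a stated RPO use the 24-hour default.
DEFAULT_RPO = timedelta(hours=24)
TIER_MAX_RPO = {
    0: timedelta(0),
    1: timedelta(minutes=15),
    2: timedelta(hours=1),
}


def meets_tier(tier, measured_rto, measured_rpo):
    """Return True if the measured recovery time and data loss fit the tier."""
    max_rto = TIER_MAX_RTO[tier]
    max_rpo = TIER_MAX_RPO.get(tier, DEFAULT_RPO)
    return measured_rto <= max_rto and measured_rpo <= max_rpo
```

For example, a Tier 2 application restored in 6 hours with 30 minutes of data loss passes, while a Tier 1 application restored in 5 hours does not.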
5
KEY CONCERNS
System Availability
• Unexpected outages
• Delays in restoring service from planned or unplanned outages
• Failure to meet RTO goals
Data Protection
• Irretrievable loss of critical data
• Direct customer and legal implications
System Performance
• Degraded performance
• Customer dissatisfaction
• Inefficient use/allocation of resources
6
TYPICAL SET UP – CURRENT SITUATION
Primary Data Center
DEPLOY
• Clustering / HA
• Virtualization
• Replication
• Multi-pathing
ADDRESSING
• Servers
• Networks
• Storage
• Data
Secondary Data Center
7
DR VALIDATION EXERCISES
DR TESTS ARE: Resource Intensive + Infrequent + Challenging
AND... APT TO MISS CRITICAL FLAWS!
8
PRIMARY SITE
RISK 1: REPLICATION INCONSISTENCIES
DATABASE
GROUP A
GROUP B
DR SITE
Storage volumes from different consistency groups used by the same database or file system
In case of a rolling disaster, RDFs will get out of sync, causing irreversible database corruption and slowing recovery
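A check for this risk can be sketched in a few lines. The example below is illustrative only: it assumes a storage inventory is already available as (database, volume, consistency group) tuples, and the function name `find_split_databases` and the sample inventory are hypothetical.

```python
from collections import defaultdict


def find_split_databases(volume_map):
    """Find databases whose volumes span more than one consistency group.

    volume_map: iterable of (database, volume, consistency_group) tuples.
    Returns {database: sorted list of groups} for each database at risk.
    """
    groups = defaultdict(set)
    for database, _volume, group in volume_map:
        groups[database].add(group)
    return {db: sorted(g) for db, g in groups.items() if len(g) > 1}


# Hypothetical inventory: orders_db has volumes in both groups.
inventory = [
    ("orders_db", "vol01", "GROUP A"),
    ("orders_db", "vol02", "GROUP B"),
    ("hr_db", "vol03", "GROUP A"),
]
print(find_split_databases(inventory))
```

Any database returned by the check has volumes that replicate independently of each other, which is the condition this risk describes.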
9
PRIMARY SITE
RISK 2: MISSING NETWORK RESOURCES
DR SITE
Remote (DR) host misconfigured to access
primary file system
Data loss in case of loss of primary;
Delays in recovering
10
PRIMARY SITE
RISK 3: TAMPERING RISK
DR SITE
Unauthorized host at DR site misconfigured with access to storage
Recovery delays; Potential data corruption
Authorized host
Unauthorized host
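Detecting this condition amounts to comparing the hosts that can actually reach a storage volume against the hosts approved to do so. The sketch below assumes both lists have already been collected; the function name `unauthorized_hosts` and the host names are hypothetical.

```python
def unauthorized_hosts(hosts_with_access, authorized):
    """Return hosts that can reach the storage but are not on the approved list."""
    return sorted(set(hosts_with_access) - set(authorized))


# Hypothetical data: dr-test-07 was never approved for this volume.
with_access = ["dr-db-01", "dr-db-02", "dr-test-07"]
approved = ["dr-db-01", "dr-db-02"]
print(unauthorized_hosts(with_access, approved))
```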
11
PRIMARY SITE DR SITE
RISK 4: POINT IN TIME COPIES NOT TESTED
FILESYSTEM
Corruption in point in time copies not detected
Data loss; Increased time to recover
12
PRIMARY SITE DR SITE
RISK 5: INSUFFICIENT DR RESOURCES
Tendency to allocate fewer resources to DR than primary
Inability to take up production load;
Extended recovery time
13
Ever-increasing risk
• Each configuration change in the infrastructure introduces risk
CONFIGURATION DRIFT – THE INSIDIOUS THREAT
[Chart: RISK against CHANGE OVER TIME, with CHANGE EVENT markers]
14
PRODUCTION SITE DR SITE
RISK 6: PROD / DR CONFIGURATION DRIFTS
Difficult to ensure full synchronization at all times
Undetected due to limited load for
Inability to take up production load;
Increased time to recover
PRODUCTION SITE
Hardware: 8 x CPU 2.2 GHz, 32 GB RAM, 2 x HBA, 2 x NIC
Software: OS: HP-UX 11.31, WebSphere, Java 1.5, EMC PowerPath 4.4
Kernel Parameters: Max up processes: 8192, Max # of semaphores: 600
DR SITE
Hardware: 2 x CPU 2.2 GHz, 8 GB RAM, 1 x HBA, 1 x NIC
Software: OS: HP-UX 11.23, No WebSphere, Java 1.4.2, EMC PowerPath 3.0.5
Kernel Parameters: Max up processes: 1024, Max # of semaphores: 128
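The comparison between the two hosts can be expressed as a dictionary diff. This is a minimal sketch using the values from the slide; the key names and the function `config_drift` are hypothetical, and it is not a description of how any product performs the comparison.

```python
def config_drift(prod, dr):
    """Return {setting: (prod_value, dr_value)} for every setting that differs."""
    keys = sorted(set(prod) | set(dr))
    return {k: (prod.get(k), dr.get(k)) for k in keys if prod.get(k) != dr.get(k)}


# Values from the slide; key names are illustrative.
prod_host = {
    "cpu_count": 8, "cpu_ghz": 2.2, "ram_gb": 32, "hba_count": 2, "nic_count": 2,
    "os": "HP-UX 11.31", "websphere": True, "java": "1.5", "powerpath": "4.4",
    "max_up_processes": 8192, "max_semaphores": 600,
}
dr_host = {
    "cpu_count": 2, "cpu_ghz": 2.2, "ram_gb": 8, "hba_count": 1, "nic_count": 1,
    "os": "HP-UX 11.23", "websphere": False, "java": "1.4.2", "powerpath": "3.0.5",
    "max_up_processes": 1024, "max_semaphores": 128,
}

for setting, (prod_value, dr_value) in config_drift(prod_host, dr_host).items():
    print(f"{setting}: production={prod_value} dr={dr_value}")
```

Only the CPU clock speed matches between the two hosts; every other setting is reported as drift.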
15
CLOSING THE GAP
SIX STEPS TO HIGHER SERVICE AVAILABILITY
16
SIX STEPS TO HIGHER AVAILABILITY
• Automated • Business service and IT system perspective • Identify those most at risk
• Disciplined, closed loop verification • Tracked in system
• Cross-domain virtual team • Regular meetings • Agreed, practiced responses
• Automated • Format of choice • Integrated with ticketing system
• Understand impacts • Prioritize accordingly • React based on priorities
• Automated • Frequent • Non-intrusive
DETECT
ANTICIPATE
ALERT
COLLABORATE
VALIDATE
MEASURE
17
• Automation
• Broad scope
• Current risk signatures
• Agentless
PROACTIVE MONITORING AND MANAGEMENT
KEYS TO AVAILABILITY, PROTECTION & PERFORMANCE...
18
DISASTER RECOVERY ADVISOR
SCAN
• Schedule scans to meet business needs
• Collects data agentlessly and unobtrusively
• Integrate with leading enterprise CMDBs
DOCUMENT
• Automatic, always current documentation
• Topology and textual view
• Clear visibility into RPO, RTO and other service availability metrics (actual vs. planned)
DETECT RISK & ALERT
• >5,000 risk signatures
• Risk tracking & measurement
• Actionable alerts
• Seamless integration with existing tools
19
DISASTER RECOVERY ADVISOR
20
TICKET VIEW
21
TOPOLOGY VIEW
22
HOST COMPARISON
23
Key Functionality
• Automated verification of High Availability (HA), Cloud and DR systems
• Makes sure you meet important vendor best practices
• Clear visibility to risks and their business impact
Key Benefits
• Reduce critical data loss and downtime risks
• Reduce reliance on DR testing
• Optimize production and HA performance
• Qualitative measurement of Service Availability readiness
• Effective cross-domain teamwork
SUMMARY
DRA HELPS ASSURE HIGHER IT SERVICE AVAILABILITY
24
CONCLUSION:
• Manage availability, protection and performance by keeping a finger on the pulse of your data center
25
Peter Laz
Dylon Mills
26
DRA ARCHITECTURE
Role Based Web Interface
Windows 2008 R2 Server x64
Oracle 11g
(local or remote)
Optional Collectors
Storage Arrays
• EMC: SYMCLI / NaviCLI
• HDS/HP XP: HiCommand / CommandView API
• IBM: DSCLI / XCLI
• NetApp: ZAPI / SSH / Telnet
Servers
• SSH / WinRM / WMI
• OS / Vendor read-only commands and queries
• Valid user credentials / keys
Databases
• JDBC using read-only user credentials
DB2
Virtual Machines
• VMware API using read-only user credentials
• AIX VIO: HMC CLI
• UNIX: OS commands