© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
Bob Loftis
HP NonStop Product Management
April 14, 2008
Business Continuity SIG Update
Future product plans, dates, and functionality are subject to change without notice
Business continuity software direction
• H1 2008 – Partner active/active support, ease of use
− Add requested improvements, quality fixes for AutoTMF, AutoSYNC
− Ship new ODBC/MX driver on OSS for partner active/active replication solutions
• H2 2008 – More manageable replication
− SQL DDL Replicator (SDR), September (for MP)
− RDF 1.9, October: manageability, availability, performance
− TMF 3.6, July: back-port Neoview features, led by Transaction Typing
− AutoSYNC enhancements based on customer requests, November
− New APIs to encourage partners’ synchronous active/active replication
• 2009, Future – More cost effective solutions
− Enhance SDR to support SQL MX
− Increased manageability, performance for TMF, RDF, SDR, AutoTMF, AutoSYNC
− Updates and new functionality for ODBC/MX on OSS
NonStop Remote Database Facility
• High throughput and low CPU utilization
• Exceptional multi-node transaction support
• Focused on data integrity (nodes in sync)
• Active/active split reciprocal
• Designed for easy installation and maintenance – but test!
• Oct 2008: 11 enhancements – availability, performance, manageability
RDF 1.9 Enhancement Details
• Availability
− faster TMF dumps to target for additional DR protection
− new guidelines for faster Switchover/Takeover configurations
• Performance
− improved performance of shared access DDL: suspending updaters
− faster access to updater-replicated data: FASTUPDATEMODE
• Manageability
− extra identifiers with longer process/volume names
− issue Takeover in an obey file
− see subvolume name in error messages
− access to a file-level replicatepurge option
− new option to purge any existing control file(s)
− see the SQL MX table names
− enter one command to affect many RDF/IMP(X) environments
NonStop TMF 3.6 July 2008
• Reliable transaction protection
• Enhancements
− support different transaction types
− ANSI names for SQL MX objects
− manually abort (long running) transactions, with caution
− option for faster audit trail release
  • pin from first write position, not transaction start time
− enable network transaction joins
Complementary Product Enhancements
AutoTMF - Update 8 April 2008
• AutoCommit feature enabled by default (added safety net)
• safe suppression of unneeded long running txn EMS warnings
• flexibility to do commands (COPY, for example) under user txn rather than command interpreter-generated txn
• option to get list of commands used to change global parameter default values
AutoSYNC - Update 10 May 2008
• allow object files to be synchronized based on binder or linker timestamp differences
• execute trigger commands after replication of entire file set
• produce list of commands used to change global parameter default values
NonStop SQL DDL Replicator (SDR)
• NonStop SDR captures, replicates and implements NS SQL/MP DDL operations not currently replicated by RDF
• Sample of DDL changes replicated:
− all ALTER, COMMENT, CREATE, DROP operations
− all ALTER TABLE and ALTER INDEX (including split and merge partitions)
• Works with RDF to ensure data and DDL replication are performed in sequence to guarantee logical consistency
− no significant effect on performance or operation of applications on primary
− no changes to application programs required
− low latency in transmitting DDL operations to the remote system
• Reasonably priced, very easy to use
HP supports active/active education
• RDF has a safe, limited form, as do partners
• Partners have a variety of asynchronous methods
• Understand practical requirements
− Is your environment ready? (power, comms, sizing etc.)
− How much must you customize your app?
− What successful model will you follow?
− How will you handle collisions?
− How frequently will you switchover to test?
− What range of product + service costs can you accept?
Business Continuity Virtual SIG
April 14, 2008
GoldenGate Highlights
• Company Strength and Service
• Rapid Growth in Strategic Partners
• Established, Loyal Customer Base: 450+ customers... 5000+ solutions implemented… in 35+ countries
• Established in 1995
• Worldwide offices: USA, EMEA, Asia Pacific, Latin America
• Exceptional customer support: 24x7 global coverage
GoldenGate Customers Worldwide Running Active/Active
GoldenGate Everyday Example…
[Diagram: a transaction (cash withdrawal / purchase) on an ATM or POS system feeds two uses. Real-Time Access: High Availability / Disaster Recovery (backup system). Real-Time Information: Business Intelligence / Data Integration (fraud detection, data warehouse).]
The 3 States of Availability: Systematic View
• #1: Active: Operational Application (Performance, Latency, Scalability)
• #2: Planned Outage: Maintenance, Migrations, Upgrades
• #3: Unplanned Outage: System Failure, Data Failure
Goals of an Active/Active Implementation
• Better use of existing hardware
− Put your backup system to use
• Continually test backup system
− It is working right now
• Reduce response time
− Handle peaks – each processing a portion of load
• Allows phased migrations
− Once you have the ability to process on two systems, you can perform phased migrations
This is NOT “Active Active”
[Diagram: two systems; the first holds A to H, the second holds A’ to H’.]
This is “Active Active”
[Diagram: two systems; both hold A to H.]
Synchronous Replication Compromises Performance!
• Transactions must block and wait until the same work is done twice and confirmations are exchanged
• Combined throughput of two synchronous systems will be less than that of each system on its own
• Your response times will always be higher than without synchronous replication
• Each additional system will slow down the combined
[Diagram: synchronous: System A (2000 TPS) + System B (2000 TPS) = <2000 TPS; asynchronous: System A (2000 TPS) + System B (2000 TPS) = <4000 TPS]
Synchronous Replication Reduces Overall Availability!!!
• In synchronous mode, if one system is unavailable, the other has to wait until both systems are available again
• If one system slows down, it slows down both systems
• If you configure the time-out/promiscuous option, you forfeit the “zero data loss”
• For planned downtime, you need to switch to asynchronous mode; switching between asynchronous and synchronous requires an outage on both systems to ensure consistency
[Diagram: Monday to Sunday timelines of outages on System A and System B and the resulting overall availability, one chart for asynchronous replication and one for synchronous replication.]
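The availability argument on this slide can be put in rough numbers. The following is a minimal sketch, not taken from the slides: it assumes the two systems fail independently, that a synchronous pair with no time-out option is usable only while both systems are up, and that an asynchronous active/active pair stays usable while either system is up. The 0.999 figure is purely illustrative.

```python
def sync_pair_availability(a: float, b: float) -> float:
    """Synchronous pair, no time-out option: both systems must be up."""
    return a * b


def async_pair_availability(a: float, b: float) -> float:
    """Asynchronous active/active pair: down only if both systems are down."""
    return 1.0 - (1.0 - a) * (1.0 - b)


# Illustrative per-system availability (an assumption, not a measured value).
a = b = 0.999
print(f"single system:     {a:.6f}")
print(f"synchronous pair:  {sync_pair_availability(a, b):.6f}")
print(f"asynchronous pair: {async_pair_availability(a, b):.6f}")
```

Under these assumptions the synchronous pair is less available than either system alone, while the asynchronous pair is more available than either.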
Conflict Resolution Approaches
• Exception handling / management
− Human intervention
• Automated approaches
− Simple automated approaches
  • Timestamp
  • Trusted source / site priority
  • Hybrid of timestamp and site priority
− Complex automated approaches
  • Quantitative resolution
  • Complex rules-based resolution
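As an illustration of the simple automated approaches, here is a minimal sketch of a hybrid rule: the change with the newest timestamp wins, and site priority breaks ties. It is not GoldenGate's implementation; the `Change` record, its field names and the `SITE_PRIORITY` table are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    key: str
    value: str
    timestamp: float  # commit time at the originating site (hypothetical field)
    site: str         # originating site name (hypothetical field)


# Hypothetical site ranking: the lower number wins a timestamp tie.
SITE_PRIORITY = {"A": 0, "B": 1}


def resolve(local: Change, incoming: Change) -> Change:
    """Hybrid rule: newest timestamp wins; site priority breaks ties."""
    if incoming.timestamp != local.timestamp:
        return incoming if incoming.timestamp > local.timestamp else local
    if SITE_PRIORITY[incoming.site] < SITE_PRIORITY[local.site]:
        return incoming
    return local


on_a = Change("acct-1", "balance=90", timestamp=100.0, site="A")
on_b = Change("acct-1", "balance=80", timestamp=100.0, site="B")
print(resolve(on_a, on_b).site)  # equal timestamps, so site A wins
```

A pure timestamp rule depends on synchronized clocks across sites; the site-priority tie-break keeps the outcome deterministic when timestamps are equal.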
Technology Overview
Transactional Data Management (TDM) Software Platform
How GoldenGate TDM Works: Modular “Building Blocks”
• Capture: Committed changes are captured (and can be filtered) as they occur by reading the transaction logs.
• Trail files: Universal data format enables heterogeneity.
• Route: No distance constraints via TCP/IP. Compression & encryption.
• Delivery: Applies transactional data with guaranteed integrity.
[Diagram: Source Database → Capture → Source Trail → LAN / WAN / Internet → Target Trail → Deliver → Target Database, with the same chain running in the reverse direction for bi-directional replication.]
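The capture, trail and delivery building blocks can be sketched generically. This is a toy model under stated assumptions, not GoldenGate code: the log record layout, the JSON-lines "trail" and the dictionary standing in for the target database are all hypothetical.

```python
import json


def capture(log_records, table_filter=None):
    """Yield changes of committed transactions only, in commit order."""
    pending = {}  # transaction id -> buffered changes
    for rec in log_records:
        if rec["op"] == "commit":
            for change in pending.pop(rec["tx"], []):
                if table_filter is None or change["table"] in table_filter:
                    yield change
        elif rec["op"] == "abort":
            pending.pop(rec["tx"], None)  # discard uncommitted work
        else:
            pending.setdefault(rec["tx"], []).append(rec)


def write_trail(changes):
    """Serialize changes to a platform-neutral trail (here: JSON lines)."""
    return [json.dumps(c, sort_keys=True) for c in changes]


def deliver(trail, target):
    """Apply trail records to the target (a dict standing in for a database)."""
    for line in trail:
        c = json.loads(line)
        if c["op"] == "delete":
            target.pop((c["table"], c["key"]), None)
        else:  # insert or update
            target[(c["table"], c["key"])] = c["value"]


log = [
    {"tx": 1, "op": "insert", "table": "acct", "key": "k1", "value": 100},
    {"tx": 2, "op": "insert", "table": "acct", "key": "k2", "value": 50},
    {"tx": 2, "op": "abort"},
    {"tx": 1, "op": "commit"},
]
target = {}
deliver(write_trail(capture(log)), target)
print(target)  # only the committed change from transaction 1 is applied
```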
GoldenGate TDM: Flexible Configurations
• Unidirectional: Reporting Instance
• Bi-directional: Instant Failover, “Active”
• Peer-to-Peer: Load Balancing, HA, DR
• Broadcast: Data Distribution
• Consolidation: Data Warehouse
• Cascading: Data Marts
GoldenGate TDM: Heterogeneity Supports Applications Running On…
Databases
Capture:
• Oracle
• DB2 UDB
• Microsoft SQL Server
• Sybase ASE
• Teradata
• Enscribe
• SQL/MP
• SQL/MX
Delivery:
• All listed above
• Ingres, MySQL
• and any ODBC compatible databases
O/S and Platforms
• Windows 2000, 2003, XP
• Linux
• Sun Solaris
• HP NonStop
• HP-UX
• HP TRU64
• IBM AIX
• IBM z/OS
• OpenVMS
GoldenGate for NonStop version 9.5
• SQL/MX platform support
• SQL/MX Log Based Capture
• SQL/MX ODBC Apply
• G06.x and H06.x operating systems
• NS14000 and NS16000
• NS1000 for Live Reporting
• GoldenGate is certified on Neoview
www.gravic.com
Shadowbase Business Continuity Update
ITUG BC SIG April, 2008
Paul J. Holenstein
Executive Vice President
Dick Davis
Shadowbase Sales Manager
Gravic, Inc.
www.gravic.com
Agenda
• Introduction to Gravic & Shadowbase Overview
• Business Continuity Overview
• Business Continuity Solutions
• Shadowbase Case Study
• One Product, Many Uses…
• For More Information…
Questions? Please ask as we go along…
• Business Continuity & Availability
− Disaster Recovery (Uni-Dir Active-Passive Architectures)
− Cooperative Processing (Bi-Dir Active/Active Architectures)
− Eliminate Application Down-time for Migrations & Upgrades (ZDM)
• Data Integration and Synchronization
− Homogeneous & Heterogeneous Environments
− Data Transformation, Scrubbing, Filtering, & Cleansing
− Real-time Business Intelligence
• Utility Uses
− Send Change Data vs. Loading Target via Replication
− Restore Corrupted Databases On-line
− Audit Compliance Reporting and Analysis
− Test Database Creation, QA Database Refresh, etc.
− Event Trigger Processing via API; Publish/Subscribe Functionality
Shadowbase Replication is an Enabling, Extensible Technology!
Shadowbase Product Overview
[Diagram: platforms and databases connected by Shadowbase: HP NonStop (SQL/MP, SQL/MX*, Enscribe); AS400 (DB2); VMS (Oracle, Sybase); Windows (Oracle, SQL Server, DB2, Sybase); Unix/Linux (Oracle, MySQL (Q2, 08), DB2); Unix, Windows (SQL Server, Oracle, Sybase).]
*Native SQL/MX Bi-Dir Q3 2008
Shadowbase Product Overview
• Shadowbase Supports Multiple Target Database Initial Loading and Target Re-Synchronization Options
• Shadowbase AutoLoader (Online Loader)
− Leverages Replication Engine Transformations and Transport
• (New) Shadowbase SOLV Utility (Online or Offline, & Snapshot)
− Higher Performance Loading, Supports Nonaudited Sources
− Verification & Validation (2008)
− Resynchronization (2009)
• Non-Shadowbase Loading Options Supported
− FUP DUP, Backup/Restore, Export/Import, etc. (Offline/Static Target DB Creation)
− “Fuzzy” Loading Technique (Offline Loading): replication changes queue and are replayed once the load completes
Business Continuity Overview
Replication Characteristics Affect RPO/RTO Levels that can be Attained
1. Replication Latency – affects RPO (amount of data loss at failure)
2. Async vs Sync Replication – improves RPO, but adds “Application Latency”
3. Target Applications Active – improves RTO and usability of target DB (e.g. queries)
4. Active/Passive vs Active/Active Architecture – dramatically improves RTO & RPO
[Chart: RPO (less, or no, data loss) against RTO (faster recovery), with higher availability toward better RPO and RTO. Labeled regions: High-Latency Replication and Low-Latency Replication (step 1); ASYNC Replication and SYNC Replication, including split mirrors (step 2); Inactive Target Applications and Active Target Applications (step 3); Active/Passive Architecture; Active/Active Architecture 4A (ASYNC): little data loss, fastest recovery, marked as the BC sweet spot; Active/Active Architecture 4B (SYNC): no data loss, fastest recovery.]
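Point 1 can be made concrete with simple arithmetic. As a rough sketch, under asynchronous replication the committed work not yet replicated when the source fails is approximately the transaction rate multiplied by the replication latency. The rate and latency values below are illustrative assumptions, not figures from the slides.

```python
def transactions_at_risk(tps: float, replication_latency_s: float) -> float:
    """Approximate committed transactions not yet replicated at source failure."""
    return tps * replication_latency_s


# Illustrative workload of 500 transactions per second (an assumption).
for latency in (0.1, 1.0, 30.0):
    at_risk = transactions_at_risk(500, latency)
    print(f"latency {latency:>5.1f} s -> about {at_risk:.0f} transactions at risk")
```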
Shadowbase Replication Technology
• Bi (or Multi) Directional Replication
− Data updates need to be sent and applied to the other database copies to keep them synchronized
• Issues
− Require application database changes (SB does not)
− # nodes limited or can it scale? (SB scales…)
− Avoid data ping-pong/oscillation (SB patents)
[Diagram: an application (APPL) and Shadowbase on each node, with numbered steps 1 to 3 in each direction.]
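Data ping-pong arises when a change applied by the replication engine is itself captured and sent back to the node it came from. The sketch below shows one generic way to avoid it, by marking replicated writes in the log and skipping them on capture. It is an illustration only and makes no claim about Shadowbase's patented method; the `Node` class and the `replicated` flag are hypothetical.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.db = {}
        self.log = []  # stands in for the audit trail: every write lands here

    def write(self, key, value, replicated=False):
        """Apply a write and log it, noting whether the replicator applied it."""
        self.db[key] = value
        self.log.append({"key": key, "value": value, "replicated": replicated})


def replicate(src, dst, start=0):
    """Ship src's log entries from position `start`; return how many were sent."""
    sent = 0
    for entry in src.log[start:]:
        if entry["replicated"]:
            continue  # arrived from the peer: sending it back would ping-pong
        dst.write(entry["key"], entry["value"], replicated=True)
        sent += 1
    return sent


left, right = Node("LEFT"), Node("RIGHT")
left.write("k", 1)
print(replicate(left, right))  # the application write is sent once
print(replicate(right, left))  # the replicated write is not sent back
```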
Shadowbase Replication Technology
• Asynchronous Replication (Current Shadowbase)
− Replication decoupled from application processing (runs independently of application)
• Issues
− Replication Latency determines data loss at failure (RPO)
− Data collisions can occur (during Replication Latency window)
  • Need fast replication engine to minimize (SB is process-process)
  • Or Special Architectures/Algorithms to Avoid (Application Issue)
[Diagram: an application (APPL) and Shadowbase on each of two nodes, with numbered steps 1 and 2 in each direction.]
Shadowbase Replication Technology
• Synchronous Replication (Shadowbase Future)
− No (committed) data loss at failure of a node
− Data collisions are avoided (become lock waits)
• Issues
− Replication is part of the source application’s transaction
  • Adds Application Latency
− What to do if network or target system is down?
  • So called Split Brain syndrome
[Diagram: an application (APPL) and Shadowbase on each of two nodes, with numbered steps 1 to 4 in each direction.]
Shadowbase Replication Technology
• The “How to” of Efficient Synchronous Replication is the Really Tough Part…
− Implemented approach will dramatically affect Application Latency
− (Automatic) Recovery is more complex
− Older implementations generally implemented Dual Writes or Distributed Lock Management (generally slow & high overhead)
− Newer/more efficient algorithms improve on earlier weaknesses…
Shadowbase Replication Technology
Asynchronous Replication (Shadowbase Current)
[Diagram: numbered flow, steps 1 to 5, between System \LEFT (Application, SOURCE DATABASE, TMF Audit Trails, Shadowbase) and System \RIGHT (Shadowbase, TARGET DATABASE). Labeled steps: 1 I/O, 2 TMF, 3 TMF (Data); steps 4 and 5 are unlabeled.]
Shadowbase Replication Technology
Synchronous Replication (Shadowbase Future)
[Diagram: numbered flow between System \LEFT (Application, SOURCE DATABASE, TMF, TMF Audit Trails, Shadowbase) and System \RIGHT (Shadowbase, TARGET DATABASE). Labeled steps:]
1 Begin TX
2 TX ID
3 SB Join TX
4 I/O
5 TMF
6 TMF (Data)
7, 8 (unlabeled)
9 Commit Request
10 Phase 1 Vote
11 Verify Ready to Commit
12 Allow commit? Yes/No Reply
13 Allow commit? Yes/No Reply
14 Commit Complete Response
NOTE: Future HP and Gravic plans are subject to change without notice.
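The vote-before-commit idea behind steps 9 to 14 can be sketched as a generic two-phase style commit: the transaction commits only if every participant answers yes to "allow commit?", and aborts otherwise. This is a simplified illustration, not the TMF or Shadowbase protocol; the `Participant` class and `coordinated_commit` function are hypothetical.

```python
class Participant:
    def __init__(self, name, ready=True):
        self.name = name
        self.ready = ready      # whether this participant can safely commit
        self.state = "active"

    def prepare(self):
        """Phase 1 vote: the yes/no reply to 'allow commit?'."""
        return self.ready

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def coordinated_commit(participants):
    """Commit only if every participant votes yes; otherwise abort all."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


source = Participant("source database")
replicator = Participant("replication engine", ready=False)  # target not ready
print(coordinated_commit([source, replicator]))
```

Because the replication engine votes as part of the source transaction, a slow or unreachable target delays or aborts the commit, which is the application latency and availability cost noted on the earlier slides.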
Shadowbase Case Study: Implementations at AOL
AOL, LLC
• Database Change Monitoring, Event Notification Subsystem
• Zero-Downtime Migration (onto NonStop)
• Bi-Directional Active/Active Business Continuity Login Complex
Shadowbase Case Study: Implementations at AOL
[Diagram in three parts. 1) Database Change Notification: AOL applications, customer database, audit trail, Shadowbase, notifications. 2) AOL Migration (Sybase → NonStop): login requests, 16-system Linux/Sybase “partitioned” database, Sybase DR complex, Shadowbase, active logins and DR logins. 3) AOL Active/Active Login Request Complex: login requests into a 4-node active/active NonStop complex with Shadowbase between the nodes.]
For More Information
Breaking the Availability Barrier Series:
• Volume 1: Survivable Systems for Enterprise Computing
• Volume 2: Achieving Century Uptimes with Active/Active Systems
• Volume 3: Active/Active Systems in Practice
“The length of this document defends it well against the risk of its being read”
- Winston Churchill
Call, or see us at HPTF to discuss your requirements
Gravic, Inc.
301 Lindenwood Boulevard
Suite 100
Malvern, PA 19355 USA
Tel: +1.610.647.6250
Fax: +1.610.647.7058
www.gravic.com
Utility slides follow…
Business Continuity Overview
• What is Active/Active?
− 2 or More Independent Processing Nodes;
− Each Running a Common Application; with
− 2 or More Copies of the Application’s Database, that are
− Kept in Synchronism via a Bi-Directional Data Replication Engine
…Allow Any Node to Update Any Data Item? Your Choice…
[Diagram: two groups of users, each served by a node running Shadowbase.]