ibm infosphere optim workload replay success stories … ne db2 user group... · ibm infosphere...
© 2014 IBM Corporation
March 27, 2014
IBM InfoSphere Optim Workload Replay
Success Stories and Best Practices
Howie Hirsch
Senior IT Specialist
Managing Production Changes More Efficiently
Requirements
Benefits
• Minimize unexpected production problems
• Shorten testing cycles
• Develop more realistic subsystem testing scenarios
• Identify subsystem problems sooner with validation reports and performance tuning
• Use actual production workloads for testing rather than fabricated scenarios
• Extend quality testing efforts to include the data layer
DB2 for z/OS target subsystem or data sharing group
DB2 for z/OS source subsystem or data sharing group
Capture production workloads and replay them in testing environments
Application
InfoSphere Optim Workload Replay for DB2 for z/OS
Record
Play
Also available for DB2 LUW and other distributed servers!
InfoSphere Optim Workload Replay
Solution Overview
[Diagram: DB2 10 for z/OS (Production), Record; InfoSphere Optim Workload Replay for DB2 for z/OS; Play, DB2 11 for z/OS (Test)]
[Diagram: workflow steps Capture, Prepare, Replay, Compare and Analyze]
• Capture all components needed for real-life workload simulation
• Prepare workload replay
• Replay workload
• Compare replay with original capture or subsequent replays
• Validate correct SQL execution behavior
• Identify performance regressions and/or improvements
• Establish baseline; introduce changes and analyze impact
Architecture
• S-TAP for DB2 for z/OS
  – Installed with each data sharing member
  – Intercepts subsystem traffic
• Workload Replay controller
  – Controls starting and stopping of OWR S-TAP, on an as-needed basis
  – Drives “Local Replay”
• Workload Replay server
  – Software component
  – Built on Guardium v9
  – Processes, replays and analyzes workloads
• Web based user interfaces
Web Console Overview
• Use to manage the capture/replay workflow and security
• Connection profiles provide tooling access to capture and replay systems
• Access to workflow tasks can be restricted
Step 1: Capturing a Workload
• S-TAP for DB2 for z/OS
  – Activated/deactivated by Workload Replay controller
  – Captures local and remote inbound subsystem traffic
  – Optionally applies filtering
  – Sends collected information to the Workload Replay server, where it is stored
• Output: captured workload, metadata required to replay the workload, and metrics
• Not collected: statistics, catalog information, data
Filtering Workloads During Capture
• Use optional filter conditions to restrict captured SQL
• Filter types:
  – Authorization ID, plan, package, schema, special registers, …
  – Connection type (e.g. DRDA, CICS Attach, …)
• SQL is only captured if it meets all filter conditions
• Filtering is performed by S-TAP, thus reducing network traffic
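The "meets all filter conditions" rule can be sketched in a few lines of Python. This is only an illustration of the AND semantics, not the S-TAP implementation: the `CapturedStatement` record, its field names, the `matches_all` helper, and the dictionary-of-allowed-values filter representation are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedStatement:
    """Hypothetical record for one intercepted SQL statement."""
    auth_id: str
    plan: str
    connection_type: str
    sql_text: str


def matches_all(stmt, filters):
    """Return True only if every filter condition accepts the statement.

    `filters` maps an attribute name to the set of allowed values
    (an assumed representation, not the product's filter syntax).
    An empty filter set accepts everything.
    """
    return all(getattr(stmt, attr) in allowed for attr, allowed in filters.items())


statements = [
    CapturedStatement("PAYROLL", "PLANA", "DRDA", "SELECT 1 FROM SYSIBM.SYSDUMMY1"),
    CapturedStatement("PAYROLL", "PLANB", "CICS Attach", "UPDATE T1 SET C1 = 0"),
    CapturedStatement("BATCH01", "PLANA", "DRDA", "DELETE FROM T2"),
]

# Both conditions must hold: authorization ID PAYROLL AND connection type DRDA.
filters = {"auth_id": {"PAYROLL"}, "connection_type": {"DRDA"}}
captured = [s for s in statements if matches_all(s, filters)]
```

Only the first statement satisfies both conditions; the second fails on connection type and the third on authorization ID, so neither would be sent over the network.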
Capturing Workloads in Data-Sharing Environments
• Workloads can be captured and replayed in SYSPLEX environments
• Requires S-TAP to be installed on each data-sharing member
• Capture occurs only on selected members that are active
• Number of captured members does not have to be identical to the number of members used during replay
Optional: Cloning a z/OS Subsystem
• Prior to starting capture, Workload Replay can optionally invoke a stored procedure that clones a DB2 for z/OS subsystem
• A sample data cloning method that uses the DB2 Cloning Tool stored procedure is provided
• Requires DB2 Cloning Tool to be pre-installed and configured on source systems
• Cloning Tool entitlement is not included in the Workload Replay license
Reviewing the Captured Workload
• Capture report provides aggregated workload information:
  – SQL count, unique SQL statements, transaction count, metrics, …
• Captured SQL can be exported as a delimited text file to allow for custom analysis
Step 2: Preparing the Replay
• Workload replay requires
  – Workload transformation (into an optimized replay ‘format’)
  – Replay environment setup
• Workload transformation maps
  – Source to a target system
  – User credentials
  – Schema qualifiers
  – Collection IDs
• Set up the target system so that it approximates the source system as it was at the time of capture
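The mapping step can be pictured as a set of lookup tables applied to each captured statement. The sketch below is a simplified illustration, assuming a hypothetical `CapturedCall` record and `TransformMaps` container; it does not reflect the product's internal replay format, and the user, schema, and collection names are made up.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CapturedCall:
    """Hypothetical captured statement with the attributes that get mapped."""
    user: str
    schema: str
    collection_id: str
    sql_text: str


@dataclass(frozen=True)
class TransformMaps:
    """Hypothetical source-to-target lookup tables."""
    users: dict
    schemas: dict
    collection_ids: dict


def transform(call, maps):
    """Apply each map; values without a mapping are kept unchanged."""
    return replace(
        call,
        user=maps.users.get(call.user, call.user),
        schema=maps.schemas.get(call.schema, call.schema),
        collection_id=maps.collection_ids.get(call.collection_id, call.collection_id),
    )


maps = TransformMaps(
    users={"PRODUSR": "TESTUSR"},
    schemas={"PROD": "TEST"},
    collection_ids={"COLLPROD": "COLLTEST"},
)
call = CapturedCall("PRODUSR", "PROD", "COLLPROD", "SELECT C1 FROM T1")
ready = transform(call, maps)
```

Falling back to the original value when no mapping exists keeps the transformation safe for workloads where only some qualifiers differ between source and target.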
Reviewing the Replay-Ready Workload
• Transform report provides aggregated information about the replay-ready workload
  – SQL count, unique SQL statements, transaction count, metrics, …
• Transformed SQL can be exported as a delimited text file
Step 3: Replaying the Workload
• Workload is replayed by Workload Replay server, preserving original concurrency, timing and characteristics
• Replay speed can be adjusted to simulate different throughput
• S-TAP is automatically started by Workload Replay controller
• S-TAP captures incoming traffic and collects metrics to allow for comparison with original capture or other replays
• Filtering can be applied
• Goal of first replay: establish baseline; are characteristics the same as for the original (production) capture?
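One way to picture an adjustable replay speed is as a scaling of the captured start times while the statement order is preserved. The following is a minimal sketch under that assumption; the `scaled_offsets` function and the idea of representing the workload as offsets in seconds are hypothetical and are not taken from the product.

```python
def scaled_offsets(capture_offsets, speed):
    """Scale statement start times for replay.

    capture_offsets: seconds since the start of capture, in captured order.
    speed: 1.0 replays at the original pace, 2.0 twice as fast, 0.5 half as fast.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [offset / speed for offset in capture_offsets]


offsets = [0.0, 1.0, 4.0, 10.0]
double_speed = scaled_offsets(offsets, 2.0)
half_speed = scaled_offsets(offsets, 0.5)
```

Dividing every offset by the same factor keeps the relative spacing between statements intact, so a faster replay simulates higher throughput without reordering the workload.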
Optional: Resetting the Replay System
• Prior to replay, Workload Replay can optionally invoke a stored procedure that resets the target systems
• A sample data reset method for DB2 Cloning Tool is provided
• Requires DB2 Cloning Tool to be pre-installed and configured on target systems
• Cloning Tool entitlement is not included in the Workload Replay license
• Replay starts after the stored procedure has completed successfully
Step 4: Analyzing Impact
• Reports provide insights into how changes to the data server environment (or workload) impact workload replay accuracy and performance
• Accuracy report highlights SQL result differences (return code, row count, ...) between two workload executions
• Performance reports identify performance differences between two workload executions
• Analyze relevant differences and assess whether follow-up actions are necessary
• Tune workload or exclude problematic transactions if needed
• Repeat workflow as necessary
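The accuracy comparison between two workload executions can be sketched as a per-statement diff of return code and row count. This is an illustration only: the `accuracy_differences` helper, the statement IDs, and the `(return code, row count)` tuple layout are assumptions for this sketch, not the report's actual data model.

```python
def accuracy_differences(baseline, replay):
    """Compare two executions of the same workload.

    Both arguments map a statement ID to (return code, row count).
    Returns {statement ID: (baseline result, replay result)} for every
    statement whose result differs or is missing from the replay.
    """
    diffs = {}
    for stmt_id, expected in baseline.items():
        actual = replay.get(stmt_id)
        if actual != expected:
            diffs[stmt_id] = (expected, actual)
    return diffs


# Illustrative values: 0 = success, 100 = no row found, -204 = undefined object.
baseline = {1: (0, 10), 2: (0, 3), 3: (100, 0)}
replay = {1: (0, 10), 2: (-204, 0), 3: (100, 0)}
diffs = accuracy_differences(baseline, replay)
```

Here only statement 2 is reported: it succeeded with three rows in the baseline but failed in the replay, which is the kind of difference that would prompt a follow-up check of the target environment setup.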
Step 4: Comparing and Analyzing Results
Stage 1: Produce baseline in test environment
– Compare production capture with first replay in test environment (capture vs. replay comparison)
– Goal: validate that replay is accurate and representative of original capture before any changes are introduced in the test environment
[Figure labels: Accuracy, Performance]
Step 4: Comparing and Analyzing Results
Stage 2: Analyze impact of changes in test environment
– Compare baseline replay with another replay in test environment (replay vs. replay comparison)
– Goal: analyze the impact that newly introduced changes have on workload execution
[Figure labels: Accuracy, Performance]
Step 4: Analysis – Accuracy Report
• Summary report provides easy access to summarized information
Step 4: Analysis – Accuracy Issues
• Drill-through reports provide more in-depth information
Step 4: Performance Analysis
• Charts visualize key workload characteristics
Step 4: Performance Analysis
• Performance overview highlights SQL improvements or regressions
Step 4: Performance Analysis
• Drill-through reports display aggregated information for each unique SQL that improved or regressed
  – Execution count, row count, total and average response time
• Workload can be exported in InfoSphere Optim Query Workload Tuner compatible format for tuning
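The per-statement aggregation described above (execution count, total and average response time for each unique SQL) can be sketched as follows. The `aggregate` and `classify` functions and the `(sql_text, response_seconds)` input format are hypothetical; classifying by average response time is an assumption of this sketch, not a statement of how the product decides what counts as a regression.

```python
from collections import defaultdict


def aggregate(executions):
    """Group executions by SQL text.

    executions: iterable of (sql_text, response_seconds).
    Returns {sql_text: (execution count, total seconds, average seconds)}.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for sql, seconds in executions:
        totals[sql][0] += 1
        totals[sql][1] += seconds
    return {sql: (n, total, total / n) for sql, (n, total) in totals.items()}


def classify(baseline, replay):
    """Label each SQL seen in both runs by its change in average response time."""
    base, new = aggregate(baseline), aggregate(replay)
    result = {}
    for sql in base.keys() & new.keys():
        before, after = base[sql][2], new[sql][2]
        if after > before:
            result[sql] = "regressed"
        elif after < before:
            result[sql] = "improved"
        else:
            result[sql] = "unchanged"
    return result


baseline_run = [("Q1", 0.5), ("Q1", 1.5), ("Q2", 2.0)]
second_run = [("Q1", 2.0), ("Q1", 2.0), ("Q2", 1.0)]
summary = aggregate(baseline_run)
labels = classify(baseline_run, second_run)
```

In this made-up data, Q1 averages 1.0 seconds in the baseline and 2.0 seconds in the second run, so it is labeled as regressed, while Q2 improves from 2.0 to 1.0 seconds.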
Step 4: Analysis – SQL Performance Summary
• Additional summarized details are available for each unique SQL
• In-depth information is available for TopN executions (regressions or improvements)
Step 4: Analysis – SQL Execution Details
• SQL execution information provides additional details, such as host variable values, special registers, etc.
In Summary
• Capture/replay in DB2 for z/OS environments
  – Use when source and target are DB2 for z/OS
• Capture/replay in DB2-to-DB2 environments
  – Use when source and target subsystem are DB2 LUW (or zLinux)
• Capture/replay in heterogeneous environments
  – Assist with Oracle to DB2 LUW server migration
  – Manage lifecycle changes for Informix, Netezza, Oracle, Teradata, SQL Server, Sybase and MySQL
• Improve customer satisfaction
• Anticipate and correct potential problems sooner
• Reduce cost of change
• Establish consistent subsystem testing processes
• Meet SLAs
• Ensure well tuned, high performing workloads before deployment
Resources
• Enterprise change testing in DB2 for z/OS: A confidence endeavor
  http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?subtype=BK&infotype=PM&appname=SWGE_IM_IP_USEN&htmlfid=IMM14106USEN&attachment=IMM14106USEN.PDF
• Demo video on developerWorks
  http://www.ibm.com/developerworks/offers/lp/demos/summary/im-optimqueryfordb2.html
• Free eLearning course on developerWorks
  http://www.ibm.com/developerworks/offers/lp/demos/summary/im-optimquery-elearning.html
• Information roadmap
  http://www.ibm.com/developerworks/data/roadmaps/roadmap_caprep_21.html
Best practices
• Use DB2 Cloning Tool to take the subsystem copy as close as possible in time to the workload capture
• Once you have captured the workload and prepared the test environment, replay the workload to create a baseline
  – This is what you will compare future runs against
Customer Success Stories
Use case and client mapping
Use cases:
• Hardware Performance Validation (Add zIIP, I/O subsystems)
• Production Workloads Complement Application testing
• Migration and PTF testing
Clients (X marks as extracted from the mapping table):
• Large Brokerage Firm: X X
• Electrical component Co: X
• Automobile Manufacturer: X X
• IT Operations and Services Provider: X X
• Large Bank: X
• Large Insurance Co: X X
Reduce Risk
• Realistic workloads, not fabricated workloads
• Identify problem areas before production deployment
• Increased visibility
Deploy on time
• Create tests in days rather than months
• Conserve mainframe resources
• Reduced test cycle time
Within budget/Increased Productivity
• No laborious script creation / application setup
• Automated repetitive tasks
• Extended support / Cost of OS
Large Insurance Company
Software Industry
Key Benefit: Risk mitigation of DB2 for z/OS migration and application changes
Challenges
Business | Meeting SLAs is critical for the business, which requires application responsiveness and availability. The company's Internet presence is growing, and it needs to ensure ongoing performance.
Technical | The current testing methodology does not create realistic test environments that resemble production workloads. Unexpected performance issues and error conditions were not found until production rollout.
Use Cases
– DB2 for z/OS migration testing
– DB2 for z/OS data layer load testing to complement application stakeholder tests
– DB2 LUW AIX migration to RHEL
Business Drivers
• Business agility and resiliency / Risk mitigation
  – Deliver realistic testing by using actual production workloads for test
  – Identify and correct potential performance problems from enterprise changes faster
  – Minimize production outages by gaining deeper insight into problems during the testing phase, with minimum impact to the customer, as the company's direction is to do more online
  – Shorten test cycles
• Accelerated time to deploy software changes
  – Efficiently manage lifecycle events such as changes in databases, applications, and hardware, minimizing production impact
  – Create realistic tests of realistic local and distributed workloads in days where previously impossible
  – Single repeatable process for testing at the data layer to complement application regression, functional, and performance tests
  – Reduce laborious script creation / application setup / test setup
Why IBM
• IBM differentiates from competitors as the only vendor with this capability.
Capabilities
• Capture Distributed Production Workloads
• Impact Reports to identify Data Layer problems
• Single repeatable process for testing across heterogeneous systems
Solution
• Optim Workload Replay for DB2 on z/OS and Distributed
Environment
Platform
• System z, AIX, RHEL
Database
• DB2 for z/OS, DB2 LUW, other DBs
Large Bank
Software Industry
Key Benefit: Risk mitigation and reduced cost of infrastructure changes
Challenges
Business | Line of Business is resistant to any infrastructure changes due to the risk of sluggishness or downtime.
Technical | The global DBA team, responsible for architectural strategy and infrastructure upgrade/patch management policy, wants to upgrade to new capabilities, but upgrading instances and databases is difficult (DB2 for z/OS and hundreds of distributed DB2, Oracle, SQL Server, and Sybase databases). It is difficult to get application owners' time, and it takes years to upgrade across the enterprise. There is limited functional testing and no performance testing.
Use Cases
– DB2 for z/OS migration and maintenance testing
– DB2, Oracle, SQL Server, Sybase version testing <Later>
Business Drivers
• Accelerate deployment
  – Reduce the time it takes to upgrade across the enterprise
  – Create realistic workloads in days rather than months, without reliance on application test teams
  – Leverage Guardium 8.2 infrastructure
• Reduce risk of DB upgrades, patch testing, migration testing
  – Deliver realistic testing by using actual production workloads for test
  – Identify and correct potential performance problems from enterprise changes before production
  – Build business user confidence with impact reports
• Reduce costs
  – Single repeatable process for testing at the data layer to complement regression, functional, and performance tests
  – Reduce the extended support cost by providing a faster path to upgrade
Why IBM
• IBM differentiates from ORAT and iReplay as OWR leverages the Guardium platform, proven in production today. OWR also differentiates with a broad set of platform support.
Capabilities
• Deliver realistic testing by using actual production workloads rather than fabricated tests
• Establish repeatable test rigor across the enterprise, accelerating deployment
• Leverages existing Guardium infrastructure, proven in production today
Solution
• Optim Workload Replay for DB2 on z/OS
Environment
Platform
• System z, AIX, Solaris, Windows, Guardium 8.2
Database
• DB2, Oracle, Sybase, SQL Server
IT Operations & Engineering Service Provider
Software Industry
Key Benefit: Risk mitigation and accelerated time to market
Challenges
Business | This company provides services to large banks in a European country, servicing 1/3 of the population. Meeting SLAs is critical for the business, which requires application responsiveness and availability. The cost of an outage would be extremely serious and can exceed a million euros per day.
Technical | To reduce operating costs, the IT service provider manages 50 data sharing groups and 94 subsystems by maintaining every instance at the same version level of DB2 for z/OS. The current methodology for testing migrations and PTFs requires a full year's effort for every major and maintenance version. In addition, there is very limited performance/stress testing because applications change as DB2 is upgraded. Performance tests need to use the same functions as the prior release to be comparable.
Use Cases
– Deeper testing of DB2 for z/OS maintenance test cycles. Client has four maintenance test cycles/rollouts per year, including major upgrades
– Assessment of the impact of adding new zIIP processors and I/O subsystems
Business Drivers
• Risk mitigation
  – Helps dramatically reduce risk and deliver deeper level testing that includes realistic performance/stress testing
  – Demonstrate that the SLA can be met before changes occur
• Accelerated time to deploy software and hardware changes
  – Dramatically reduce the overall time to migrate to a new release by 75%, from 1 month x 94 subsystems to 1 week x 94 subsystems
  – Dramatically reduce the overall time to test maintenance packs by 4:1, which occurs four times a year
  – Single repeatable process for testing at the data layer
  – Shorten test cycles
Why IBM
• IBM differentiates from competitors as the only vendor with this capability.
Capabilities
• Capture Local Production Workloads
• Impact Reports to identify problems
• Single repeatable process for testing across heterogeneous systems
Solution
• Optim Workload Replay for DB2 on z/OS
Environment
Platform
• System z, 2 EC12, 2800 MSUs each
Instances
• DB2 for z/OS, 94 subsystems, 12M SQLs for 2 hours
Client will be presenting at IDUG Barcelona, Spain 10/13 on OWR
Automobile Manufacturer
Software Industry
Key Driver: Reduce labor effort by 75%
Challenges
Business | Large services provider provides IT services to Automobile Manufacturer and other banks. The applications used by these clients reside on System z using DB2 for z/OS. Meeting SLAs is critical for the business, which requires application responsiveness and availability.
Technical | For Automobile Manufacturer, it is estimated that it took a total of 200 labor days of effort to migrate from DB2 v9 to DB2 v10, excluding application testing. With application testing, it took 400 labor days. The elapsed time for the migration of all production systems was about 7 months, plus an additional 2 months for development systems. Additionally, it takes about 2-4 weeks to test DB2 fix packs for each instance.
Use Cases
– Automated robust test creation for DB2 for z/OS maintenance test cycles, including moving to DB2 v11 and ongoing fix packs
Business Drivers
• Reduce FTE effort/Increased Productivity
  – For DB2 v10 -> DB2 v11 testing, client estimates that the current 200 man days effort will be reduced to 50 man days effort
  – For ongoing fix pack testing, client estimates that the current 2-4 weeks effort would be reduced to 1 week effort for each instance
  – Single repeatable process for testing at the data layer
  – Shorten test cycles
Why IBM
• IBM differentiates from competitors as the only vendor with this capability.
Capabilities
• Capture Local Production Workloads
• Impact Reports to identify problems
• Single repeatable process for testing across heterogeneous systems
Solution
• Optim Workload Replay for DB2 on z/OS
Environment
Platform
• System z, 2 EC12, 2800 MSUs each
Instances
• DB2 for z/OS, 94 subsystems, 12M SQLs for 2 hours
Automobile Manufacturer has run through end to end testing using production workload
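The 75% key driver on this slide follows directly from the client's own estimates (200 man days reduced to 50). A small worked calculation, using a hypothetical `percent_reduction` helper, also shows the range implied by the fix pack estimate of 2-4 weeks reduced to 1 week:

```python
def percent_reduction(before, after):
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Migration testing: 200 man days estimated to drop to 50 man days.
migration = percent_reduction(200, 50)

# Fix pack testing per instance: 2-4 weeks estimated to drop to 1 week.
fixpack_low = percent_reduction(2, 1)
fixpack_high = percent_reduction(4, 1)
```

The migration estimate works out to a 75% reduction, and the fix pack estimate to a reduction of between 50% and 75% depending on whether the current effort is 2 or 4 weeks.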
Use case #2 – Data layer testing for Packaged applications (e.g. SAP)
Electrical Component Company
Software Industry
Key Benefit: Risk mitigation and accelerated time to market
Challenges
Business | The electrical component company relies on SAP as its business critical application, interconnecting with other business critical applications. Meeting SLAs is critical for the business, which requires application responsiveness and availability. The cost of an outage is over $375K/hour.
Technical | Instantiating a SAP environment for upgrade testing or troubleshooting is very difficult and labor intensive due to tight mainframe availability windows. It would require 40+ hours over a weekend, which is not possible. Optim Query Workload Replay capture of SAP would replace the need to instantiate a SAP system. OWR together with Omegamon for DB2 for z/OS would be used to measure performance before a migration from DB2 v10 to DB2 v11.
Use Cases
– DB2 for z/OS v11 testing and DB2 for z/OS maintenance test cycles using SAP workloads
– Deeper level testing when changes are made to a database object or SAP upgrade
– Troubleshooting
Business Drivers
• Risk mitigation
  – Dramatically reduce risk of over $500K/hour; deliver deeper level testing that includes realistic performance/stress testing
  – Reduce troubleshooting time by over 50%
• Accelerated time to deploy DB2 for z/OS and SAP changes
  – Dramatically reduce the overall time and cost to test SAP, without having to clone the SAP environment. Cloning the SAP environment is very difficult, labor intensive and cost prohibitive.
  – Dramatically reduce the overall time to test DB2 for z/OS maintenance packs
  – Single repeatable process for testing at the data layer
  – Shorten test cycles
Why IBM
• IBM differentiates from competitors as the only vendor with this capability.
Capabilities
• Capture distributed SAP Production Workloads
• Impact Reports to identify problems
• Single repeatable process for testing across heterogeneous systems
Solution
• Optim Workload Replay for DB2 on z/OS
• DB2 Cloning Tool, Optim Query Workload Tuner
Environment
Platform
• System z, two member data sharing environment using SAP
• 300M-1B SQL statements/day
Instances
• DB2 for z/OS
Upon successful Beta will be a Reference client
Use case #3 – Performance Validation
Large Brokerage Firm
Software Industry
Key Benefit: Risk Avoidance
Challenges
Business | The first minutes after the market opens and the last minutes before it closes are the two most critical periods of the day. Unforeseen outages during the peak load times caused millions of dollars in revenue loss, SEC violations and negative press affecting the brand. “Zero Outage” initiative with hot standby.
Technical | Inability to create tests that mirror distributed workloads on hot standby. Unexpected performance issues and errors from application changes were not found until production rollout.
Use Cases
– DB2 for z/OS migration testing
– Performance validation of hot backup system for distributed workloads
Business Drivers
• Reduce Risk / Risk Avoidance
  – Deliver realistic testing by using actual distributed production workloads for test where previously unable to do so
  – Identify and correct potential performance problems of application lifecycle changes before deploying to production
• Accelerate deployment time
  – Rapidly respond to business requirements
  – Create realistic tests of distributed workloads in days where previously impossible
  – Single repeatable process for testing at the data layer to complement regression, functional, and performance tests
Why IBM
• IBM differentiates from competitors as the only vendor with this capability.
Capabilities
• Capture Distributed Production Workloads
• Impact Reports to identify problems
• Shorten test cycles
Solution
• Optim Query Capture and Replay for DB2 on z/OS
Environment
Platform
• System z, EC12
Database
• DB2 for z/OS
Application
Example: Workloads Created
• Established baseline (iteration 1) and assessed impact (iteration 2)
[Diagram: workflow steps Capture, Prepare, Replay, Compare and Analyze]
V2.1 New Features Highlights
• Import and export workloads
  – Capture and replay workloads in isolated or geographically separated environments
• Filter captured SQL during replay
  – Reduce noise in comparison reports through selective capture
• Detailed capture of LOB and XML data
  – Choose between basic capture or capture of actual LOB and XML data for more accurate replay where needed
• Map collection IDs during workload transformation
  – Replay workloads in environments where packages reside in different collections