Top 10 Lessons Learned in Deploying the Oracle Exadata
Dan Norris
Oracle X Team, Oracle Development
Aaron Werman
Architect, Bank of America Merrill Lynch
Agenda
Prepare
Educate yourself
Migrate
Performance tuning
Employ features in applications
Customer: Bank of America Merrill Lynch
Prepare – System Architecture/Design
Plan for high availability and disaster recovery
Add proper test and development systems
Follow best practices from
http://www.oracle.com/goto/maa
Prepare - Application
Investigate and test target database release (currently 11g Release 2)
Gather baseline performance data, explain plans
Prepare - Stakeholders
Identify *all* of them
Brief them and *require* their input
Set realistic expectations
Adapt the enterprise support structure
Components are familiar
May find non-technical challenges
Assign most qualified engineers for support
Arrange training early in the project
Linux, RAC, ASM, Exadata, RMAN, Data Guard
Migrate
Find the proper methods
SLA may dictate some choices
Opportunity to adopt best practices
Test the results and capture timing
Performance Tuning
Use the best tools
SQL Tuning Sets, Database Replay
Identify tuning methods
Profiles
Query rewrite
Parallelism
Best practice: remove release-specific or underscore parameters during upgrades
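One way to act on the underscore-parameter advice is to scan a text parameter file before the upgrade and list every hidden (underscore-prefixed) parameter for review. The sketch below is only an illustration: it assumes a plain-text pfile with one `name=value` entry per line and an optional `*.` or `<sid>.` prefix. The function name and the sample contents are hypothetical. It does not detect release-specific parameters, which still need a manual check against the target release documentation.

```python
import re

# Matches lines such as "*._some_param=FALSE" or "orcl1.instance_number = 1".
# The optional "*." or "<sid>." prefix is the instance qualifier in a pfile.
PARAM_LINE = re.compile(
    r"^\s*(?:[\w*]+\.)?(?P<name>[\w$]+)\s*=\s*(?P<value>.*?)\s*$"
)


def find_underscore_params(pfile_text):
    """Return (name, value) pairs for parameters whose name starts with '_'."""
    hits = []
    for line in pfile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = PARAM_LINE.match(stripped)
        if match and match.group("name").startswith("_"):
            hits.append((match.group("name"), match.group("value")))
    return hits


# Hypothetical pfile fragment, for illustration only.
SAMPLE_PFILE = """
# sample init.ora fragment
*.db_block_size=8192
*._optim_peek_user_binds=FALSE
orcl1.instance_number=1
orcl1._b_tree_bitmap_plans = FALSE
"""

for name, value in find_underscore_params(SAMPLE_PFILE):
    print(f"review before upgrade: {name} = {value}")
```

Each reported parameter can then be checked for whether it is still needed on the new release.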
Lessons Learned Migrating a Major Application to Exadata v2
Aaron Werman
Bank of America
Disclaimers
• All opinions are those of the author
• No endorsements are intended: this is a technical presentation intending to help others following in a similar path
Application Use Case Overview
• Complex integration of capital markets trading data
• Hundreds of ETLs, Thousands of tables
• 10K+ ETL executions per day, many highly complex
• Near real-time SLAs
• ODS with data sharing for entire line of business
• Several web applications, each with multiple hundreds of users, doing reporting and analytic queries
• Business, not traditional BI, SLAs for availability and recovery
Evaluation Considerations
• Very uncomfortable with technology risk!
• “Zero risk POC”
– Run entire production load
• Migrate full application
• Test with full production volume for a long period (1 month)
– Scale to identified 3 year growth goals
• Run a week of load in a day
– Determine how long it takes to run a day of load in a compressed period (see how overlap affects performance)
• Reporting scaling goal: 25K reports/hour
• Recovery times: RTO goal 2 hours
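The scale targets above translate into simple rates. The following is a back-of-the-envelope sketch, not the team's actual test harness. It uses the figures quoted on the slides (10K+ ETL executions per day, a week of load in a day, 25K reports/hour), treats 10,000 as a lower bound for the daily ETL count, and assumes the compressed run fills a full 24-hour window. The function names are hypothetical.

```python
HOURS_PER_DAY = 24
SECONDS_PER_HOUR = 3600


def compressed_load(daily_etl_runs, days_replayed, window_hours=HOURS_PER_DAY):
    """Total ETL runs and required runs/hour when several days of load
    are replayed inside one window."""
    total_runs = daily_etl_runs * days_replayed
    return total_runs, total_runs / window_hours


def reports_per_second(reports_per_hour):
    """Convert an hourly reporting goal into a per-second rate."""
    return reports_per_hour / SECONDS_PER_HOUR


# "Run a week of load in a day", with at least 10K ETL executions per day.
total, per_hour = compressed_load(10_000, 7)
print(total, round(per_hour, 1))              # 70000 2916.7

# Reporting scaling goal: 25K reports/hour.
print(round(reports_per_second(25_000), 2))   # 6.94
```

Under these assumptions, the compressed test has to sustain at least about 2,900 ETL executions per hour while the reporting tier serves roughly seven reports per second.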
Alternatives Considered and Rejected
• Key Issue: mixed workload. Obvious candidates are Teradata, DB2, Netezza, Oracle for ODS and a column database for DW
• General concern: migration from Oracle entails time/cost and risk. Application is highly tuned to Oracle physical design
• Teradata concern: too small a use case for their sweet spot
• DB2 concern: migration, organizational issues, and new support issues
• Netezza: recovery model, fact/fact joins, mixed workload, tuning, LOB support
• New column database for reporting: organizational risk and current ODS I/O bottlenecks
• No other POCs were done! We chose Exadata based on migration risk avoidance
Exadata Justification for the Application
• Migrating from Oracle to another platform entails significant schedule risk
• Scaling Oracle ourselves is not justified by cost / risk / technology stack (but may be less painful in terms of corporate architecture)
• Potential loss of business capability and likely miss of critical SLAs if we do not scale adequately
• Current gaps in corporate SAN engineering to support VLDB (and 100TB applications)
Prep Work: Oracle 10g to 11g Conversion/Validation
• We created a small copy of the app in Oracle 11g and tested for functional gaps
• No issues were raised
– despite some of our stack (Informatica v8.1) not being certified for 11g
Data Migration
• Key issues are time to migrate, disk space requirements, and complexity
• We rejected RMAN Oracle 10g single instance → Exadata
– Requires migration to 11g, ASM, RAC… too many steps
• We chose to use exports:
– data pump (network) for almost everything
– export classic for large LOB tables
• Be careful about considering ASM storage in FRA as a file copy target – there are limitations; review with Oracle!
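The split between network-mode Data Pump and classic export can be sketched as a small planning helper. This is a hedged illustration only. The LOB size threshold, table names, schema name, and database link name are all hypothetical, because the slides do not say how "large" was defined. `SCHEMAS`, `NETWORK_LINK`, and `EXCLUDE` are standard Data Pump import parameters, but the exact parameter file should be validated against the Data Pump documentation for the release in use.

```python
def split_by_method(tables, lob_threshold_gb):
    """Partition (table_name, lob_size_gb) pairs into tables moved by
    network-mode Data Pump and tables moved by classic export."""
    datapump, classic = [], []
    for name, lob_gb in tables:
        if lob_gb >= lob_threshold_gb:
            classic.append(name)    # large LOB table: classic export
        else:
            datapump.append(name)   # everything else: Data Pump over a db link
    return datapump, classic


def impdp_parfile(schema, network_link, excluded_tables):
    """Build parameter-file text for a network-mode Data Pump import that
    skips the tables handled separately by classic export."""
    lines = [f"SCHEMAS={schema}", f"NETWORK_LINK={network_link}"]
    if excluded_tables:
        quoted = ", ".join(f"'{t}'" for t in excluded_tables)
        lines.append(f'EXCLUDE=TABLE:"IN ({quoted})"')
    return "\n".join(lines)


# Hypothetical inventory: (table name, LOB segment size in GB).
inventory = [("TRADES", 0.0), ("TRADE_DOCS", 850.0), ("POSITIONS", 2.5)]
dp_tables, classic_tables = split_by_method(inventory, lob_threshold_gb=100.0)
print(impdp_parfile("APP", "SRC10G_LINK", classic_tables))
```

Keeping the exclusion list generated from the same inventory avoids a table being migrated twice or not at all.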
Exadata Target State Architecture
[Diagram: Primary and DR sites, each with Informatica, Cognos, Java, IIS/.NET, and MQ/WMB components and their files. Oracle physical Data Guard (TCP/IP) links Primary to DR; files are replicated with EMC SRDF synchronous replication.]
Bugs
Some major bugs encountered:
• 9356344 High CPU utilization of orarootagent.bin process with CRS-2409
• 9338087 ASM AND DATABASE HANG - CONNECT: OSSNET: CONNECTION FAILED TO SERVER, RESULT=5
• 9324531 ORA-00600: internal error code
These are now corrected as part of the current Exadata Oracle release
Performance Tuning
• Most due to SQL optimization differences between Oracle 10g and 11g
• ~40,000 SQL statements in app
• 68 statements identified as substantially slower, of which:
37 considered non-SLA relevant and ignored
31 important SQL statements with significant SLA impact
• 26 resolved using profiles
• 3 resolved using hints
• 2 resolved by query rewrite
• optimizer_use_sql_plan_baselines?
• Note that most statements improved in performance, and improved in proportion to how much work/time they took
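The triage numbers above are internally consistent, and the flagging step can be sketched in a few lines. This is a minimal illustration, not the team's actual process. It assumes per-statement elapsed times were captured before and after migration, keyed by a statement identifier. The 2x slowdown threshold and the sample timings are hypothetical, since the slides do not define "substantially slower".

```python
def flag_regressions(baseline, candidate, slowdown_factor=2.0):
    """Return statement ids whose post-migration elapsed time is at least
    slowdown_factor times the baseline elapsed time."""
    flagged = []
    for sql_id, old_secs in baseline.items():
        new_secs = candidate.get(sql_id)
        if new_secs is None or old_secs <= 0:
            continue  # no comparable measurement for this statement
        if new_secs >= old_secs * slowdown_factor:
            flagged.append(sql_id)
    return sorted(flagged)


# Counts quoted on the slide (the total is approximate).
TOTAL_STATEMENTS = 40_000
SLOWER = 68
IGNORED, IMPORTANT = 37, 31
FIXES = {"profiles": 26, "hints": 3, "query rewrite": 2}

assert IGNORED + IMPORTANT == SLOWER
assert sum(FIXES.values()) == IMPORTANT
print(f"{SLOWER / TOTAL_STATEMENTS:.2%} of statements regressed")  # 0.17%

# Hypothetical timings in seconds, for illustration only.
before = {"q1": 1.0, "q2": 4.0, "q3": 0.5}
after = {"q1": 0.4, "q2": 9.5, "q3": 0.6}
print(flag_regressions(before, after))  # ['q2']
```

Only about 0.17% of statements regressed, which is why per-statement fixes such as profiles were practical.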
RAC Tuning
• Our DBAs, based on prior RAC strategies, initially partitioned the app to segregate load and prevent potential lock/block overhead
• Only an 8K page size was tested
• After tuning, we determined there was no gain, and all load was allowed across all nodes
– Your mileage may vary
Support Model
• Really complex to implement in our enterprise
– Disruptive technology requires change in strategy for many stakeholders, especially infrastructure support groups
• Include time in your plan to allow for the transition
• Include ALL stakeholders in your planning
• Backups:
– NAS/NFS... RMAN using a certified agent…
• Who manages the SA role?
• Monitoring (corporate standards vs. Oracle practice…)
– SNMP
Application Design Futures Based on Exadata
• Application changes, such as reducing our real-time ETL SLAs by two-thirds
• Index removal
– We will experiment and remove many “for purpose” indexes
– Incremental strategy with sufficient testing required
• ILM using Hybrid Columnar Compression
• Reducing duplication of data between operational and reporting requirements
• Likely BI (read-only reporting) against disaster recovery site using Active Data Guard
Upcoming Exadata Sessions
Tuesday, September 21
2:00 pm – 3:00 pm
Future of the Oracle Exadata: Developments in OLTP, Warehousing, Consolidation (S316825)
Moscone South, Room 306
3:30 pm – 4:30 pm
Oracle RAC on Sun Oracle Database Machine Customer Panel (S317090)
Moscone South, Room 308
5:00 pm – 6:00 pm
Enterprise-Class Online Transaction Processing (OLTP) on the Oracle Exadata (S316823)
Moscone South, Room 307
Upcoming Exadata Sessions
Wednesday, September 22
1:00 pm – 2:00 pm
Oracle Exadata Tips, Tricks, and Best Practices: Backup and Recovery (S316821)
Moscone South, Room 307
4:45 pm – 5:45 pm
Oracle Exadata Tips, Tricks, and Best Practices: Migrating to the Oracle Exadata (S316822)
Moscone South, Room 307
Upcoming Exadata Sessions
Thursday, September 23
12:00 pm – 1:00 pm
Oracle Exadata Technical Deep Dive: Architecture and Internals (S316820)
Moscone South, Room 103
3:00 pm – 4:00 pm
The X-Files: Managing the Oracle Exadata and Highly Available Oracle Databases (S316974)
Moscone South, Room 102