faster than alter – less downtime chris schneider
TRANSCRIPT
Overview
Some problems
Alter table
Load Data INFILE
Fragmentation
De-Fragmentation
Some tools to consider
Use of slave servers
Real world application
- Schema change
- Defragmentation
Some Problems
DDL can be a pain in a working system
Need to add or remove a column
Need to add or remove a clustered or secondary key
Running into performance degradation
- You’ve tried tuning the my.cnf but still need better performance
Alter Table
Alter table is very useful but can be unmanageable with large data sets
Everyone in here should about alter table!!!!!!!
Classic INNODB
- ALTERs will run a table rebuild
- Innodb Table space will grow
Innodb Plugin (Bonus)
- Table space will be rebuilt in place WHEN:
-- Dropping a secondary index
-- Adding a secondary index
Load Data Infile
Load Data Infile is very fast and pretty easy to work with
Most people in here should know about LOAD DATA INFILE!!!
You should also know about SELECT INTO OUTFILE
Also the tab delimited mysqldump
Working with TSV and CSV files
- Make sure when you export you do it in PK order
Fragmentation
Happens over time
Write activity creates holes and degrades performance
You did not load your data set initially in PK order
Can happen by accident
- Update table set column1 = 2 where column1 = 1;
Defragmentation
You will see performance gains from a dump out and reload
- 30% - 80% increase in overall throughput
For MyISAM and INNODB setups you will regain disk space
Upper management will be pleased with the sudden increase in database performance
Your customers will also be pleased, we HOPE!
Depending on your server count and setup you might only have to do this twice a year or less
Defragmentation (Cont)
Tools to Consider
mk-parallel-dump
- Dump tables in parallel
mk-parallel-restore
- Load tables in parallel
mk-fifo-split
- Split files and pipe lines to a fifo without really splitting
Select into outfile
LOAD data infile
MySQL Tab delimited dump
Using Slave Servers
Slave servers can be used to run alters or run an dump and reload of data
Some Considerations
- Do you have a slave?
- Do you have a good failover mechanism
- Semi-Automated
- Full-Automation
The General Idea:
1. ON the slave, Stop slave, Select INTO OUTFILE, LOAD DATA INFILE
2. Fail over from the master to the slave
3. Repeat step 1 on OLD MASTER
Real World Applications
Example 1: NEW COLUMN and INDEX needed
I have a system with 10, two node clusters, one master and one slave in each cluster
Each cluster has 10 databases on it with an average dataset of 70GB
The master servers cannot go down or be locked or an extended period of time. In this case I was given a 10 minute window (off hours) for a maintenance window
Server Specs
Dell 2850
16GB RAM
Dual Quad – Intel 2.50GHz
8 x 300GB 10K SATA RAID 10
Real World Applications (CONT)
Example 2: Defragmentation
Did you know that it takes 2 years to paint the Golden Gate bridge?
I have a system with thousands of servers that have been running a 60/40 (READ/WRITE) ratio for years
Each system is INNODB and has, on average, 170GB of data and index
Server Specs
HP DL380 G5
32GB RAM
Dual Quad – Intel Xeon L5420 @ 2.50GHz
8x146GB 15K SAS
Two RAID sets
- 2 disk mirror for OS
- 6 disk RAID 10 for DATA