Dealing with schema changes on large data volumes
Danil Zburivsky, MySQL DBA, Team Lead


Page 1:

Dealing with schema changes on large data volumes

Danil Zburivsky

MySQL DBA, Team Lead

Page 2:

© 2010/2011 Pythian

Why Companies Trust Pythian

• Recognized Leader:
  • Global industry leader in remote database administration services and consulting for Oracle, Oracle Applications, MySQL and SQL Server
  • Works with over 150 multinational companies such as Forbes.com, Fox Interactive Media, and MDS Inc. to help manage their complex IT deployments
• Expertise:
  • One of the world’s largest concentrations of dedicated, full-time DBA expertise
• Global Reach & Scalability:
  • 24/7/365 global remote support for DBA and consulting, systems administration, special projects or emergency response

Page 3:

Agenda

• Why are schema changes painful on large data volumes?
• Using a standby for schema migrations
• The “shadow” tables approach

Page 4:

Schema changes are slow

• Most schema changes on InnoDB tables require a full table rebuild:
  • Add index
  • Add new column
  • Drop column
  • Rename column
• The table is locked during the rebuild
• This becomes a real problem once a table is 20G in size
• The InnoDB plugin brings some improvements (such as fast index creation), but they don’t solve the problem in general

Page 5:

Using standby for schema changes

On the Standby:

• SET SQL_LOG_BIN = 0; (keeps the ALTERs out of the binary log so they do not replicate back)
• Apply schema changes on the standby
• Fail the application over to the standby

(Setup: master-master replication between Primary and Standby.)

Page 6:

Using standby for schema changes

• Works fine for adding indexes
• Not as good for adding new columns
• Doesn’t work for renaming or dropping columns: replication will break

Page 7:

“Shadow” table approach

• Create a new empty table with a similar structure
• Apply schema changes to the new table
• Copy data from the original table to the new table
• Synchronize ongoing writes using triggers
• Swap the tables
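As a minimal sketch of these five steps, here is a runnable walkthrough using Python's built-in sqlite3 module as a stand-in for MySQL (the deck's environment is MySQL/InnoDB, where you would use CREATE TABLE ... LIKE, REPLACE INTO, CONCAT() and RENAME TABLE instead; table and column names follow the deck's later example):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Original table with some existing data.
cur.execute("CREATE TABLE t_original (id INTEGER PRIMARY KEY, A TEXT, B TEXT)")
cur.executemany("INSERT INTO t_original VALUES (?, ?, ?)",
                [(i, f"a{i}", f"b{i}") for i in range(1, 101)])

# 1. Create a new empty table with a similar structure.
cur.execute("CREATE TABLE t_new (id INTEGER PRIMARY KEY, A TEXT, B TEXT)")

# 2. Apply the schema change to the shadow table only.
cur.execute("ALTER TABLE t_new ADD COLUMN AB TEXT")

# 3. Synchronize new writes with a trigger (insert trigger shown;
#    update/delete triggers work the same way).
cur.execute("""
    CREATE TRIGGER t_original_ai AFTER INSERT ON t_original BEGIN
        INSERT OR REPLACE INTO t_new (id, A, B, AB)
        VALUES (NEW.id, NEW.A, NEW.B, NEW.A || ',' || NEW.B);
    END""")

# 4. Backfill existing rows; IGNORE skips ids the trigger already copied.
cur.execute("""
    INSERT OR IGNORE INTO t_new (id, A, B, AB)
    SELECT id, A, B, A || ',' || B FROM t_original""")

# A write arriving after the trigger is in place is mirrored automatically.
cur.execute("INSERT INTO t_original VALUES (101, 'a101', 'b101')")

# 5. Swap the tables (MySQL: RENAME TABLE), then drop the sync trigger.
cur.execute("ALTER TABLE t_original RENAME TO t_old")
cur.execute("ALTER TABLE t_new RENAME TO t_original")
cur.execute("DROP TRIGGER t_original_ai")

print(cur.execute("SELECT COUNT(*), MAX(id) FROM t_original").fetchone())
# -> (101, 101)
```

In MySQL the swap is done with RENAME TABLE (which can rename several tables in one atomic statement), and on a large table the backfill is done in batches, as the later slides show.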

Page 8:

Use case

• System with about 500G of InnoDB data
• Major application release affecting about 30 tables
• The largest table is 20G in size
• Database changes included: new columns, renamed columns, deleted columns, new indexes and complex data transformations

•Estimated time for applying all changes directly was 7 hours

• Using “shadow” tables, the database changes were applied in about 1 hour

Page 9:

The process

CREATE TABLE t_new LIKE t_original;

ALTER TABLE t_new ADD COLUMN AB VARCHAR(100);

For reference, t_original is defined as:

CREATE TABLE `t_original` (
  `id` int(11) NOT NULL,
  `A` varchar(50) DEFAULT NULL,
  `B` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Page 10:

Triggers to keep data in sync

LOCK TABLE t_original WRITE;

CREATE TRIGGER t_original_ai AFTER INSERT ON t_original
FOR EACH ROW
  REPLACE INTO t_new (id, A, B, AB)
  VALUES (NEW.id, NEW.A, NEW.B, CONCAT(NEW.A, ',', NEW.B));

Page 11:

Triggers to keep data in sync

CREATE TRIGGER t_original_ad AFTER DELETE ON t_original
FOR EACH ROW
  DELETE FROM t_new WHERE id = OLD.id;

CREATE TRIGGER t_original_au AFTER UPDATE ON t_original
FOR EACH ROW
  UPDATE t_new SET id = NEW.id,
                   A = NEW.A,
                   B = NEW.B,
                   AB = CONCAT(NEW.A, ',', NEW.B)
  WHERE id = OLD.id;

UNLOCK TABLES;
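The three triggers can be exercised end to end; this sketch uses Python's sqlite3 as a stand-in for MySQL (SQLite trigger syntax, `||` instead of CONCAT, INSERT OR REPLACE instead of REPLACE INTO; trigger names and columns are the deck's):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t_original (id INTEGER PRIMARY KEY, A TEXT, B TEXT);
CREATE TABLE t_new (id INTEGER PRIMARY KEY, A TEXT, B TEXT, AB TEXT);

-- AFTER INSERT: upsert the row into the shadow table.
CREATE TRIGGER t_original_ai AFTER INSERT ON t_original BEGIN
    INSERT OR REPLACE INTO t_new (id, A, B, AB)
    VALUES (NEW.id, NEW.A, NEW.B, NEW.A || ',' || NEW.B);
END;

-- AFTER UPDATE: mirror the change and recompute the derived column.
CREATE TRIGGER t_original_au AFTER UPDATE ON t_original BEGIN
    UPDATE t_new SET id = NEW.id, A = NEW.A, B = NEW.B,
                     AB = NEW.A || ',' || NEW.B
    WHERE id = OLD.id;
END;

-- AFTER DELETE: remove the mirrored row.
CREATE TRIGGER t_original_ad AFTER DELETE ON t_original BEGIN
    DELETE FROM t_new WHERE id = OLD.id;
END;
""")

con.execute("INSERT INTO t_original VALUES (1, 'x', 'y')")
con.execute("UPDATE t_original SET B = 'z' WHERE id = 1")
con.execute("INSERT INTO t_original VALUES (2, 'p', 'q')")
con.execute("DELETE FROM t_original WHERE id = 2")

print(con.execute("SELECT id, AB FROM t_new").fetchall())  # -> [(1, 'x,z')]
```

SQLite has no LOCK TABLE; in the deck's MySQL setup the LOCK/UNLOCK pair around trigger creation is what prevents writes from slipping in before the triggers exist.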

Page 12:

Copy data

(Diagram: rows are copied in batches from t_original to t_new, walking the primary key from MIN(id) to MAX(id).)

INSERT IGNORE INTO t_new(....) SELECT ... FROM t_original WHERE id>=? LIMIT N

Page 13:

Copy data. Sample code

my $lastId    = $minid;
my $rv        = 2;    # primed so the loop below runs at least once
my $totalrows = 0;
my $sql = <<'SQL';
INSERT IGNORE INTO t_new (id, A, B, AB)
SELECT id, A, B, CONCAT(A, ',', B)
FROM t_original
WHERE id >= ? ORDER BY id LIMIT 5000
SQL

my $sth1 = $dbh->prepare($sql);
while ($rv > 1) {
    $dbh->do('START TRANSACTION');
    $sth1->execute($lastId);

Page 14:

Copy data. Sample code (continued)

    my $sth = $dbh->prepare(
        "SELECT id FROM t_original WHERE id >= ? ORDER BY id LIMIT 5000");
    $sth->execute($lastId);
    $rv = $sth->rows;

    while ((my $nextId) = $sth->fetchrow_array()) {
        $lastId = $nextId;
        $totalrows = $totalrows + 1;
    }
    $dbh->do('COMMIT');
    print "Rows inserted: $totalrows. Next id = $lastId\n";
}
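The deck's Perl loop translates to this runnable Python sketch of the same keyset-paginated, batched copy (sqlite3 stand-in for MySQL; batch size 5000 as on the slides, with 12,000 synthetic rows):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t_original (id INTEGER PRIMARY KEY, A TEXT, B TEXT)")
con.execute("CREATE TABLE t_new (id INTEGER PRIMARY KEY, A TEXT, B TEXT, AB TEXT)")
con.executemany("INSERT INTO t_original VALUES (?, ?, ?)",
                [(i, f"a{i}", f"b{i}") for i in range(1, 12001)])

BATCH = 5000
last_id = con.execute("SELECT MIN(id) FROM t_original").fetchone()[0]
while True:
    # Copy one batch; IGNORE makes re-copying the boundary row harmless.
    con.execute("""
        INSERT OR IGNORE INTO t_new (id, A, B, AB)
        SELECT id, A, B, A || ',' || B FROM t_original
        WHERE id >= ? ORDER BY id LIMIT ?""", (last_id, BATCH))
    # Find where this batch ended so the next pass resumes there.
    ids = con.execute("""
        SELECT id FROM t_original
        WHERE id >= ? ORDER BY id LIMIT ?""", (last_id, BATCH)).fetchall()
    con.commit()
    if len(ids) <= 1:   # only the boundary row left: copy is complete
        break
    last_id = ids[-1][0]

print(con.execute("SELECT COUNT(*) FROM t_new").fetchone()[0])  # -> 12000
```

Because each batch starts at the last id seen (`>=`, not `>`), the boundary row is copied twice and INSERT IGNORE absorbs the duplicate, the same trick the Perl code relies on. Committing per batch keeps transactions small on a 20G table.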

Page 15:

Basic checks

SELECT COUNT(*) FROM t_new
UNION
SELECT COUNT(*) FROM t_original;

SELECT MAX(id), MIN(id) FROM t_new
UNION
SELECT MAX(id), MIN(id) FROM t_original;

Since UNION removes duplicate rows, each query returns a single row only when both tables agree.
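A quick runnable illustration of why the UNION queries work as checks (sqlite3 stand-in; 50 synthetic rows): UNION de-duplicates, so one row back means both tables returned the same value.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t_original (id INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE t_new (id INTEGER PRIMARY KEY, AB TEXT)")
con.executemany("INSERT INTO t_original (id) VALUES (?)",
                [(i,) for i in range(1, 51)])
con.executemany("INSERT INTO t_new (id) VALUES (?)",
                [(i,) for i in range(1, 51)])

# One row back: the counts match.
counts = con.execute("SELECT COUNT(*) FROM t_new "
                     "UNION SELECT COUNT(*) FROM t_original").fetchall()
print(counts)  # -> [(50,)]

# One row back: the id ranges match too.
ranges = con.execute("SELECT MAX(id), MIN(id) FROM t_new "
                     "UNION SELECT MAX(id), MIN(id) FROM t_original").fetchall()
print(ranges)  # -> [(50, 1)]
```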

Page 16:

During the release

RENAME TABLE t_original TO t_old;
RENAME TABLE t_new TO t_original;
DROP TRIGGER t_original_ai;
DROP TRIGGER t_original_au;
DROP TRIGGER t_original_ad;

Page 17:

Limitations

• Requires a unique key
• No existing triggers on the table
• Needs enough disk space for the “shadow” tables
• Foreign keys need special handling

Page 18:

Foreign key issue

If there is an existing FK referencing t_original:

FOREIGN KEY (`fkId`) REFERENCES `t_original` (`id`)

it will be changed after the RENAME to:

FOREIGN KEY (`fkId`) REFERENCES `t_old` (`id`)

The solution is a hack: with foreign key checks disabled, drop the old table and rename the new table forward and back, so the child FK definitions follow the renames back to t_original:

SET FOREIGN_KEY_CHECKS=0;
DROP TABLE t_old;
RENAME TABLE t_original TO t_old;
RENAME TABLE t_old TO t_original;

Page 20:

Q & A

Thank you!

• [email protected]
• http://www.pythian.com/news/author/zburivsky/
• Twitter: @zburivsky
