Percona Live Santa Clara 2014 Be the hero of the day with Data recovery for InnoDB Marco Tusa – Aleksandr Kuzminsky April 2014


DESCRIPTION

MySQL InnoDB data recovery. This presentation covers the procedure for recovering data from InnoDB when nothing else can help you.

TRANSCRIPT

Page 1: Plmce 14 be a_hero_16x9_final

Percona Live Santa Clara 2014 Be the hero of the day with Data recovery for InnoDB

Marco Tusa – Aleksandr Kuzminsky April 2014

Page 2

Who

• Marco “The Grinch”
• Manager, Rapid Response
• Former Pythian MySQL cluster technical leader
• Former MySQL AB PS (EMEA)
• Loves programming
• History of religions
• Ski; snowboard; scuba diving; mountain trekking

2

Page 3

What we will cover

• Recovery toolkit introduction

• Show how to extract data from IBD data file

• Attach ibd files after IBDATA corruption

• Recover deleted records

• Recover dropped tables

• Recover from bad updates

3

Page 4

What is the Percona Data Recovery Tool for InnoDB?

• A set of open source tools

• Works directly on the data files

• Recovers lost data (when no backup is available)

• Wrappers (you can help)

4

Page 5

What files?

• Server-wide files
  – <Table>.frm
• InnoDB files
  – ibdata1
    • InnoDB dictionary
    • UNDO segment
    • All tables if innodb_file_per_table=OFF
  – <Table>.ibd
  – Reads raw partitions

5

Page 6

InnoDB file format

• Antelope
  – REDUNDANT (versions 4.x)
  – COMPACT (5.x)
• Barracuda
  – REDUNDANT, COMPACT
  – New time formats in Barracuda
  – COMPRESSED

6

Page 7

What is an InnoDB tablespace?

A tablespace consists of pages.
• InnoDB page
  – 16k by default
  – page_id is the file offset in 16k chunks
  – Records are never fragmented
  – Types:
    • FIL_PAGE_INDEX
    • FIL_PAGE_TYPE_BLOB
    • FIL_PAGE_UNDO_LOG

7

Page 8

InnoDB index (B+ tree)

8


Page 9

Requirements and minimal skill to use the tool

• You need to know how to compile (make)

• MySQL (you know what it is, right?)

• How to import data from a tab-separated file

9

Page 10

Show the tool in action - Process

Process:

1. Extract data from ibdataX
2. Read the SYS_* tables
3. Generate the table filter files
4. Extract data from the .ibd tablespaces
5. Validate the data
6. Import the data back
7. Final clean-up
8. Production ready

Page 11

Show the tool in action - page_parser

• Extract pages from InnoDB files

– (When innodb_file_per_table=0 it also extracts the real data)

– page_parser -f ibdata1 (or table space like employees.ibd)

11

Extract data from ibdata
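As a mental model, here is a hypothetical sketch (not the real tool) of what page_parser does: walk a file in 16 KiB steps and bucket INDEX pages by the index id stored at a fixed offset in each page. Offsets assume the uncompressed default page layout:

```python
import struct
from collections import defaultdict

PAGE_SIZE = 16 * 1024
FIL_PAGE_TYPE = 24        # 2-byte page type, big-endian
PAGE_INDEX_ID = 66        # 8-byte index id inside the index page header
FIL_PAGE_INDEX = 17855    # 0x45BF

def split_by_index(data: bytes):
    """Group raw INDEX pages by index id, similar to page_parser -T."""
    pages = defaultdict(list)
    for off in range(0, len(data) - PAGE_SIZE + 1, PAGE_SIZE):
        page = data[off:off + PAGE_SIZE]
        (ptype,) = struct.unpack_from(">H", page, FIL_PAGE_TYPE)
        if ptype != FIL_PAGE_INDEX:
            continue
        (index_id,) = struct.unpack_from(">Q", page, PAGE_INDEX_ID)
        pages[index_id].append(off // PAGE_SIZE)
    return dict(pages)

def fake_page(ptype, index_id=0):
    """Build a synthetic page for demonstration only."""
    p = bytearray(PAGE_SIZE)
    struct.pack_into(">H", p, FIL_PAGE_TYPE, ptype)
    struct.pack_into(">Q", p, PAGE_INDEX_ID, index_id)
    return bytes(p)

# Two fake pages belonging to index 1975, plus one non-index page
blob = fake_page(FIL_PAGE_INDEX, 1975) + fake_page(3) + fake_page(FIL_PAGE_INDEX, 1975)
print(split_by_index(blob))    # → {1975: [0, 2]}
```

The real tool additionally verifies checksums and writes each page out under a FIL_PAGE_INDEX/<index_id> directory.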

Page 12

Show the tool in action – constraints_parser

• Extract data from InnoDB pages

• e.g. SYS_TABLES/SYS_INDEXES/SYS_COLUMNS

– bin/constraints_parser.SYS_TABLES -4Uf FIL_PAGE_INDEX/0-1

– bin/constraints_parser.SYS_INDEXES -4Uf FIL_PAGE_INDEX/0-3

12

Extract data from ibdata

Page 13

Show the tool in action - constraints_parser

Output:

SYS_TABLES "employees/salaries" 811

SYS_INDEXES 811 1889 "PRIMARY"

SYS_INDEXES 811 1890 "emp_no"

(811 is the table ID; 1889 and 1890 are index IDs)

13

Read SYS_Tables/Indexes
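The two dictionary dumps can be joined mechanically to find the PRIMARY index id for a table. A rough sketch, assuming the simplified whitespace-separated layout shown above (the real constraints_parser dumps carry extra columns):

```python
def primary_index_id(sys_tables: str, sys_indexes: str, table: str):
    """Resolve table name -> table id -> PRIMARY index id."""
    table_id = None
    for line in sys_tables.splitlines():
        parts = line.split()
        # SYS_TABLES "<db>/<table>" <table_id>
        if len(parts) >= 3 and parts[1] == '"%s"' % table:
            table_id = parts[2]
    if table_id is None:
        return None
    for line in sys_indexes.splitlines():
        parts = line.split()
        # SYS_INDEXES <table_id> <index_id> "<index_name>"
        if len(parts) >= 4 and parts[1] == table_id and parts[3] == '"PRIMARY"':
            return int(parts[2])
    return None

tables = 'SYS_TABLES "employees/salaries" 811'
indexes = ('SYS_INDEXES 811 1889 "PRIMARY"\n'
           'SYS_INDEXES 811 1890 "emp_no"')
print(primary_index_id(tables, indexes, "employees/salaries"))   # → 1889
```

The resulting index id (1889 here) is what you feed to constraints_parser as FIL_PAGE_INDEX/0-<id>.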

Page 14

Show the tool in action - sys_parser

Why:

Lost FRM file

Two possible ways:

• Easy: copy it from slave/backup

• Less easy:
  – Run sys_parser on a newly created instance (the info is not directly accessible; it requires loading the dictionary tables)

14

Lost FRM and IBD (1)

Page 15

Show the tool in action - sys_parser

Output: ./sys_parser -h 192.168.0.35 -u stress -p tool -d <database>

employees/salaries

CREATE TABLE `salariesR`(

`emp_no` INT NOT NULL,

`salary` INT NOT NULL,

`from_date` DATE NOT NULL,

`to_date` DATE NOT NULL,

PRIMARY KEY (`emp_no`, `from_date`)) ENGINE=InnoDB;

15

Lost FRM and IBD (2)

Page 16

Show the tool in action - ibdconnect

• Accidental removal of the IBDATA file

• IBDATA tablespace corruption

• Only the per-table file is available (e.g. employees.ibd)

16

Attach Table from another source (1)

Page 17

Show the tool in action - ibdconnect

What to do?

1. Start a new, clean MySQL instance

2. Create empty structure (same tables definition)

3. Copy over the table spaces

4. Run ibdconnect

5. Run innochecksum_changer

17

Attach Table from another source (2)

./ibdconnect -o ibdata1 -f salaries.ibd -d employees -t salaries

salaries.ibd belongs to space #15

Initializing table definitions...

Updating employees/salaries (table_id 797)

SYS_TABLES is updated successfully

Initializing table definitions...

Processing table: SYS_TABLES

Processing table: SYS_INDEXES

Setting SPACE=15 in SYS_INDEXES for TABLE_ID = 797

Page 18

Show the tool in action – fix filters

Table filters are used to:

• Identify the data inside the ibd

• Filter out damaged records

They are bound to the table definition:

• You must recompile for each table definition

• Generated with the create_defs.pl tool

18

Generate Table filters (1)

Page 19

Show the tool in action - page_parser filters

• Generated with the tool create_defs.pl:

create_defs.pl --db=$schema --table=$table > $myPath/include/table_defs.${schema}.${table}.defrecovery

• Create symbolic link to include/table_defs.h

• Compile again

19

Generate Table filters (2)

Page 20

Show the tool in action - constraints_parser

The data is extracted by the tool by specifying the tablespace and, optionally, a BLOB directory.

./constraints_parser -5Uf FIL_PAGE_INDEX/0-${INDEXID} -b FIL_PAGE_TYPE_BLOB/ > $DESTDIR/${SCHEMA}_${TABLE}.csv

FIL_PAGE_INDEX/0-${INDEXID} is the ID of the PK

FIL_PAGE_TYPE_BLOB is the directory containing the BLOB

21

Extract data from Table space (1)

Page 21

Show the tool in action - constraints_parser

Example of the data:

-- Page id: 4, Format: COMPACT, Records list: Valid, Expected records: (164 164)

00000000150B 88000002260084 employees 10001 "1953-09-02" "G" "eorgiF" "(null)" "12722-11-12"

00000000150B 88000002260091 employees 10002 "1964-06-02" "B" "ezalelS" "(null)" "14006-11-05"

00000000150B 8800000226009E employees 10003 "1959-12-03" "P" "artoB" "(null)" "14003-03-15"

00000000150B 880000022600AB employees 10004 "1954-05-01" "C" "hirstianK" "(null)" "12598-03-09"

00000000150B 880000022600B8 employees 10005 "1955-01-21" "K" "yoichiM" "(null)" "13876-11-14"

<snip>

00000000150B 880000022608EE employees 10164 "1956-01-19" "J" "agodaB" "(null)" "12474-11-14"

-- Page id: 4, Found records: 164, Lost records: NO, Leaf page: YES

22

Validate data
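Validation means rejecting rows like the ones above whose values cannot be real. The table filters (slide 25) encode exactly this kind of limit; a quick prototype of the same checks, with bounds mirroring the employees example (illustrative, not the toolkit's code):

```python
def row_ok(emp_no, first_name, last_name):
    """Mimic the filter limits: integer range plus ASCII-only
    names of plausible length."""
    if not (10001 <= emp_no <= 499999):
        return False
    for name in (first_name, last_name):
        if not (3 <= len(name) <= 48):
            return False
        if not name.isascii():
            return False
    return True

rows = [
    (10001, "Georgi", "Facello"),   # valid
    (7, "Georgi", "Facello"),       # emp_no out of range
    (10002, "B", "ezalelS"),        # name too short: a split record
]
good = [r for r in rows if row_ok(*r)]
print(len(good))    # → 1
```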

Page 22

Show the tool in action - LOAD DATA INFILE

How to import the data back?

Easy as:

LOAD DATA INFILE 'PLMC_employees/employees' REPLACE INTO TABLE `employees` FIELDS TERMINATED BY '\t' OPTIONALLY ENCLOSED BY '"' LINES STARTING BY 'employees\t' (`emp_no`, `birth_date`, `first_name`, `last_name`, `gender`, `hire_date`);

Done

23

Import data back

Page 23

How to recover deleted records

Identify the records, just for this exercise:

select count(emp_no) from employeesR where hire_date > '1999-08-24';

+---------------+

| count(emp_no) |

+---------------+

| 279 |

+---------------+

1 row in set (0.23 sec)

And bravely delete them:

delete from employeesR where hire_date > '1999-08-24';

Query OK, 279 rows affected (0.55 sec)

24

Delete records

Page 24

How to recover deleted records

To recover deleted records we must use the -D flag:

constraints_parser -5Df /FIL_PAGE_INDEX/0-1975 -b /FIL_PAGE_TYPE_BLOB/ > employees_employeesDeleted.csv

cat employees_employeesDeleted.csv |grep -i -v -e "-- Page id"|wc -l

55680 (too many, because the records are taken unfiltered)

25

Recover deleted records

Page 25

How to recover deleted records

name: "employeesR",

{ /* int(11) */
    name: "emp_no",
    type: FT_INT,
    fixed_length: 4,
    has_limits: TRUE,
    limits: {
        can_be_null: FALSE,
        int_min_val: 10001,
        int_max_val: 499999
    }
},
{
    name: "first_name",
    type: FT_CHAR,
    min_length: 0,
    max_length: 42,
    has_limits: TRUE,
    limits: {
        can_be_null: FALSE,
        char_min_len: 3,
        char_max_len: 42,
        char_ascii_only: TRUE
    },
    can_be_null: FALSE
},
{
    name: "last_name",
    type: FT_CHAR,
    min_length: 0,
    max_length: 48,
    has_limits: TRUE,
    limits: {
        can_be_null: FALSE,
        char_min_len: 3,
        char_max_len: 48,
        char_ascii_only: TRUE
    },
    can_be_null: FALSE
},

26

Use filters to clean up results

Page 26

How to recover deleted records

Now let us recompile and run the extraction again:

cat employees_employeesDeleted.csv | grep -i -v -e "-- Page id" | wc -l

279 <------ Bingo!

27

Check if it fits and reload

Page 27

How to recover dropped tables

• The method differs depending on innodb_file_per_table=[0|1].

  – You must act fast, because the data can be overwritten quickly

• In the first case the pages are marked free for reuse

• In the second the file is removed, and we need to scan the device

28

Page 28

How to recover dropped tables

What do we need, then?

• The table definition

• The PK index

– Parse the dictionary with the “D” flag:
  • constraints_parser.SYS_TABLES -4D

• Extract the InnoDB pages

29

Page 29

How to recover dropped tables

Method when not using innodb_file_per_table:

1. Extract the ibdataX pages as usual

2. Run constraints_parser:

constraints_parser -5Uf ./FIL_PAGE_INDEX/0-1975 -b ./FIL_PAGE_TYPE_BLOB/ > employees_employeesDroped.csv

cat employees_employeesDroped.csv | grep -i -v -e "-- Page id" | wc -l

300024 <---- done

30

Not using file per table

Page 30

How to recover dropped tables

innodb_file_per_table=1 method:

What more do we need?

• To know which device contains the dropped table

31

using file per table

Page 31

How to recover dropped tables

Identify the PK ID from the dictionary:

cat SYS_TABLE.csv | grep employeesR
SYS_TABLES "employees/employeesR" 855

cat SYS_INDEX.csv | grep 855
SYS_INDEXES 855 1979 "PRIMARY"

32

using file per table

Page 32

How to recover dropped tables

This time we must run page_parser against the device, not the file, using the -T option.

-T -- retrieves only pages with index id = NM (N - high word, M - low word of id)

page_parser -t 100G -T 0:1979 -f /dev/mapper/EXT_VG-extlv

Parsing in this case takes longer.

33

Run the page extraction

Page 33

How to recover wrong updates

Doing this with the tool is possible only when the new data is larger than the original and does not fit in the original page; otherwise the old record is overwritten in place, and the only way to recover it is to parse the UNDO segment.

Tools: s_indexes and s_tables recover the dictionary from the UNDO segment.

34

Page 34

How to recover wrong updates

Another method:

It is possible to use the binary log for this, when it is in ROW format.

35

Page 35

How to recover wrong updates

You can use the binary log to recover your data if:

• Binlog format = ROW

• binlog_row_image = FULL (from 5.6 you can change it)

36

Prerequisite

Page 36

How to recover wrong updates

Assume a table like:

+--------+------------+------------+-----------+--------+------------+

| emp_no | birth_date | first_name | last_name | gender | hire_date |

+--------+------------+------------+-----------+--------+------------+

| 10001 | 1953-09-02 | Georgi | Facello | M | 1986-06-26 |

| 10002 | 1964-06-02 | Bezalel | Simmel | F | 1985-11-21 |

| 10003 | 1959-12-03 | Parto | Bamford | M | 1986-08-28 |

+--------+------------+------------+-----------+--------+------------+

An action like:

update employeesR set last_name="WRONG-NAME" where emp_no < 10010;

37

Scenario (1)

Page 37

How to recover wrong updates

You will have to recover something like:

+--------+------------+------------+------------+--------+------------+

| emp_no | birth_date | first_name | last_name | gender | hire_date |

+--------+------------+------------+------------+--------+------------+

| 10001 | 1953-09-02 | Georgi | WRONG-NAME | M | 1986-06-26 |

| 10002 | 1964-06-02 | Bezalel | WRONG-NAME | F | 1985-11-21 |

| 10003 | 1959-12-03 | Parto | WRONG-NAME | M | 1986-08-28 |

+--------+------------+------------+------------+--------+------------+

38

Scenario (2)

Page 38

How to recover wrong updates

With a simple command like:

mysqlbinlog -vvv logs/binlog.000034 --start-datetime="2014-03-19 11:00:07" | grep -e "@1" -e "@4" | awk -F '/*' '{print $1}' | awk '{print $2}'

@1=10001

@4='Facello'

@1=10001

@4='WRONG-NAME'

@1=10002

@4='Simmel'

@1=10002

@4='WRONG-NAME'..

39

Recover from binary log (2)
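The old values extracted this way still have to be turned into statements. A hypothetical sketch (not part of the toolkit) that pairs each before-image with its key and emits restoring UPDATEs, assuming the stream layout shown above: @1 = emp_no, @4 = last_name, before-image first, then after-image:

```python
import re

def restore_statements(binlog_text):
    """Pair each before-image (old @4 value) with its key (@1) and
    emit an UPDATE that puts the old value back."""
    values = re.findall(r"@([14])=(\S+)", binlog_text)
    stmts = []
    i = 0
    while i + 3 < len(values):
        (_, emp_no), (_, old_name) = values[i], values[i + 1]
        i += 4  # skip the after-image pair (@1/@4 with the wrong value)
        stmts.append(
            "UPDATE employeesR SET last_name=%s WHERE emp_no=%s;"
            % (old_name, emp_no)
        )
    return stmts

sample = """@1=10001
@4='Facello'
@1=10001
@4='WRONG-NAME'
@1=10002
@4='Simmel'
@1=10002
@4='WRONG-NAME'"""
for s in restore_statements(sample):
    print(s)
```

On the sample above this yields one UPDATE per damaged row, restoring 'Facello' and 'Simmel'. Always review the generated statements before running them against production.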

Page 39

References and repositories

Main Percona branch: bzr branch lp:percona-data-recovery-tool-for-innodb

Marco branch:

https://github.com/Tusamarco/drtools

40


Page 40

Q&A

41

Page 41

Contacts

42

To contact Marco

[email protected]

[email protected]

To follow me

http://www.tusacentral.net/

https://www.facebook.com/marco.tusa.94

@marcotusa

http://it.linkedin.com/in/marcotusa/

To contact Aleksandr

[email protected]

To follow me

http://www.mysqlperformanceblog.com/author/akuzminsky/

https://www.linkedin.com/in/akuzminsky