The two little bugs that almost
brought down Booking.com
Jean-François Gagné (System Engineer)
jeanfrancois DOT gagne AT booking.com
April 25, 2017 – Percona Live Santa Clara 2017
For a, b and c relatively small:
Consequence(a + b + c) is much bigger than
Consequence(a) + Consequence(b) + Consequence(c)
MySQL/MariaDB replication at Booking.com
● Typical Booking.com MySQL/MariaDB replication deployment:
                         +---+
                         | M |
                         +---+
                           |
  +------+-- ... --+---------------+-------- ...
  |      |         |               |
+---+  +---+     +---+           +---+
| S1|  | S2|     | Sn|           | M1|
+---+  +---+     +---+           +---+
                                   |
                              +-- ... --+
                              |         |
                            +---+     +---+
                            | T1|     | Tm|
                            +---+     +---+
Impacted setup (simplified)
                            +---+
                            | M |
                            +---+
                              |
         +---------- .... ----------+--------------+
         |                          |              |
       +---+                      +---+          +---+
       | M1|                      | Mi|          | Mj|
       +---+                      +---+          +---+
         |                          |              |
  +-----+- .. -+-----+             +- .. -+     +-----+-- .. -+-----+
  |     |      |     |             |      |     |     |       |     |
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
| S1| | S2|  | S.| | Sn|         | T.|  | Tm| | U1| | U2|   | U.| | Uo|
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
Upgrade from 5.5 to new major version
                            +---+
                            | M |
                            +---+
                              |
         +---------- .... ----------+--------------+
         |                          |              |
       +---+                      +---+          +---+
       | M1|                      | Mi|          | Mj|
       +---+                      +---+          +---+
         |                          |              |
  +-----+- .. -+-----+             +- .. -+     +-----+-- .. -+-----+
  |     |      |     |             |      |     |     |       |     |
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
| S1| | S2|  | S.| | Sn|         | T.|  | Tm| | U1| | U2|   | U.| | Uo|
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
Bad transaction on the master
                            +---+
                            | M | <<-- “bad transaction”
                            +---+
                              |
         +---------- .... ----------+--------------+
         |                          |              |
       +---+                      +---+          +---+
       | M1|                      | Mi|          | Mj|
       +---+                      +---+          +---+
         |                          |              |
  +-----+- .. -+-----+             +- .. -+     +-----+-- .. -+-----+
  |     |      |     |             |      |     |     |       |     |
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
| S1| | S2|  | S.| | Sn|         | T.|  | Tm| | U1| | U2|   | U.| | Uo|
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
Oops! Mi runs out of memory (OOM) and is killed
                            +---+
                            | M | <<-- “bad transaction”
                            +---+
                              |
         +---------- .... ----------+--------------+
         |                          |              |
       +---+                      +\-/+          +---+
       | M1|                      | Mi|          | Mj|
       +---+                      +/-\+          +---+
         |                          |              |
  +-----+- .. -+-----+             +- .. -+     +-----+-- .. -+-----+
  |     |      |     |             |      |     |     |       |     |
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
| S1| | S2|  | S.| | Sn|         | T.|  | Tm| | U1| | U2|   | U.| | Uo|
+---+ +---+  +---+ +---+         +---+  +---+ +---+ +---+   +---+ +---+
Oops again! All “blue” servers run OOM and are killed
                            +---+
                            | M | <<-- “bad transaction”
                            +---+
                              |
         +---------- .... ----------+--------------+
         |                          |              |
       +---+                      +\-/+          +---+
       | M1|                      | Mi|          | Mj|
       +---+                      +/-\+          +---+
         |                          |              |
  +-----+- .. -+-----+             +- .. -+     +-----+-- .. -+-----+
  |     |      |     |             |      |     |     |       |     |
+\-/+ +---+  +\-/+ +---+         +---+  +---+ +---+ +\-/+   +---+ +\-/+
| S1| | S2|  | S.| | Sn|         | T.|  | Tm| | U1| | U2|   | U.| | Uo|
+/-\+ +---+  +/-\+ +---+         +---+  +---+ +---+ +/-\+   +---+ +/-\+
What is the “bad” transaction?
● DELETE FROM TABLE WHERE …lot of rows…;
● Transaction of ~2 GB in the binary logs (RBR)
● Obviously a bug in the application
(but it should not have triggered an OOM)
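The talk does not say how the application bug was fixed. As a hedged illustration only, one common way to keep row-based replication events small is to delete in bounded batches, each committed as its own transaction. The sketch below uses Python's built-in sqlite3 as a stand-in so that it runs anywhere; the table name `demo_rows`, the `obsolete` column and the batch size are hypothetical, and on MySQL/MariaDB the same idea is usually written as a repeated `DELETE ... LIMIT n`.

```python
import sqlite3


def delete_in_batches(conn, batch_size):
    """Delete rows flagged obsolete in small transactions.

    Returns the number of non-empty batches that were committed.
    """
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM demo_rows WHERE id IN ("
            "SELECT id FROM demo_rows WHERE obsolete = 1 LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # each batch is its own small transaction
        if cur.rowcount == 0:
            return batches
        batches += 1


# Hypothetical demo data: 1000 rows, half of them flagged obsolete.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_rows (id INTEGER PRIMARY KEY, obsolete INTEGER)")
conn.executemany(
    "INSERT INTO demo_rows VALUES (?, ?)", [(i, i % 2) for i in range(1000)]
)
conn.commit()

batches = delete_in_batches(conn, 100)
remaining = conn.execute("SELECT COUNT(*) FROM demo_rows").fetchone()[0]
print(batches, remaining)
```

With this shape, no single transaction carries more than `batch_size` row images, at the cost of the delete no longer being atomic as a whole.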
What needs to be done next?
● Reminder: 5.5 is not replication crash safe
● Next version is crash safe, but can’t…
● Crashed slaves either OOM again or are corrupted
● We need to re-clone all crashed slaves!
What saved us?
● Engaged team of skilled DBAs: all joined to help
● Data not too sensitive to replication delay
● Data not too sensitive to “skipping transactions”
● pt-slave-restart
● IDEMPOTENT mode
● A torrenting cloning tool
What could have helped us?
● A “working” torrenting cloning tool…
  ● Not used often enough, so we did not know it was broken
    (fixed in less than 2 hours)
● An AUTO-FIX/AUTO-REPAIR mode (RBR)
  ● Instead of skipping transactions (and making data diverge),
    it should repair (fix) slave drift (and make data converge)
  https://bugs.mysql.com/bug.php?id=54250
  http://blog.wl0.org/2016/05/the-differences-between-idempotent-and-my-suggested-auto-repair-mode/
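The difference between skipping and repairing can be sketched with a toy model. This is not MySQL code: a table is a plain dict, a row event is a hypothetical `(kind, key, after_image)` tuple, and `AUTO_REPAIR` is only the behaviour suggested in the feature request above, not an existing server mode. The model covers only the missing-row and duplicate-key cases.

```python
class ReplicationError(Exception):
    """Raised when a row event cannot be applied in STRICT mode."""


def apply_event(table, event, mode):
    """Apply one row event to `table` (dict: primary key -> row value).

    mode is 'STRICT', 'IDEMPOTENT', or the hypothetical 'AUTO_REPAIR'.
    """
    kind, key, after = event
    present = key in table
    if kind == "insert":
        if present and mode == "STRICT":
            raise ReplicationError(f"duplicate key {key}")
        table[key] = after  # non-strict modes overwrite the existing row
    elif kind == "update":
        if present or mode == "AUTO_REPAIR":
            table[key] = after  # AUTO_REPAIR recreates the missing row
        elif mode == "STRICT":
            raise ReplicationError(f"row {key} not found")
        # IDEMPOTENT: the event is silently skipped, the drift remains
    elif kind == "delete":
        if present:
            del table[key]
        elif mode == "STRICT":
            raise ReplicationError(f"row {key} not found")


# The master updates row 1; both slaves had already lost that row (drift).
master = {1: "new value"}
events = [("update", 1, "new value")]

idempotent_slave = {}
repaired_slave = {}
for ev in events:
    apply_event(idempotent_slave, ev, "IDEMPOTENT")
    apply_event(repaired_slave, ev, "AUTO_REPAIR")

print(idempotent_slave == master, repaired_slave == master)
```

In this model the IDEMPOTENT slave keeps running but stays diverged from the master, while the AUTO_REPAIR slave converges because the after image of the update is written even though the row was missing.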
Want to know more…
● We are hiring!
  ● MySQL Engineer / DBA
  ● System Administrator
  ● System Engineer
  ● Site Reliability Engineer
  ● Developer / Designer
  ● Technical Team Lead
  ● Product Owner
  ● Data Scientist
  ● And many more…
● https://workingatbooking.com/
Thanks
Jean-François Gagné
jeanfrancois DOT gagne AT booking.com