lessons learned from my experience during the adoption, use and growth of mysql at aol kevin pettit...

56

Upload: nayef

Post on 25-Feb-2016

32 views

Category:

Documents


1 download

DESCRIPTION

Lessons learned from my experience during the adoption, use and growth of MySQL at AOL Kevin Pettit Principle DBA Content Application Operations . MySQL Origins for AOL Summer 2003. - PowerPoint PPT Presentation

TRANSCRIPT

MySQL @ Aol.

1

Lessons learned from my experience during the adoption, use and growth of MySQL at AOL

Kevin Pettit

Principle DBA

Content Application Operations

2

MySQL Origins for AOLSummer 2003

Internet Business Solutions, Partner Hosting, Acquisitions and the adoption of open source products gave rise to a new kid in our Database world: MySQL

3

MySQL First Projects

2003

2004 2005 2006

4

MySQL First Steps & Tools

Company was moving to Linux from HP-UX & SUN.

Met with other DBAs on my team and created/documented install standards.

First instances were MySQL versions 3.23 and 4.0 on a 32 bit OS. 2 GB File system size limitations.

First scripts/tools were nearly all shell scripts. Simple scripts for Backups, Log file management, and Replication monitoring.

5

MySQL First Observations

MySQL seemed to lack utilities for lots of basic functions/needs

Backups were just ascii text files of DDL & Insert statements.

Alter tables (column, index) locked/blocked table!

Few Tuning or Performance Analysis tools.

MySQL slow & query (general) logs were intriguing.

6

David Axmark & Monty Widenius Jay Pipes

Brian Aker Kaj Arno

7

Netscape ISP MySQL Architecture

8

8

MySQL First Projects

2003

2004 2005 2006

9

10

11

11

12

GoldRush DB Architecture

Click to edit Master text styles

Second level

Third level

Fourth level

Fifth level

12

First 1000 MySQL Instances

13

MySQL Growth @ AOL

14

MySQL First Tools

First tool was a simple korn shell script (mydbmgr.ksh)for backup/log file management:

Renamed existing log files and ran a flush logs command to rotate log files (Binary, update, error)

Archived (tar) the Inactive log files (Binary, update, error)

Remove old backup & archived log files

For Each database:

Backup Each database to a separate file

Log Each database backup to a table

Compress each database Backup file

15

MySQL Query Rates for Netscape ISP

16

16

17

MySQL First Lessons

Scaling Reads is easy with MySQL Replication.

Be conservative about adopting major versions and with your MySQL upgrade schedule.

MyISAM tables did not scale well for high concurrency workloads or when we had a high rate of queries creating temp tables. (Table Level Locking)

Too many read slaves on master

Each Slave added load to master (reading binary log). This could impact Master performance.

18

MySQL First Lessons

InnoDB Buffer Pool size & innodb_log_file size are most important configuration variables for tuning InnoDB performance.

Tracking replication information with performance stats provides valuable clues for forensic sleuthing of What caused that.

Use MySQL General log helpful to improve understanding application interactions with the database.

Use MySQL Explain & Slow Query log to tune queries & identify index needs.

19

Query Analysis

Using the MySQL general log, an open source utility (mysqlreport http://hackmysql.com/mysqlreport), and a simple perl script (mysqlqa.pl), I created what I called a Query Analysis:

Steps to do a Query Analysis:

Gather 1-n Hours of general log data.

Use mysqlreport to generate a plain SQL text file of queries from a MySQL general log.

Use mysqlqa.pl perl script to read the queries and create a generic version of the query. EXPLAIN is ran against the first occurrence of each new query.

20

20

Query Analysis

21

22

2004-2005: Leveraged general log, mysqlreport & custom perl script.

2007 -Today: Leverage Slow Query Log (with long_query_time=0;) & Percona Toolkits pt-query-digest

mk-query-digest/pt-query-digest & Slow Query Log (with long_query_time=0) uses actual execution timings/internal counters so performance stats are much more accurate!

Check top 10-25 most expensive, frequent queries for optimization opportunities (Better indexes & alternate query design).

Query Analysis Evolution

23

MySQL Needs (Early Years)

With so many MySQL projects, need for displaying the replication tree became very important for NOC & On-Call DBA

24

MySQL Growth & Matrix Project

Goals for Operations Infrastructure Project (Matrix)

Able to rapidly deploy new projects and services

Able to quickly scale up and down the hardware capacity to meet the production traffic requirements

Able to support leading web 2.0 development platforms and technologies

Able to rapidly provision development and staging environments for product development teams

Able to manage costs and improve efficiencies

25

Project Matrix Solution

A server farm, installed and pre-configured with all the leading web 2.0 platforms

LAMP Platform - Linux, Apache, MySQL, PHP

Standard Blogging Platform Wordpress

Standard Content Mgmt Platform Drupal

Built in redundancy and scalability

220 servers deployed in 2 Data Centers Dulles and Manassas

Configured with Netscaler VIPs

Rapid provisioning and scalability of projects at multiple tiers

26

26

Project Matrix Solution

Database servers are setup with replication to enable quick scalability at multiple tiers

23 Project Slots with 5 levels of capacity

Tier 1 64 Read slaves

Tier 2 32 Read Slaves

Tier 3 16 Read Slaves

Tier 4 8 Read Slaves

Tier 5 perl mk-query-digest hostA.slowlog.20120120000001

# 43.5s user time, 150ms system time, 31.64M rss, 99.14M vsz

# Current date: Tues Jan 24 9:44:39 2012

# Files: hostA.slowlog.20120120000001

# Overall: 159.72k total, 10 unique, 1.85 QPS, 47.94x concurrency ________

# total min max avg 95% stddev median

# Exec time 4142444s 2s 253s 26s 113s 36s 11s

# Lock time 315s 0 31s 2ms 0 177ms 0

# Rows sent 9.86M 0 100 64.75 97.36 44.78 97.36

# Rows exam 59.37G 24 1.31M 389.78k 1.26M 472.79k 165.97k

# Time range 2012-01-19 00:01:40 to 2012-01-20 00:01:43

# bytes 79.67M 161 3.47k 523.03 621.67 187.51 592.07

# Profile

# Rank Query ID Response time Calls R/Call Item

# ==== ================== ================== ===== ======== ==============

# 1 0xD3498568EB9790DF 2852265.0000 68.9% 33743 84.5291 SELECT view_counts_tbl

# 2 0x98FFDF7DF9AB29F6 460973.0000 11.1% 38630 11.9330 SELECT aggregate_counts_tbl

# 3 0x023560745F86665A 198111.0000 4.8% 15372 12.8878 SELECT aggregate_counts_tbl

# 4 0x8ED5FEB4CEB2FFCE 193445.0000 4.7% 15098 12.8126 SELECT aggregate_counts_tbl

# 5 0x1C59BAE3E36C0AF4 156430.0000 3.8% 19835 7.8866 SELECT aggregate_counts_tbl

# 6 0x2CA0DFA413086FA2 135065.0000 3.3% 14397 9.3815 SELECT aggregate_counts_tbl

# MISC 0xMISC 146155.0000 3.5% 22641 6.4553

38

Tuning Lessons Details

# Query 1: 0.39 QPS, 33.02x concurrency, ID 0xD3498568EB9790DF at byte 78970271

# pct total min max avg 95% stddev median

# Count 21 33743

# Exec time 68 2852265s 4s 253s 85s 151s 43s 80s

# Lock time 77 243s 0 31s 7ms 0 372ms 0

# Rows sent 16 1.64M 3 100 51.11 97.36 47.23 2.90

# Rows exam 72 43.26G 1.31M 1.31M 1.31M 1.26M 0 1.26M

# Users 1 user_r

# Hosts 1 localhost

# Databases 1 statsdb

# Time range 2012-01-19 00:01:41 to 2012-01-20 00:01:31

# bytes 6 5.21M 161 163 161.99 158.58 0 158.58

# Query_time distribution

# 1ms

# 10ms

# 100ms

# 1s ###

# 10s+ ################################################################

# Tables

# SHOW TABLE STATUS FROM `statsdb` LIKE 'view_counts_tbl'\G

# SHOW CREATE TABLE `statsdb`.`view_counts_tbl`\G

# EXPLAIN

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies'AND Type = 'movie'AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

39

Tuning Lessons Details

CREATE TABLE `view_counts_tbl` (

`AggregateViewID` int(10) unsigned NOT NULL AUTO_INCREMENT,

`AssetURI` varchar(255) NOT NULL,

`Application` varchar(30) NOT NULL DEFAULT '',

`Category` varchar(25) DEFAULT '',

`Type` varchar(25) DEFAULT '',

`App1` varchar(64) DEFAULT '',

`App2` varchar(64) DEFAULT '',

`App3` varchar(64) DEFAULT '',

`UpdateDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

`ViewCountsDay` int(10) unsigned DEFAULT '0',

`ViewCountsWeek` int(10) unsigned DEFAULT '0',

`ViewCountsMonth` int(10) unsigned DEFAULT '0',

`ViewCountsAll` int(10) unsigned DEFAULT '0',

PRIMARY KEY (`AggregateViewID`),

UNIQUE KEY `index` (`AssetURI`,`Application`),

KEY `view_idx1` (`ViewCountsDay`),

) ENGINE=InnoDB DEFAULT CHARSET=utf8;

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies AND Type = 'movie AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

40

Tuning Lessons Details

Before Adding Index:

mysql> EXPLAIN select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: view_counts_tbl

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 1391165

Extra: Using where; Using filesort

------------------------ Slow Query Log Example Run ----------------------------

# Time: 120124 9:42:04

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 12.35962 Lock_time: 0.000065 Rows_sent: 100 Rows_examined: 1701886

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

41

Tuning Lessons Details

After Adding Index:

alter table view_counts_tbl add KEY apptype(Application,Type,App1);

mysql> EXPLAIN select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: view_counts_tbl

type: ref

possible_keys: apptype

key: apptype

key_len: 365

ref: const,const,const

rows: 484390

Extra: Using where; Using filesort

------------------------ Slow Query Log Example Run ----------------------------

# Time: 120124 9:44:47

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 1.969384 Lock_time: 0.000074 Rows_sent: 100 Rows_examined: 650624

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

42

Tuning Lessons Details

After Adding Covering Index:

alter table view_counts_tbl add KEY apptype(Application,Type,App1,ViewCountsDay);

mysql> EXPLAIN select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: view_counts_tbl

type: ref

possible_keys: apptype

key: apptype

key_len: 365

ref: const,const,const

rows: 275600

Extra: Using where

------------------------ Slow Query Log Example Run ----------------------------

# Time: 120124 9:55:37

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 1.123939 Lock_time: 0.000064 Rows_sent: 100 Rows_examined: 325312

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

43

Tuning Lessons Details

After Removing SQL_CALC_FOUND_ROWS from the query:

mysql> EXPLAIN select * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: view_counts_tbl

type: ref

possible_keys: apptype

key: apptype

key_len: 365

ref: const,const,const

rows: 275600

Extra: Using where

------------------------ Slow Query Log Example Run ----------------------------

# Time: 120124 12:55:37

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 0.001075 Lock_time: 0.000056 Rows_sent: 100 Rows_examined: 100

select * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

44

Tuning Lessons Details

----------------------- Slow Query Log Example Run ----------------------------

# Time: 120124 12:55:37

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 0.001075 Lock_time: 0.000056 Rows_sent: 100 Rows_examined: 100

select * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

# User@Host: user[user] @ [xxx.xxx.xxx.xxx]

# Query_time: 0.815276 Lock_time: 0.000069 Rows_sent: 100 Rows_examined: 325312

select SQL_CALC_FOUND_ROWS * FROM view_counts_tbl WHERE Application = 'movies' AND Type = 'movie' AND App1 = 'movie' ORDER BY ViewCountsDay DESC LIMIT 0,100;

After running the query, Run the command to get the output of SQL_CALC_FOUND_ROWS:

mysql> SELECT FOUND_ROWS();

+--------------+

| FOUND_ROWS() |

+--------------+

| 325312 |

+--------------+

45

Tuning Lessons Details

46

Tuning Lessons Details

47

Tuning Lessons Details

48

Current Tools/Tech

MySQL 5.0, MySQL 5.1, MySQL 5.5,Percona Server 5.1

Maatkit utilities.

MySQL LVM Backups, Percona XtraBackup Hot Backups, Zmanda Backups.

Nagios & custom scripts for monitoring.

Replication Topology Tool: Integration with Netscaler VIP data.

49

Current Tools/Tech

50

51

52

Lessons Learned Summary

General:

Collect MySQL Statistics & Binlog/Replication data.

Monitor/Review for significant changes to statistics

Use statistics to monitor & notify you of growth concerns.

Leverage MySQL Slow query Log & Percona Toolkit utilities to identify problem queries.

Large Scale:

Automate Installation process.

Automate Routine needs.

53

Future Plans

Move to Percona Toolkit.

MySQL as an AOL Cloud service.

Explore/Evaluate:

Evaluate add-on Galera Replication & Percona XtraDB Cluster.

Automated Stats/Analysis Monitoring.

Create wrapper utilities to leverage Xtrabackups backups for automatic setup of new MySQL slaves.

Integrate/combine utility scripts to automate scale up & scale down on VMs.

54

Questions?

55

56