Monitoring MySQL with OpenTSDB
Percona Live 2013
Geoffrey Anderson, Box Inc.
@geodbz
Who
Geoffrey Anderson
• Database Operations Engineer @ Box, Inc.
• a.k.a. DBA
• Tooling for MySQL and HBase
• #DBHangOps
The Situation
Then You Get More Servers
Enter OpenTSDB
OpenTSDB is...
• Distributed
• Scalable
• Time Series Database
• Runs on HBase
• Created by Benoit Sigoure
[Architecture diagram. Labels: HBase; TSD for Storing; TSD for Querying; HAProxy; mydb.example.com; fe1.example.com; Push Metrics; Query via API]
Why OpenTSDB?
• FAST
• EASY to Scale
• EASY to Populate
• EASY to collect data
• EASY to Query
Collecting Data
#!/usr/bin/env bash
timestamp=$(date +%s)
mysql -ss -e "SHOW GLOBAL STATUS" | while read var val
do
  echo "mysql.$var $timestamp $val host=$HOSTNAME"
done

[email protected]:~$ ./mysql_collector.sh
mysql.Aborted_connects 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_cache_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_disk_use 1366399993 0 host=mydb.example.com
mysql.Binlog_stmt_cache_use 1366399993 0 host=mydb.example.com
mysql.Bytes_received 1366399993 19453687 host=mydb.example.com
mysql.Bytes_sent 1366399993 1238166682 host=mydb.example.com
mysql.Com_admin_commands 1366399993 1 host=mydb.example.com
mysql.Com_assign_to_keycache 1366399993 0 host=mydb.example.com
...
Example: mysql_collector.sh
Line format: metric name, timestamp, value, “tags” (key=val)
* * * * * mysql_collector.sh | nc opentsdb.example.com 4242
Example: adding a cron for OpenTSDB
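When lines are sent straight to a TSD's line-based port, as in the cron entry above, each data point generally has to be a put command with at least one tag (tcollector adds the put prefix itself). The following is a minimal sketch of that step, not the speaker's code: the function names are hypothetical, and the host and port are simply the ones from the cron example.

```python
import socket


def to_put_command(line):
    """Convert 'metric timestamp value tag=val ...' into a TSD 'put' command."""
    metric, timestamp, value, *tags = line.split()
    if not tags:
        raise ValueError("at least one tag is required: %r" % line)
    int(timestamp)  # raises ValueError if the timestamp is not an integer
    float(value)    # raises ValueError for non-numeric values such as ON/OFF
    return "put %s %s %s %s\n" % (metric, timestamp, value, " ".join(tags))


def send_points(lines, host="opentsdb.example.com", port=4242):
    """Send collector output lines to a TSD; returns how many were sent.

    Host and port are taken from the cron example and are assumptions here.
    """
    commands = []
    for line in lines:
        try:
            commands.append(to_put_command(line))
        except ValueError:
            continue  # skip lines the TSD would reject anyway
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(commands).encode("ascii"))
    return len(commands)
```

Skipping non-numeric rows matters because SHOW GLOBAL STATUS also returns string values (for example ON/OFF flags), which are not valid data point values.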
[email protected]:tcollector$ tree
.
|-- collectors
|   |-- 0
|   |   |-- ifstat.py
|   |   |-- iostat.py
|   |   |-- procnettcp.py
|   |   |-- procstats.py
|   |-- 15
|   |   `-- dfstat.py
|   |-- 30
|   |   |-- mysql_collector.sh
|   |-- 300
|   |   `-- ptTcpModel.sh
|   `-- etc
|       |-- config.py
|-- config
|-- startstop
`-- tcollector.py

• collectors/0: Run forever
• collectors/15: Run every 15 seconds
• collectors/30: Run every 30 seconds
• collectors/300: Run every 5 minutes
Querying Data
http://opentsdb.example.com/#start=2013/04/10-07:32:29&end=2013/04/10-07:57:57&m=sum:proc.stat.cpu.percentage_idle{host=db22}&o=axis x1y1&m=sum:db.threads_running{host=db22}&o=axis x1y2&ylabel=CPU idle&y2label=Threads Running&yrange=[0:]&wxh=1475x600&png
http://opentsdb.example.com/q?start=2013/04/10-07:32:29&end=2013/04/10-07:57:57&m=sum:proc.stat.cpu.percentage_idle{host=db22}&o=axis x1y1&m=sum:db.threads_running{host=db22}&o=axis x1y2&ylabel=CPU idle&y2label=Threads Running&yrange=[0:]&ascii
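The second URL above asks the /q endpoint for plain-text output via the ascii flag. A small sketch of building such a URL and parsing the response follows, assuming each response line has the form "metric timestamp value tag=val ..."; build_query_url and parse_ascii are hypothetical helper names, and the graph-only options (axis, labels, size) are left out.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base, start, end, metric_queries):
    """Build a /q URL that requests ASCII output for one or more 'm' queries."""
    params = [("start", start), ("end", end)]
    params += [("m", m) for m in metric_queries]
    return "%s/q?%s&ascii" % (base.rstrip("/"), urlencode(params))


def parse_ascii(text):
    """Parse 'metric timestamp value tag=val ...' lines into tuples."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # ignore blank or malformed lines
        metric, timestamp, value = fields[0], int(fields[1]), float(fields[2])
        tags = dict(f.split("=", 1) for f in fields[3:])
        points.append((metric, timestamp, value, tags))
    return points


def fetch_points(url):
    """Fetch and parse one query (needs network access to a TSD)."""
    with urlopen(url, timeout=10) as response:
        return parse_ascii(response.read().decode("utf-8"))
```

For example, build_query_url("http://opentsdb.example.com", "2013/04/10-07:32:29", "2013/04/10-07:57:57", ["sum:db.threads_running{host=db22}"]) produces a percent-encoded equivalent of the query shown on the slide.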
Leveraging OpenTSDB For MySQL
user_statistics monitoring
table_statistics monitoring
Table Info from I_S
SELECT *, DATA_LENGTH+INDEX_LENGTH AS TOTAL_LENGTH FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA NOT IN ('PERFORMANCE_SCHEMA','INFORMATION_SCHEMA')
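Rows from this query can be turned into per-table data points by moving the schema and table names into tags. The sketch below assumes rows arrive as dictionaries keyed by column name; the metric names (mysql.table.*) and tag keys (schema, table) are illustrative, not the names used at Box. OpenTSDB restricts which characters may appear in tag values, so other characters are replaced with underscores.

```python
import re

# Characters outside this set are replaced before use as tag values.
_UNSAFE = re.compile(r"[^a-zA-Z0-9\-_./]")


def sanitize(value):
    """Replace characters that are not safe in metric names or tag values."""
    return _UNSAFE.sub("_", str(value))


def table_points(rows, timestamp, host):
    """Turn INFORMATION_SCHEMA.TABLES rows into OpenTSDB data point lines."""
    lines = []
    for row in rows:
        tags = "host=%s schema=%s table=%s" % (
            sanitize(host),
            sanitize(row["TABLE_SCHEMA"]),
            sanitize(row["TABLE_NAME"]),
        )
        # The length columns can be NULL (for example for views); treat as 0.
        data = row["DATA_LENGTH"] or 0
        index = row["INDEX_LENGTH"] or 0
        for name, value in (("data_length", data),
                            ("index_length", index),
                            ("total_length", data + index)):
            lines.append("mysql.table.%s %d %d %s" % (name, timestamp, value, tags))
    return lines
```

Keeping the table name in a tag rather than in the metric name is what allows one query to aggregate across all tables or filter down to one.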
Query Throughput
And other “common” metrics
• Various MySQL status counters
  • QPS (questions)
  • Threads connected
  • Temporary tables on disk
  • Etc.
• Various server statistics
  • %CPU Idle
  • Free disk space
  • I/O utilization
  • Network traffic
  • Etc.
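Counters such as Questions only ever increase, so a figure like QPS is the rate of change between samples. OpenTSDB can compute this at query time with the rate option (for example sum:rate:mysql.Questions). The sketch below shows the same arithmetic done by hand on raw (timestamp, value) samples; per_second_rates is a hypothetical helper, and dropping negative deltas is one simple way to handle a counter reset after a server restart.

```python
def per_second_rates(samples):
    """Convert cumulative counter samples into per-second rates.

    samples is a list of (timestamp, value) pairs in ascending time order.
    Intervals where the counter went backwards are skipped.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or v1 < v0:
            continue  # duplicate timestamp or counter reset
        rates.append((t1, (v1 - v0) / elapsed))
    return rates
```

With samples taken every 30 seconds, an increase of 600 questions between two samples corresponds to 20 queries per second.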
Future collectors
• pt-query-digest/mysqlslow query statistics
• Data from “show engine innodb status” (that is missing from counters)
• PERFORMANCE_SCHEMA (MySQL 5.6+)
  • Query statistics
  • Processlist information
  • Background thread information
How does this change things?
In all seriousness, though...
• Easily see aggregate graphs
• Easily build graphs on-the-fly
• Full granularity forever
• API request for raw data
• Cluster-wide nagios checks with check_tsd
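To illustrate the idea behind a cluster-wide check, the sketch below compares raw values from an aggregate query against warning and critical thresholds and maps the result to the conventional Nagios exit codes. It is not check_tsd's implementation; check_values and its arguments are hypothetical, and it assumes higher values are worse.

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_values(values, warning, critical):
    """Return (exit_code, message) for a list of recent data point values."""
    if not values:
        return UNKNOWN, "UNKNOWN: no data points returned"
    worst = max(values)
    if worst >= critical:
        return CRITICAL, "CRITICAL: max value %s >= %s" % (worst, critical)
    if worst >= warning:
        return WARNING, "WARNING: max value %s >= %s" % (worst, warning)
    return OK, "OK: max value %s" % worst
```

A wrapper script would print the message and exit with the returned code, for example via sys.exit(code).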
Challenges Switching
• Aggregates are the default
• Mouse-zooming (patched!)
• Auto-suggest for metrics
• “The graphs aren’t pretty”
• Migrating from proof of concept
  • Plan for 3+ machines
  • Data pruning may be required
Some Quick Numbers: OpenTSDB @ Box
• 21,294 metrics
• 72 tag keys
• 5,145,745 tag values
• 90% of interactive graphs return in <300ms
Next Steps
Enjoy #PerconaLive 2013
We’re hiring!
https://www.box.com/about-us/careers/
[email protected]