creating a benchmarking infrastructure that just works

55
1 Creating a Benchmark Infrastructure That Just Works Tim Callaghan VP/Engineering, Tokutek email: [email protected] twitter: @tmcallaghan April 24, 2013

Upload: tim-callaghan

Post on 01-Nov-2014

872 views

Category:

Technology


0 download

DESCRIPTION

This presentation captures the last 20 years I've spent measuring and optimizing the systems I've created. Presented at Percona Live 2013.

TRANSCRIPT

Page 1: Creating a Benchmarking Infrastructure That Just Works

1

Creating a Benchmark Infrastructure

That Just Works

Tim CallaghanVP/Engineering, Tokutekemail: [email protected]: @tmcallaghan

April 24, 2013

Page 2: Creating a Benchmarking Infrastructure That Just Works

2

What is this all about?

• Benchmark “infrastructure” is never done.

• It is a constantly evolving journey.–20+ years for me!

• Hopefully you’ll learn a few new things today.

• Raise your hand.• Share what you learn with

others.

Page 3: Creating a Benchmarking Infrastructure That Just Works

3

Who am I?

• I’m not Mark.• He’s that guy at Facebook.• Sorry if you expected Mark.

Mark

Me

Page 4: Creating a Benchmarking Infrastructure That Just Works

4

Who am I?

“Mark Callaghan’s lesser-known but nonetheless smart brother.”

[C. Monash, May 2010]

www.dbms2.com/2010/05/25/voltdb-finally-launches

Page 5: Creating a Benchmarking Infrastructure That Just Works

5

Who am I? Seriously.

My Background• Internal development (Oracle), 92-99 • SaaS development (Oracle), 99-09 • Field engineering (VoltDB), 09-11 • VP Engineering (Tokutek), 11-present

Meaning• development, administration, management,

infrastructure, testing, product management, support

Page 6: Creating a Benchmarking Infrastructure That Just Works

6

What is benchmarking?

Page 7: Creating a Benchmarking Infrastructure That Just Works

7

What is benchmarking?

Wikipedia“Benchmarking is used to measure performance using a specific indicator ([transactions per second, transactional latency, etc]) resulting in a metric of performance that is then compared to others.”

My Interpretation“Run something while taking a lot of measurements, make changes, rerun and compare.”

Page 8: Creating a Benchmarking Infrastructure That Just Works

8

Generic Benchmarking Process

• Prepare for benchmark– Hardware, operating system, software

• Run Benchmark– Record data as you go

• Save Results– Flat files, database tables

• Cleanup– Leave things as you found them

• Analyze results– Look for positive and negative differences

• Change something– 1 thing at a time if possible

• Repeat– A lot

Page 9: Creating a Benchmarking Infrastructure That Just Works

9

My first benchmark

Commodore-64 (1985)• Programming graphics demos• Needed to make sprites “bounce”

• Timed round-trips using my stopwatch• Using COS() worked, but was slow• Found that a lookup table was much faster–COS(0)=1, COS(1)=.9998, …

• A benchmarker was born!

Page 10: Creating a Benchmarking Infrastructure That Just Works

10

Now I’m an Addict (28 years later)

• TokuDB v5.2 – 01/2012– End of release (code-freeze)– 3 benchmarks– 1 platform (Centos5), Disk only– By hand: start mysql, run benchmark, graph results

• TokuDB v6.6 – 12/2012– Each major development effort– 9 benchmarks– 2 platforms (Centos5 and Centos6), Disk and SSD– Automated: single script produces full comparison

• Soon–Nightly– 9+ benchmarks

oLinkbench and many other new ideas

Page 11: Creating a Benchmarking Infrastructure That Just Works

11

Important benchmarking skills

• Infrastructure–Hardware–Operating Systems

• Software Development–Benchmark applications are applications–Bash scripting

• Communications–Displaying results, finding trends–Blogging

• You are only as good as your worst skill–They all matter

Page 12: Creating a Benchmarking Infrastructure That Just Works

12

Why is frequent benchmarking important?

• One improvement can mask one drop

Page 13: Creating a Benchmarking Infrastructure That Just Works

13

Why is frequent benchmarking important?

• Find issues while there is still time to deal with them–Avoid creating a “fix last release’s performance

regression” ticket

Page 14: Creating a Benchmarking Infrastructure That Just Works

14

Why is frequent benchmarking important?

• Developers <3 data– Satisfaction of immediate feedback for efforts– (or) find out things are worse

Page 15: Creating a Benchmarking Infrastructure That Just Works

15

Level : Beginner

Page 16: Creating a Benchmarking Infrastructure That Just Works

16

Basic configuration

• Start with a “known good” environment– Learn from what others have shared about server and MySQL

configuration, many online resources exist– Vadim Tkachenko @ Percona,

http://www.mysqlperformanceblog.com/author/vadim/– Dimitri Kravtchuk @ Oracle, http://dimitrik.free.fr/blog/– Too many to list them all.– I always create an extra file system where benchmarking

occursoTreat it as if it will not be there tomorrowoEnables rebuilding/changing file systems without losing

important work

Page 17: Creating a Benchmarking Infrastructure That Just Works

17

Benchmarking your equipment

• Quick tests to ensure configuration is correct

• If possible, compare to existing equipment

• Sysbench CPU– sysbench --test=cpu --cpu-max-prime=20000 run

• Sysbench memory– sysbench --test=memory --memory-block-size=1K --memory-scope=global --memory-total-size=100G --memory-

oper=read run

– sysbench --test=memory --memory-block-size=1K --memory-scope=global --memory-total-size=100G --memory-oper=write run

• Sysbench IO– sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --file-num=1 prepare

– sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --file-extra-flags=direct --file-num=1 --max-requests=50000 run

– sysbench --test=fileio --file-total-size=10G --file-test-mode=rndrw --file-num=1 cleanup

• Sysbench OLTP– sysbench –test=oltp -oltp_tables_count=16 --oltp-table-size=500000 prepare

– sysbench –test=oltp -oltp_tables_count=16 --oltp-table-size=500000 prepare

Page 18: Creating a Benchmarking Infrastructure That Just Works

18

What should I measure?

• Start with throughput and latency– Throughput = the number of operations per specific time

interval, often referred to as transactions per second (TPS)– Latency = the amount of time required to satisfy a single

transaction

• Use Intervals– Benchmarks generally run for several minutes or hours,

throughput and latency should be captured in smaller windows (minutes or seconds).

– Interval data provides the information needed to compute cumulative averages, min, max…

Page 19: Creating a Benchmarking Infrastructure That Just Works

19

Where do you go with questions?

• Ask the benchmark owner

– sysbench = Percona (launchpad)– Iibench = Mark Callaghan (launchpad)– tpc-c = Percona (launchchpad), many other

implementations exist• www.stackoverflow.com / www.serverfault.com

Page 20: Creating a Benchmarking Infrastructure That Just Works

20

The power of 2 (but buy 3 if you can)

• Historical results are critical– (but sometimes you just need to start over)

oEquipment dies or becomes irrelevant

Page 21: Creating a Benchmarking Infrastructure That Just Works

21

The power of 3 (for a while)

• Run 3 times until you regularly produce consistent results

Page 22: Creating a Benchmarking Infrastructure That Just Works

22

Visualization, for now

• Use Excel or LibreOffice

Page 23: Creating a Benchmarking Infrastructure That Just Works

23

Hacks

• Low tech solutions are fine– Franken-switch – partition detection benchmarking

Find hardware

Buy itInstall it

Configure itRun

benchmarks

Vs.

Page 24: Creating a Benchmarking Infrastructure That Just Works

24

Ready-to-go MySQL

• Evolutionary process – I’ve probably done each > 1000 times– Phase 1 : before each benchmark manually create MySQL and start

o untar MySQLo copy my.cnf from known good directoryo bin/mysql_install_dbo bin/mysqld_safe

– Phase 2 : for each new build, unpack and repack MySQL (ready-to-go)o in empty folder : copy my.cnf, fresh-db, and MySQL tar

o ./fresh-db (untars MySQL, copies my.cnf, runs install, starts/stops mysql)o rm –rf mysql-test sql-bencho tar xzvf $ARCHIVE_DIR/blank-toku610-mysql-5.5.28.tar.gz .

– Phase 3 : for each new build, unpack and repack MySQL (ready-to-go)o in empty folder : copy MySQL taro gf blank-toku610-mysql-5.5.28.tar.gz (bash script that does everything else)

– Phase 4 : at some pointo in empty folder : copy MySQL taro gf2 (bash script that does everything else, including detecting version of TokuDB and

MySQL/MariaDB)

Page 25: Creating a Benchmarking Infrastructure That Just Works

25

Useful technologies

• rsync– Synchronize your files with benchmark servers

• nfs– (If you can) make all of your files available to your benchmark servers

without needing to manually rsync

• dropbox – Easy way to synchronize your files with EC2 benchmark servers– (thanks for Amrith Kumar @ ParElastic for this tip)

Page 26: Creating a Benchmarking Infrastructure That Just Works

26

Exit throughput?

• Allow your benchmark to stabilize

• Average final <x> transactions or <y> minutes

Page 27: Creating a Benchmarking Infrastructure That Just Works

27

Trust no one

• If possible, dedicated benchmarking equipment

• Prior to starting, check for clean environment (users, processes)– users– ps aux | grep mysqld– ls /tmp/*.sock

• Periodically check for the duration of benchmark run

• Leave it as you found it– Shutdown MySQL– Kill any stray monitoring scripts– Delete all MySQL files

Page 28: Creating a Benchmarking Infrastructure That Just Works

28

Level : Intermediate

Page 29: Creating a Benchmarking Infrastructure That Just Works

29

Environment variables

• I find it convenient for my benchmarking scripts to pass information using environment variables– Command line arguments can be used as well

• Create a commonly named set of environment variables for all machines.

• Load it on login (add to .bashrc)– source ~/machine.config

export machine_name=mork

export tokudb_cache=24G

export innodb_cache=24G

export storage_type=disk

export benchmark_directory=/data/benchmark

Page 30: Creating a Benchmarking Infrastructure That Just Works

30

Scripting 1

• Lets do some bash, simple Sysbench runner

#! /bin/bash

num_tables=16

sysbench –test=oltp num_tables=${num_tables} > results.txt

tar ${benchmark_results_dir}/${machine_name}.tar results.txt

Page 31: Creating a Benchmarking Infrastructure That Just Works

31

Scripting 2

• Make sure an environment variable is defined

#! /bin/bash

if [ -z "$machine_name" ]; then

echo "Need to set machine_name"

exit 1

fi

num_tables=16

sysbench –test=oltp num_tables=${num_tables} > results.txt

tar ${benchmark_results_dir}/${machine_name}.tar results.txt

Page 32: Creating a Benchmarking Infrastructure That Just Works

32

Scripting 3

• Make sure a directory exists

#! /bin/bash

if [ ! -d "$benchmark_results_dir" ]; then

echo "Need to create directory $benchmark_results_dir"

exit 1

fi

num_tables=16

sysbench –test=oltp num_tables=${num_tables} > results.txt

tar ${benchmark_results_dir}/${machine_name}.tar results.txt

Page 33: Creating a Benchmarking Infrastructure That Just Works

33

Scripting 4

• Make sure a directory is empty

#! /bin/bash

if [ "$(ls -A $benchmark_results_dir)" ]; then

echo "$benchmark_results_dir has files, cannot run script”

exit 1

fi

num_tables=16

sysbench –test=oltp num_tables=${num_tables} > results.txt

tar ${benchmark_results_dir}/${machine_name}.tar results.txt

Page 34: Creating a Benchmarking Infrastructure That Just Works

34

Scripting 5

• Default values

#! /bin/bash

if [ -z "$num_tables" ]; then

export num_tables=16

fi

sysbench –test=oltp num_tables=${num_tables} > results.txt

tar ${benchmark_results_dir}/${machine_name}.tar results.txt

Page 35: Creating a Benchmarking Infrastructure That Just Works

35

Scripting 6 – scripts calling scripts

controller.bash#! /bin/bash

if [ -z "$mysql" ]; then

export mysql=blank-toku701-mysql-5.5.30

fi

if [ -z "$num_tables" ]; then

export num_tables=16

fi

for numThreads in 1 2 4 8 16 32 64; do

export clientCount=${numThreads}

./runner.bash

done

runner.bash#! /bin/bash

if [ -z "$mysql" ]; then

echo "Need to set mysql”; exit 1

fi

if [ -z "$num_tables" ]; then

export num_tables=16

fi

if [ -z ”$clientCount" ]; then

export clientCount=64

fi

mkdb ${mysql} #unpack and start ready-to-go-mysql

sysbench –test=oltp num_tables=${num_tables} –num_threads=${clientCount} > results.txt

tar ${benchmark_results_dir}/${machine_name}-${clientCount}.tar results.txt

mstop #shutdown mysql and cleanup

Page 36: Creating a Benchmarking Infrastructure That Just Works

36

Scripting 7 – scripts calling scripts calling scripts

multibench.bash

#! /bin/bash

for mysqlType in mysql mariadb ; do

export mysql=blank-toku701-${mysqlType}-5.5.30

sysbench-directory/controller.bash

iibench-directory/controller.bash

for numWarehouses in 100 1000 ; do

export num_warehouses=${numWarehouses}

tpcc-directory/controller.bash

done

done

Page 37: Creating a Benchmarking Infrastructure That Just Works

37

Add more measurements

• IO (iostat)

• CPU/Memory (ps)

• MySQL engine status– “show engine <storage-engine-name> status;”

• MySQL global status– “select * from information_schema.global_status;”

• You are only measuring too much when it affects the benchmark

• You never know when it will be helpful

• De-dupe–Many counters are the same at each interval, save space by

only capturing values that changed since the last measurement

Page 38: Creating a Benchmarking Infrastructure That Just Works

38

Optimize everything not benchmarked

• Spend as little time as possible on surrounding steps

• Example : load Sysbench data, 16 tables, 50 million rows per table– trickle load

oclient application generating insert statementso5+ hours to load

– bulk loadoafter trickle loading, mysqldump each table to TSVo“load data infile …”o70 minutes to load

– preloaded databaseosave MySQL data folder after bulk loadingo“cp –r” prior to running benchmarko10 minutes to load

Page 39: Creating a Benchmarking Infrastructure That Just Works

39

Anomalies

• Graph results, review graphs regularly– Huge dip in the middle of the benchmark– Exit throughput was unaffected

Page 40: Creating a Benchmarking Infrastructure That Just Works

40

Don’t automate too early

• You’ll end up automating the wrong things

• You’ll over-engineer your automation

• Never automate “until it hurts”

Page 41: Creating a Benchmarking Infrastructure That Just Works

41

Put results into a database

• Create a simple schema for historical results

• benchmark_header– bench_id– name (sysbench, tpcc, iibench)– duration (# seconds | # rows)– thruput (use exit throughput)– attr1 (# sysbench rows | # tpc-c warehouses | # iibench rows)– attr2 (# sysbench tables)– clients (concurrency)– machine_name (server)

• benchmark_detail– bench_id– duration (# seconds | # rows)– thruput (interval)– latency (interval)– throughput_avg (cumulative)

• create script to read benchmark logs and load database

• run immediately as benchmark ends

Page 42: Creating a Benchmarking Infrastructure That Just Works

42

Pre-compute interesting numbers

• Pre-compute interesting numbers

• SQL to compute running average– extremely slow on benchmarks with 10,000+ data points– add calculation of running average to script that reads/loads results

select dtl1.duration as duration,

dtl1.thruput as interval_tps,

avg(dtl2.thruput) as avg_tps

from benchmark_detail dtl1,

benchmark_detail dtl2

where dtl1.bench_id = 1 and

dtl2.bench_id = dtl1.bench_id and

dtl2.duration <= dtl1.duration

group by dtl1.duration, dtl1.thruput;

Page 43: Creating a Benchmarking Infrastructure That Just Works

43

Visualization - gnuplot

• I’ve been graphing benchmark results in Excel for years

• Last year someone asked me why don’t just learn gnuplot

• OMG!

• If you don't use it, start using it, today.

set terminal pngcairo size 800,600

set xlabel "Inserted Rows”

set ylabel "Inserts/Second”

set title "iiBench Performance - Old vs. New"

set output ”benchmark.png”

plot "./old.txt" using 1:6 with lines ti "Old", "./new.txt" using 1:6 with lines ti "New"

Page 44: Creating a Benchmarking Infrastructure That Just Works

44

Idle machines could be benchmarking

• Obvious benchmarks– increase cache size– buffered vs. direct IO– RAID levels– File systems– rotating disk vs. ssd vs. flash– InnoDB : search the web for ideas – TokuDB : compression type, basement node size

• Not so obvious– decrease cache size– alternative memory allocators

Page 45: Creating a Benchmarking Infrastructure That Just Works

45

Level : Advanced

Page 46: Creating a Benchmarking Infrastructure That Just Works

46

Level : Advanced

Everything that follows is on my

wish list...

Page 47: Creating a Benchmarking Infrastructure That Just Works

47

Additional data capture

• Find a data capture tool that gathers more server metrics– collectd

• Graph and review the data–make it actionable

• Another gnuplot moment?

Page 48: Creating a Benchmarking Infrastructure That Just Works

48

Continuous Integration

• Implement CI server (Jenkins?)

–Nightly build–Distribute and execute benchmarks across group of

servers–Graph results–Compare to prior day–Alert if "threshold" change from previous day, or over

previous period

Page 49: Creating a Benchmarking Infrastructure That Just Works

49

Self-service

MySQL Version

TokuDB Version

Storage Engine

Benchmark

Num Tables

Num Rows

Compare To

Page 50: Creating a Benchmarking Infrastructure That Just Works

50

Automated anomaly detection

• Problem is easy to see, not always easy to detect

Page 51: Creating a Benchmarking Infrastructure That Just Works

51

PSA

"Many benchmarks are like magic tricks. When you know how the results were achieved, you are no longer impressed.”

twitter.com/GregRahn/status/263771033583104000

Show the whole picture

Page 52: Creating a Benchmarking Infrastructure That Just Works

52

Wrapping it up

Dive In.

Benchmark Everything.

Learn and Share.

Page 53: Creating a Benchmarking Infrastructure That Just Works

53

Please rate this session.

Feedback is important!

Page 54: Creating a Benchmarking Infrastructure That Just Works

54

Tokutek is hiring!

Position: QA++

TestingSupport

BenchmarkingRelease Engineering

Page 55: Creating a Benchmarking Infrastructure That Just Works

55

Questions?

Please rate this session.

Tim [email protected] (email)@tmcallaghan (twitter)

www.tokutek.com/tokuview (blog)