Looking Into The Black-Box: how the kernel may impact your application

Thomas Nau
kiz, University of Ulm

OSDevCon, Berlin, March 1st 2007
© 2007, Thomas Nau, kiz, University of Ulm


A little scare before we get started ...


Know your enemy and know yourself; in a hundred battles, you will never be defeated.
(Sun Tzu, about 2500 years ago)

Know your systems and your tools; for a hundred problems, you will always know what to do.
(adapted for today's IT-centric world)


The “Old” Way

● truss(1), apptrace(1) and mdb(1)
  – not dynamic; problems with timing or transient errors
  – probing affects the target
    ● truss stops the process, gathers data and then continues the process
  – hard to combine data from different processes, such as a dtlogin session
  – hard to cross the userland-kernel boundary with one tool


The “Old” Way

● intrstat(1m)
  – gathers and displays run-time interrupt statistics per device and processor or processor set

obi-wan# intrstat 5

      device |      cpu0 %tim      cpu1 %tim      cpu2 %tim      cpu3 %tim
-------------+------------------------------------------------------------
    e1000g#0 |         0  0.0      3899  7.6         0  0.0         0  0.0
    e1000g#1 |      4067  8.1         0  0.0         0  0.0         0  0.0
      uhci#0 |         1  0.0         0  0.0         0  0.0         0  0.0
      uhci#1 |         1  0.0         0  0.0         0  0.0         0  0.0
^C

Each cpuN column shows the number of interrupts; each %tim column shows the percentage of absolute time spent in the IRQ handler.


The “Old” Way

● busstat(1m)
  – reports bus-related performance statistics
  – hardware support required (output depends on system)

yedi# busstat -w dram0,pic0=bank_busy_stalls,pic1=mem_read_write 2

time dev    event0            pic0       event1          pic1
2    dram0  bank_busy_stalls  193307539  mem_read_write  71400764
4    dram0  bank_busy_stalls  192952912  mem_read_write  71338904
6    dram0  bank_busy_stalls  194118606  mem_read_write  71866659
8    dram0  bank_busy_stalls  194180462  mem_read_write  71822108
10   dram0  bank_busy_stalls  129238477  mem_read_write  49908669
12   dram0  bank_busy_stalls  1029510    mem_read_write  641610
14   dram0  bank_busy_stalls  10229      mem_read_write  11562


The “Old” Way

● cputrack(1m), cpustat(1m)
  – monitor system and/or application performance using CPU hardware counters

yedi# cputrack -v -t -c pic0=DTLB_miss,pic1=Instr_cnt -p 24450

   time lwp      event     %tick      pic0      pic1
  0.008   1   init_lwp         0         0         0
  1.014   1       tick  21508400      2583   5222045
  2.014   1       tick   3477900       624    595049
  3.014   1       tick     72532         0         0
^C


The “Old” Way

● trapstat(1m)
  – reports trap statistics per processor or processor set
  – most useful for TLB-related events, as they can have a severe impact on performance
    ● compare the large TLB caches of the UltraSPARC-IIIi with the tiny ones found in the new UltraSPARC-T1 based T2000 systems
    ● the small TLBs had a significant negative effect on our web-mail application before we tuned it using large pages


The “Old” Way

yedi# trapstat -c 4 -T

cpu m size| itlb-miss %tim ... | dtlb-miss %tim dtsb-miss %tim |%tim
----------+--------------------+-------------------------------+----
  4 u   8k|        70  0.0 ... |      1031  0.0         2  0.0 | 0.0
  4 u  64k|       130  0.0 ... |       898  0.0        13  0.0 | 0.0
  4 u   4m|         0  0.0 ... |         1  0.0         0  0.0 | 0.0
  4 u 256m|         0  0.0 ... |         0  0.0         0  0.0 | 0.0
- - - - - + - - - - - - - -- - + - - - - - - - - - - - - - - - + - -
  4 k   8k|       172  0.0 ... |     39162  1.8        28  0.0 | 1.8
  4 k  64k|         0  0.0 ... |         0  0.0         0  0.0 | 0.0
  4 k   4m|         0  0.0 ... |      5785  0.3        69  0.0 | 0.3
  4 k 256m|         0  0.0 ... |         0  0.0         0  0.0 | 0.0
==========+====================+===============================+====
 ttl      |       372  0.0 ... |     46877  2.2       112  0.0 | 2.2


The “Old” Way

● the *stat(1m) commands are very helpful but generally provide top-level views only
● most available 3rd party tools are userland-centric
● hint: user_attr(4) can be used to grant non-root users access to hardware performance counters
    nau::::defaultpriv=basic,cpc_cpu


Preparing For A New Era

● solution to the mentioned problems
  – dynamically modifying a system in a safe way to record arbitrary data
  – replace sampling by triggers

● in other words ...


DTrace

● dynamic tracing facility
  – allows you to dynamically and efficiently instrument kernel and user-level code
  – 40,000+ probes distributed over kernel and modules; can be enabled independently with almost no overhead
  – C-like scripting, interpreted in kernel context (no loops)
  – tools already built on top: plockstat(1m), er_kernel(1), ...
  – access based on Least Privilege; you may grant users the dtrace_{proc, user, kernel} privileges


[Figure: DTrace architecture. Userland: DTrace code (s1.d, s2.d) is run by the DTrace consumers dtrace(1m), lockstat(1m), intrstat(1m), plockstat(1m), ..., which are built on libdtrace(3lib). Kernel: dtrace(7d) and the DTrace providers syscall, sysinfo, vminfo, fbt, sched, io, ...]


DTrace Probes

● are locations or activities which can trigger almost arbitrary actions
  – record user/kernel stacks
  – print out data structures
  – even manipulate data
● identified by provider, module, function and name
  – sched:unix:resume:off-cpu


DTrace Built-In Probes

● dtrace:::{BEGIN,END,ERROR}
  – trigger when a DTrace script starts, finishes or when an error occurs

obi-wan# dtrace -n 'dtrace:::BEGIN { trace("Here We Go"); } \
                    dtrace:::END { trace("We Are Done"); }'

dtrace: description 'dtrace:::BEGIN ' matched 2 probes
CPU     ID                    FUNCTION:NAME
  2      1                           :BEGIN   Here We Go
^C
  2      2                             :END   We Are Done

obi-wan#


DTrace Providers

● DTrace probes are implemented by so-called providers
● each of them performs a particular type of probing
  – syscall: offers probes for every entry and return point of all system calls
  – fbt: implements probes for entering and leaving kernel functions
● some providers create probes on-the-fly
● you may also write your own


“D-Language” Structure

● no loops or similar control statements for safety reasons, as DTrace code is executed in kernel context (just think about an endless loop)
● very simple general form, executed top to bottom; predicates are optional

    probe description
    / predicate /
    {
        actions
    }


Example: System Call Statistics

#!/usr/sbin/dtrace -s

/* use syscall provider and count() aggregation
 * to create system call 'call-statistics' */

syscall:::entry
/ execname != "dtrace" /
{
    @c[execname, probefunc] = count();
}


Example: System Call Statistics

obi-wan# ./scstat.d
^C
...
  Xsun            writev          23
  Xsun            pollsys         27
  gnome-terminal  pollsys         28
  Xsun            read            30
  gnome-terminal  ioctl           35
  rpcbind         lwp_sigmask     36
  ypserv          lwp_sigmask     36
  rpcbind         fstat           45
  rpcbind         ioctl           54
  java            pollsys         55
  Xvnc            pollsys        162


DTrace Script Language “D”

● “C”-like look and feel
● provides scalars, strings, associative arrays, structs and unions, pointers and access to kernel variables (e.g. `kmem_flags)
● includes basic arithmetic, logical and relational expressions
● built-in variables
  – pid, ppid, tid, cpu, pset, probename, probefunc, ...


The Good Stuff: Aggregations

● aggregations have an outstanding property
  – applying the function to subsets of the data and then again to the partial results gives the same result as applying it to the whole dataset
  – SUM is an aggregating function, MEDIAN is not
● helps a lot to condense data
  – no need to keep the complete dataset
  – no scaling problems
● printa() might be used to format the printout


Example: write() Timing

#!/usr/sbin/dtrace -s

/* example taken from /usr/demo/dtrace/writetimeq.d */

syscall::write:entry
{
    self->ts = timestamp;
}

syscall::write:return
/ self->ts /
{
    /* quantize() builds a power-of-two distribution */
    @time[execname] = quantize(timestamp - self->ts);
    self->ts = 0;
}


Example: write() Timing

  sge_commd
           value  ------------- Distribution ------------- count
            8192 |                                         0
           16384 |@@@@@@@@@@@@@@@@@                        9
           32768 |@@@@@@@@@@@@@@@                          8
           65536 |@@@@@@@@                                 4
          131072 |                                         0

  ansys.e100
           value  ------------- Distribution ------------- count
           32768 |                                         0
           65536 |@@@@@@@@@@@@@@@@@                        639
          131072 |@@@@@@@@@@@@@                            508
          262144 |@@@@@@@@                                 316
          524288 |@                                        43
         1048576 |                                         0


More On DTrace Output

● trace() prints collected data in a predefined way
● tracemem() copies a chunk of memory to the DTrace buffer
● stack() and ustack() dump the kernel or user stack
    hint: both can be used as aggregation “index”
● printf() and printa() work similarly to the “C” version, with some extensions
● aggregations are printed by default when exiting


Example: I/O Observations

#!/usr/sbin/dtrace -s

#pragma D option quiet

io:::start
{
    @c[args[1]->dev_statname, args[2]->fi_name, execname] =
        sum(args[0]->b_bcount);
}

END
{
    printf("%10s %20s %15s %10s\n", "DEVICE", "FILE", "APP", "BYTES");
    printa("%10s %20s %15s %10@d\n", @c);
}


Example: I/O Observations

obi-wan# ./fileio.d
^C
    DEVICE                 FILE             APP      BYTES
      ssd6               <none>           faker    1130496
     nfs12              489.dat           sched      32768
     nfs12               478.dx           sched      32768
      ssd0   cis_Pd_complex.chk        l502.exe    5980160
      ssd2   cis_Pd_complex.chk        l502.exe    6356992
      ssd2           pt_slab.DM   siesta-constr    7946240
      ssd0           pt_slab.DM   siesta-constr    7995392
      md10   cis_Pd_complex.chk        l502.exe   12337152
      md10           pt_slab.DM   siesta-constr   15941632
      ssd0         Gau-3413.rwf        l502.exe   16588800
      ssd2         Gau-3413.rwf        l502.exe   17686528
      md10         Gau-3413.rwf        l502.exe   34275328

md10 is an SVM mirrored volume: for each file its byte count is the sum of the counts for ssd0 and ssd2.


The pid Provider

● the pid provider allows you to trace any function or instruction in a user process
● probes are created on demand and will therefore not appear in the “dtrace -l” output
● instructions are specified as function offset
  – pid54321:my-object:my-function:8
● by default it works similarly to fbt (the function boundary tracing provider), which covers the kernel level


Example: Library Call Stacks

#!/usr/sbin/dtrace -s

/* uses the 'pid' provider to print call stacks */

#pragma D option quiet

/* $1 and $2 are command-line arguments */
pid$target:$1:$2:entry
{
    printf("%s:%s:%s", probeprov, probemod, probefunc);
    ustack();
}


Example: Library Call Stacks

obi-wan# ./piddemo1.d -c ls 'libc' 'str*'
pidtest.d

pid1002:libc.so.1:strcmp
    libc.so.1`strcmp
    libc.so.1`setlocale+0x1378
    ls`main+0x22
    ls`_start+0x7a
pid1002:libc.so.1:strlen
    libc.so.1`strlen
    ls`xstrdup+0x12
    ls`main+0x486
    ls`_start+0x7a
pid1002:libc.so.1:strlen
    libc.so.1`strlen
    ls`xstrdup+0x12

(the line “pidtest.d” is the output of “ls”)


Example: Library Timing (Inclusive)

#!/usr/sbin/dtrace -s

/* uses the 'pid' provider to examine library timing */

#pragma D option quiet

pid$target:$1:$2:entry
{
    self->ts = vtimestamp;
}

pid$target:$1:$2:return
/ self->ts /
{
    @t[probemod, probefunc] = sum(vtimestamp - self->ts);
    self->ts = 0;
}

END
{
    printa("%10@dns %12s:%s\n", @t);
}


Example: Library Timing (Inclusive)

obi-wan# OMP_NUM_THREADS=2 \
    ./piddemo2.d -c './partest 10 10' libmtsk ''

...
     63204ns libmtsk.so.1:spin_unlock
     69472ns libmtsk.so.1:spin_lock
     91216ns libmtsk.so.1:libmtsk_info_init
    105080ns libmtsk.so.1:barrier_init
    158324ns libmtsk.so.1:memmanage_init
    160816ns libmtsk.so.1:threads_fini
    184148ns libmtsk.so.1:memmanage_fini
    358240ns libmtsk.so.1:sleep_at_barrier
    922020ns libmtsk.so.1:slave_wait_for_work


Detailed sched/proc Example

● DTrace code uses the sched and proc providers to observe thread scheduling and preemption
● simple test application consisting of nested loops, compiled with -xautopar -xloopinfo
● resulting code runs in parallel
● the demonstrated DTrace techniques also apply to any kind of local parallel application, such as OpenMP-based ones


Some details omitted; please check the handout.


The sched Provider

● makes probes related to CPU scheduling available
● easy way to examine why threads sleep, run or change priority
● allows you to keep an eye on thread migration, which is critical for OpenMP or other local (SMP-level) parallel applications


The sched Provider Probes

● on-cpu: fires when a thread begins execution
● off-cpu: fires when a thread is about to end execution
● preempt: thread will be preempted
● remain-cpu: dispatcher elected to continue thread
● sleep: sleeping on synchronization object
● wakeup: current thread wakes a sleeping one


The proc Provider

● its probes are related to
  – process creation and termination
  – LWP creation and termination
  – exec(2), fork(2)
  – signal handling


The proc Provider Probes

● exec-failure: fires when exec(2) failed
● exec-success: fires when exec(2) succeeded
● exit: fires when the process is exiting
● lwp-create: fires on LWP creation
● lwp-start: fires before the first instruction of the new LWP is executed
● lwp-exit: fires when the current LWP is exiting
● start: fires before the first instruction of the new process is executed


Detailed sched/proc Example

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    long i, j, *a, *b;
    long iter, rep;

    if (argc < 3)       /* both scale factors are required */
        return 1;
    iter = 100000 * atoi(argv[1]);
    rep = 100 * atoi(argv[2]);

    a = malloc(iter * sizeof(long));
    b = malloc(iter * sizeof(long));
    if (a == NULL || b == NULL)
        return 1;
    for (i = 0; i < iter; i++)
        a[i] = b[i] = i;
    puts("LOOP");
    for (j = 0; j < rep; j++)
        for (i = 0; i < iter; i++)
            a[i] *= b[i];
    return 0;
}


Detailed sched/proc Example

#!/usr/sbin/dtrace -s

/* traces threads in application to observe scheduling
 * behaviour with respect to CPUs and locality groups
 *
 * timing is done in microseconds */

#pragma D option quiet

/* initialize timestamp */
BEGIN
{
    baseline = walltimestamp;
    scale = 1000; /* convert nano- to microseconds */
}


Detailed sched/proc Example

/* sched:::on-cpu fires whenever a thread
 * starts executing on a CPU
 * need to handle three different cases:
 * - init, called just once for a given thread
 * - thread has been migrated from another CPU
 * - "catch everything else" clause */

/* clause 1: init case, detected by !self->stamp */
sched:::on-cpu
/ pid == $target && !self->stamp /
{
    self->stamp = walltimestamp;
    self->lastcpu = curcpu->cpu_id;
    self->lastlgrp = curcpu->cpu_lgrp;
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d CPU %3d(%d) created\n",
        tid, stamp, 0, curcpu->cpu_id, curcpu->cpu_lgrp);
}


Detailed sched/proc Example

/* clause 2: thread CPU migration */
sched:::on-cpu
/ pid == $target && self->stamp && self->lastcpu != curcpu->cpu_id /
{
    delta = (walltimestamp - self->stamp) / scale;
    self->stamp = walltimestamp;
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d from-CPU %d(%d) to-CPU %d(%d) migration\n",
        tid, stamp, delta, self->lastcpu, self->lastlgrp,
        curcpu->cpu_id, curcpu->cpu_lgrp);
    self->lastcpu = curcpu->cpu_id;
    self->lastlgrp = curcpu->cpu_lgrp;
}


Detailed sched/proc Example

/* clause 3: catch-all other sched:::on-cpu events */
sched:::on-cpu
/ pid == $target && self->stamp && self->lastcpu == curcpu->cpu_id /
{
    delta = (walltimestamp - self->stamp) / scale;
    self->stamp = walltimestamp;
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d CPU %3d(%d) restarted on same CPU\n",
        tid, stamp, delta, curcpu->cpu_id, curcpu->cpu_lgrp);
}


Detailed sched/proc Example

/* fires just before a thread gets kicked off a CPU */
sched:::off-cpu
/ pid == $target && self->stamp /
{
    delta = (walltimestamp - self->stamp) / scale;
    self->stamp = walltimestamp;
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d CPU %3d(%d) taken from CPU\n",
        tid, stamp, delta, curcpu->cpu_id, curcpu->cpu_lgrp);
}


Detailed sched/proc Example

/* fires when a thread is put asleep */
sched:::sleep
/ pid == $target && self->stamp /
{
    self->sobj = (
        curlwpsinfo->pr_stype == SOBJ_MUTEX ? "kernel mutex" :
        curlwpsinfo->pr_stype == SOBJ_RWLOCK ? "kernel RW lock" :
        curlwpsinfo->pr_stype == SOBJ_CV ? "cond var" :
        curlwpsinfo->pr_stype == SOBJ_SEMA ? "kernel semaphore" :
        curlwpsinfo->pr_stype == SOBJ_USER ? "user-level lock" :
        curlwpsinfo->pr_stype == SOBJ_USER_PI ? "user-level PI lock" :
        curlwpsinfo->pr_stype == SOBJ_SHUTTLE ? "shuttle" :
        "unknown");

    delta = (walltimestamp - self->stamp) / scale;
    self->stamp = walltimestamp;
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d sleeping on '%s'\n",
        tid, stamp, delta, self->sobj);
}


Detailed sched/proc Example

/* fires when LWP exits */
proc:::lwp-exit
/ pid == $target && self->stamp /
{
    stamp = (walltimestamp - baseline) / scale;
    printf("TID=%-2d %9d:%-9d exited\n",
        tid, stamp, 0);
}


Detailed sched/proc Example

● output shows
  – thread ID
  – time relative to application start
  – time relative to the last event triggered by the thread
  – CPU number and locality group
  – informational message
● be aware that output for several threads is not necessarily in order, as it is buffered and may come from different processors


Detailed sched/proc Example

obi-wan# ./threadsched.d -c './partest 10 10'

TID=1 0:0 CPU 25(0) created
TID=1 0:0 CPU 25(0) restarted on same CPU
TID=1 0:0 CPU 25(0) taken from CPU
TID=1 0:0 CPU 25(0) restarted on same CPU
TID=1 10004:10004 CPU 25(0) taken from CPU
LOOP
TID=1 69999:59995 CPU 25(0) restarted on same CPU
TID=1 120000:50000 CPU 25(0) taken from CPU
TID=1 130003:10002 CPU 25(0) restarted on same CPU
TID=1 6640553:6510550 CPU 25(0) taken from CPU
TID=1 6640553:0 from-CPU 25(0) to-CPU 30(0) migration
TID=1 6640553:0 CPU 30(0) restarted on same CPU

The following lines show mis-ordered timing:

TID=1 10070842:0 from-CPU 30(0) to-CPU 2(0) migration
TID=1 10070842:0 CPU 2(0) restarted on same CPU
TID=1 10070842:3430289 CPU 30(0) taken from CPU


Enhancing the proc/sched Example

/* fires when a thread is put asleep
 * print call stack if thread sleeps on a
 * condition variable or user-level lock */
sched:::sleep
/ pid == $target &&
  ( curlwpsinfo->pr_stype == SOBJ_CV ||
    curlwpsinfo->pr_stype == SOBJ_USER ||
    curlwpsinfo->pr_stype == SOBJ_USER_PI ) /
{
    ustack();
}


Detailed sched/proc Example

TID=1 190004:0 CPU 24(0) taken from CPU
TID=1 210007:20002 CPU 24(0) restarted on same CPU
LOOP
TID=2 6110504:5980503 sleeping on 'cond var'
    libc.so.1`__lwp_park+0x10
    libc.so.1`cond_wait_queue+0x28
    libc.so.1`cond_wait+0x10
    libc.so.1`pthread_cond_wait+0x8
    libmtsk.so.1`sleep_at_barrier+0x6c
    libmtsk.so.1`__mt_EndOfTask_Barrier_+0xb8
    partest`_$d1B15.main
    libc.so.1`_lwp_start

TID=2 6110504:0 CPU 16(0) taken from CPU


Detailed sched/proc Example

# simulating overcommitment: 3 threads, 2-CPU processor set

obi-wan# OMP_NUM_THREADS=3 ./threadsched.d -c './partest 10 10' | grep migration | sort -n -k 2,2

TID=1 60000:60000 from-CPU 4(0) to-CPU 0(0) migration
TID=2 180009:50003 from-CPU 4(0) to-CPU 0(0) migration
TID=1 530037:0 from-CPU 0(0) to-CPU 4(0) migration
TID=1 630045:0 from-CPU 4(0) to-CPU 0(0) migration
TID=2 730054:100008 from-CPU 0(0) to-CPU 4(0) migration
TID=3 730054:0 from-CPU 4(0) to-CPU 0(0) migration
TID=1 830062:100008 from-CPU 0(0) to-CPU 4(0) migration
TID=2 830062:0 from-CPU 4(0) to-CPU 0(0) migration
TID=1 930071:0 from-CPU 4(0) to-CPU 0(0) migration
TID=3 930071:100008 from-CPU 0(0) to-CPU 4(0) migration
TID=1 1550123:0 from-CPU 0(0) to-CPU 4(0) migration
TID=1 1650131:0 from-CPU 4(0) to-CPU 0(0) migration
TID=2 1730137:80006 from-CPU 0(0) to-CPU 4(0) migration


Visualizing DTrace Output

● DTrace has no built-in visualization options but can create easy-to-parse output
● use perl(1) to postprocess data
● tools like dot, which is part of the Graphviz package, greatly help visualizing DTrace output
  – http://www.graphviz.org/
● gnuplot is another option


Graphviz Example: Execution Paths

#!/usr/sbin/dtrace -s

proc:::exec
{
    self->parent = execname;
}

proc:::exec-success
/ self->parent != NULL /
{
    @c[self->parent, execname] = count();
    self->parent = NULL;
}

END
{
    printf("digraph ExecPaths{\n");
    printa(" \"%s\" -> \"%s\" [weight=%@d];\n", @c);
    printf("}\n");
}


Visualization Using SunStudio

● er_kernel(1) creates an experiment using data gathered by DTrace in the Solaris kernel
  – er_kernel dd if=/dev/zero of=/dev/null \
        bs=1024k count=10000
● results can be visualized using either er_print(1) or analyzer(1)


DTrace Destructive Actions

● a number of DTrace actions may change the state of a system

● these actions need to be explicitly enabled with the “-w” option or the “destructive” pragma (#pragma D option destructive)

● process destructive actions
– stop(), raise(), copyout(), copyoutstr(), system()

● kernel destructive actions
– breakpoint(), panic(), chill()


Example: Time Warp

/* fake uname() syscall return values */

struct utsname uts;

syscall::uname:entry
{
    self->buf = arg0;
}

syscall::uname:return
{
    p = (struct utsname *) copyin(self->buf, sizeof(struct utsname));
    bcopy("OpenSunOS", p->sysname, sizeof(uts.sysname));
    bcopy("leia", p->nodename, sizeof(uts.nodename));
    bcopy("5.12", p->release, sizeof(uts.release));
    /* copyout() is a process destructive action */
    copyout(p, self->buf, sizeof(struct utsname));
}


Example: Time Warp

obi-wan# uname -a
SunOS obi-wan 5.10 Generic_118833-33 sun4v sparc SUNW,Sun-Fire-T200

obi-wan# dtrace -w -s timewarp.d &
(“-w” required because destructive actions are used)

obi-wan# uname -a
OpenSunOS leia 5.12 Generic_118833-33 sun4v sparc SUNW,Sun-Fire-T200
(prompt set by shell, not by kernel)


Speculations

● outstanding DTrace debugging feature

● great for catching sporadic or nondeterministic error conditions

● allow you to gather data and decide, some time after certain probes have fired, whether the trace should be committed or discarded

● a code path leading to I/O operations which take “extremely” long may serve as an example


Spice Up Your Own Applications

● DTrace allows you to easily add providers and probes to your own code

#include <stdio.h>
#include <sys/sdt.h>

int main (int argc, char *argv[])
{
    int c;

    while ((c = getchar()) != EOF) {
        if (c == 10)            /* skip newlines */
            continue;
        DTRACE_PROBE1(keypress, read, c);
    }
}


Spice Up Your Own Applications

● the definition is easy and straightforward

● dtrace(1m) is later used to “compile” the code

provider keypress {
    probe read(int);
};

#pragma D attributes Evolving/Evolving/Common provider keypress provider
#pragma D attributes Private/Private/Common provider keypress module
#pragma D attributes Private/Private/Common provider keypress function
#pragma D attributes Evolving/Evolving/Common provider keypress name
#pragma D attributes Evolving/Evolving/Common provider keypress args


Spice Up Your Own Applications

obi-wan# cc -O -c keypress.c

obi-wan# dtrace -32 -G -s keypress_d.d keypress.o

obi-wan# cc -O -o keypress keypress_d.o keypress.o -ldtrace

obi-wan# dtrace -c ./keypress -n \
    'keypress$target:::read { @[arg0] = sum(1); }' < file

dtrace: description 'keypress$target:::read ' matched 1 probe
dtrace: pid 18828 exited with status 1

       99        6
      100        6
       97       70

(the keys are the character codes passed to the probe: 99 = 'c', 100 = 'd', 97 = 'a')


DTrace Wrap-up


Some More Answers

● Do I need to recompile my code?
– No, you don't have to. DTrace can be applied without access to the sources, although it helps if the symbols have not been stripped from the executable
– Recompiling allows you to add your own providers.

● Do I need to be root to make use of the tool?
– No, administrators can grant users secure access to the DTrace facility using privileges (dtrace_proc, dtrace_user, dtrace_kernel) in user_attr(4)


Starting Points

● my favorite is still the Solaris Dynamic Tracing Guide, which can be downloaded as a PDF file from http://docs.sun.com

● great examples can be found at
– http://www.brendangregg.com/dtrace.html

– http://www.opensolaris.org

– /usr/demo/dtrace

– http://www.solarisinternals.com/si/dtrace/


Summary

● DTrace is an outstanding and awfully powerful system analysis tool; a Swiss Army Knife useful to administrators as well as developers

● even more powerful if used with the speculation feature or with your own spiced-up code


Case Study #1

Tool Usage On T2000

Integer Performance


“john” Benchmark

● during T2000 tests we were looking at “john”

● password “evaluation” tool

● seemed to be the ideal candidate for quick'n dirty benchmarking using its DES benchmark feature
– small memory footprint
– no floating point operations


Processors

● T2000, 1GHz UltraSPARC T1
– 8 cores with 4 threads each, selected round-robin among the active ones; single-issue in-order design
– 3MB unified 12-way L2 cache with 4 banks
– 64-entry I-TLB and 64-entry D-TLB per core

● V100, 550MHz UltraSPARC-IIe
– single core with 4-way superscalar pipeline
– 512kB unified 4-way L2 cache
– 64-entry I-TLB and 64-entry D-TLB


First Results

● used compiler flags -fast -xarch=native64

nau@v100:~/john-1.6.40/run> ./john --test --format:DES
Benchmarking: Traditional DES [64/64 BS]... DONE
Many salts:     230157 c/s real, 271314 c/s virtual
Only one salt:  231421 c/s real, 234701 c/s virtual

nau@t2000:~/john-1.6.40/run> ./john --test --format:DES
Benchmarking: Traditional DES [64/64 BS]... DONE
Many salts:     159731 c/s real, 159731 c/s virtual
Only one salt:  136724 c/s real, 136724 c/s virtual


Bad Performance

● comparing clock rates, we expected the numbers to be in the 400000-500000 range

● a quick check on a 1GHz V240 (UltraSPARC-IIIi) suggests the assumption could be right

nau@v240:~/john-1.6.40/run> ./john --test --format:DES
Benchmarking: Traditional DES [64/64 BS]... DONE
Many salts:     497999 c/s real, 497999 c/s virtual
Only one salt:  432274 c/s real, 433139 c/s virtual
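The expectation can be reproduced with a line of arithmetic. This is a rough sketch that assumes throughput scales linearly with clock rate and takes the V100 “many salts” result as the baseline; both are simplifications made here for illustration.

```latex
\begin{align*}
\text{expected at 1 GHz} &\approx 230157\ \text{c/s} \times \frac{1000\ \text{MHz}}{550\ \text{MHz}} \approx 418000\ \text{c/s} \\
\frac{\text{measured on T2000}}{\text{expected}} &\approx \frac{159731}{418000} \approx 0.38
\end{align*}
```

Under that assumption the T2000 delivers well under half of what its clock rate alone would suggest, while the V240 lands inside the expected range.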


Using cputrack(1)

● cputrack gives us the number of instructions per cycle

nau@t2000:~/john-1.6.40/run> cputrack -o profile -T 5 -t \
    -c pic1=Instr_cnt ./john --test --format:DES

nau@t2000:~/john-1.6.40/run> cat profile
   time lwp      event       %tick        pic1
  5.019   1       tick  5013226812  3401744911
 10.029   1       tick  6412314672  3182470158
 10.032   1       exit 11426217732  6584425110   <-- ratio 0.58

nau@v100:~/john-1.6.40/run> cat profile
   time lwp      event       %tick        pic1
  5.021   1       tick  2700326309  5720338749
 10.021   1       tick  2664938937  5298458427
 10.064   1       exit  5386690436 11060175783   <-- ratio 2.05


First “john” Problem Solved

● unexpected “bad” performance can be explained by looking at the CPU design

● single-issue in-order pipeline of the T1 versus the 4-way superscalar one of the IIe processor

● lessons learned:
– forget about clock rates

– the design matters


What About Scaling?

● system considered to be designed for throughput

[chart: crypt/s (0 to 1400000) versus number of jobs (1, 2, 4, 8, 12, 16, 24, 32)]


Unexpected Scaling

● scaling drops rapidly for more than 12 jobs; idea: related to the 12-way L2 cache?

● changed the Makefile to add hardware counter support for the Sun Studio 11 analyzer and collect: -g -xhwcprof

● added T1-specific hints to the analyzer rc file (got them from Sun)
– provide additional memory and cache information


Unexpected Scaling

● memory stalls seem to be the major problem

● almost 99% are caused by a single routine
– DES_bs_crypt_25

● let's have a look at hardware counter data related to the memory/cache subsystem


Memory Bank Hotspot

● one memory bank causes half of all stalls; even more, 85%, are caused by a single page

● Solaris uses larger pages on the sun4v architecture to relieve the small TLBs; this may create hot spots as it limits cache mapping flexibility

● select a different cache-bin allocation algorithm (coloring) in /etc/system
– set consistent_coloring=2

● use mdb(1) to experiment with a live system


Checking With busstat(1m)

yedi# busstat -w dram0,pic0=bank_busy_stalls,pic1=mem_read_write 2

time  dev    event0            pic0       event1          pic1
2     dram0  bank_busy_stalls  193307539  mem_read_write  71400764
4     dram0  bank_busy_stalls  192952912  mem_read_write  71338904
6     dram0  bank_busy_stalls  194118606  mem_read_write  71866659
8     dram0  bank_busy_stalls  194180462  mem_read_write  71822108
10    dram0  bank_busy_stalls  129238477  mem_read_write  49908669
12    dram0  bank_busy_stalls  1029510    mem_read_write  641610
14    dram0  bank_busy_stalls  10229      mem_read_write  11562

# modify consistent_coloring setting

2     dram0  bank_busy_stalls  524127     mem_read_write  435160
4     dram0  bank_busy_stalls  515648     mem_read_write  430071
6     dram0  bank_busy_stalls  946224     mem_read_write  692673
8     dram0  bank_busy_stalls  970769     mem_read_write  711966
10    dram0  bank_busy_stalls  13895124   mem_read_write  7812990
12    dram0  bank_busy_stalls  846321     mem_read_write  528989
14    dram0  bank_busy_stalls  3968       mem_read_write  5553


This Scales!

● the scaling of 10.2 for 32 jobs is much better now

[chart: crypt/s (0 to 1800000) versus number of jobs (1, 2, 4, 8, 12, 16, 24, 32)]
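To put the factor of 10.2 into perspective, here is a rough worked estimate. It assumes that the scaling is the aggregate crypt/s with 32 jobs, T(32), divided by the single-job rate T(1), that a single-issue T1 core completes at most one instruction per cycle across its four threads, and that one thread alone reaches the 0.58 instructions per cycle measured with cputrack. None of these assumptions is stated in this form on the slides.

```latex
\begin{align*}
S(32) &= \frac{T(32)}{T(1)} = 10.2, &
\frac{S(32)}{32} &\approx 0.32 \\
S_{\max} &\approx 8 \times \frac{1}{0.58} \approx 13.8, &
\frac{S(32)}{S_{\max}} &\approx 0.74
\end{align*}
```

Measured against 32 hardware threads the result looks modest, but measured against what eight single-issue cores could deliver under these assumptions it is roughly three quarters of the bound.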


Scaling Problem Solved

● know your tools

● know your hardware

● deploy kernel patches
– the scaling problem has been solved by a patch in the meantime; it changes the default allocation behavior


Case Study #2

Memory Placement Optimizations


Memory Placement Optimization

● keyword MPO

● Solaris is smart about allocating memory on NUMA architectures (e.g. Opteron SMP)
– memory is allocated local to the thread executing the code (first touch)

● keep in mind that malloc(3c) does not return memory to the operating system

● DTrace sched provider reveals information about locality groups


Performance Benefit Of First Touch

[chart: performance (Mflop/s, 0 to 5000) versus memory footprint (Kbyte, 0.1 to 1000000, logarithmic axis); series: OMP ROW 1 Thread, OMP ROW 2 Threads, OMP First Touch ROW 1 Thread, OMP First Touch ROW 2 Threads]

chart courtesy of Ruud van der Pas


Thank you, and remember ...


Use The DTrace!

©LucasFilm