DESCRIPTION

Code profilers are used to identify execution bottlenecks and understand the cause of a slowdown. Execution sampling is a monitoring technique commonly employed by code profilers because of its low impact on execution. Regularly sampling the execution of an application estimates the amount of time the interpreter, hardware or software, spent in each method. Nevertheless, this execution time estimation is highly sensitive to the execution environment, making it non-reproducible, non-deterministic and not comparable across platforms. On our platform, we have observed that the number of messages sent per second remains within tight (±7%) bounds across a basket of 16 applications. Using principally the Pharo platform for experimentation, we show that such a proxy is stable and reproducible over multiple executions, and that the resulting profiles are comparable even when obtained in different execution contexts. We have produced Compteur, a new code profiler that does not suffer from the limitations of execution sampling, and have used it to extend the SUnit testing framework for execution comparison.

TRANSCRIPT

Page 1: 2011 ecoop

Counting Messages as a Proxy for Average

Execution Time in Pharo

ECOOP 2011 - Lancaster

Alexandre Bergel, Pleiad lab, DCC, University of Chile

http://bergel.eu

Page 2: 2011 ecoop

2

www.pharo-project.org

The Mondrian Visualization Engine

Page 3: 2011 ecoop

3

“I like the cool new features of Mondrian, but in my setting, drawing a canvas takes 10 seconds, whereas it took only 7 yesterday. Please do something!”

-- A Mondrian user, 2009 --

Page 4: 2011 ecoop

4

“I like the cool new features of Mondrian, but in my setting, drawing my visualization takes 10 seconds, whereas it took only 7 yesterday. Please do something!”

-- A Mondrian user, 2009 --

Page 5: 2011 ecoop

54.8% {11501ms} MOCanvas>>drawOn:
  54.8% {11501ms} MORoot(MONode)>>displayOn:
    30.9% {6485ms} MONode>>displayOn:
    |  18.1% {3799ms} MOEdge>>displayOn: ...
    |  8.4% {1763ms} MOEdge>>displayOn:
    |  |  8.0% {1679ms} MOStraightLineShape>>display:on:
    |  |  2.6% {546ms} FormCanvas>>line:to:width:color: ...
    23.4% {4911ms} MOEdge>>displayOn: ...

Result of Pharo profiler

5

Page 6: 2011 ecoop

32.9% {6303ms} MOCanvas>>drawOn:
  32.9% {6303ms} MORoot(MONode)>>displayOn:
    24.4% {4485ms} MONode>>displayOn:
    |  12.5% {1899ms} MOEdge>>displayOn: ...
    |  4.2% {1033ms} MOEdge>>displayOn:
    |  |  6.0% {1679ms} MOStraightLineShape>>display:on:
    |  |  2.4% {546ms} FormCanvas>>line:to:width:color: ...
    8.5% {2112ms} MOEdge>>displayOn: ...

Yesterday's version

6

Page 7: 2011 ecoop

7

On my machine I find 11 and 6 seconds. What’s going on?

“I like the cool new features of Mondrian, but in my setting, drawing my visualization takes 10 seconds, whereas it took only 7 yesterday. Please do something!”

-- A Mondrian user, 2009 --

Page 8: 2011 ecoop

How profilers work

Sampling the method call stack every 10 ms

A counter is associated with each frame

Each counter is incremented when its frame is sampled

8

Page 9: 2011 ecoop

How profilers work

Sampling the method call stack every 10 ms

A counter is associated with each frame

Each counter is incremented when its frame is sampled

Canvas drawOn: (1)
MORoot displayOn: (1)
MONode displayOn: (1)

Time = t

method call stack

9

Page 10: 2011 ecoop

How profilers work

Sampling the method call stack every 10 ms

A counter is associated with each frame

Each counter is incremented when its frame is sampled

Canvas drawOn: (2)
MORoot displayOn: (2)
MONode displayOn: (2)

Time = t + 10 ms

MOEdge displayOn: (1)

method call stack

10

Page 11: 2011 ecoop

How profilers work

Sampling the method call stack every 10 ms

A counter is associated with each frame

Each counter is incremented when its frame is sampled

Canvas drawOn: (3)
MORoot displayOn: (3)
MONode setCache (1)

Time = t + 20 ms

method call stack

11

Page 12: 2011 ecoop

How profilers work

The counter is used to estimate the amount of time spent

MONode setCache (1)

MOEdge displayOn: (1)

MONode displayOn: (2)

MORoot displayOn: (3)

Canvas drawOn: (3)

12

Page 13: 2011 ecoop

How profilers work

The counter is used to estimate the amount of time spent

MONode setCache (1) => 10 ms

MOEdge displayOn: (1) => 10 ms

MONode displayOn: (2) => 20 ms

MORoot displayOn: (3) => 30 ms

Canvas drawOn: (3) => 30 ms

13
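To make the mechanism concrete, here is a minimal Pharo sketch of the counting step illustrated above. The selector names (sampleProcess:into:, estimatedTimeFor:from:) and the plain Dictionary of counters are assumptions made for illustration; this is not the actual Pharo profiler implementation, and it assumes the profiled process has already been interrupted so its call stack can be walked.

Profiler >> sampleProcess: aProcess into: counters
	"Invoked every 10 ms by a high-priority process: walk the call stack
	of aProcess and increment the counter of every method found on it."
	| context |
	context := aProcess suspendedContext.
	[ context isNil ] whileFalse: [
		counters
			at: context method
			put: (counters at: context method ifAbsent: [ 0 ]) + 1.
		context := context sender ]

Profiler >> estimatedTimeFor: aMethod from: counters
	"Each sample accounts for roughly one 10 ms interval."
	^ (counters at: aMethod ifAbsent: [ 0 ]) * 10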

Page 14: 2011 ecoop

Problem with execution sampling #1

Strongly dependent on the execution environment

CPU, memory management, threads, virtual machine, processes

Listening to an mp3 may perturb your profile

14

Page 15: 2011 ecoop

Problem with execution sampling #2

Non-determinism

Even using the same environment does not help

“30000 factorial” takes between 3 803 and 3 869 ms

15

Page 16: 2011 ecoop

Problem with execution sampling #3

Lack of portability

Profiles are not reusable across platforms

Buying a new laptop will invalidate the profile you made yesterday

16

Page 17: 2011 ecoop

Counting messages to the rescue

Pharo is a Smalltalk dialect

Intensively based on message sending

Almost an "optimization-free" compiler

Why not count messages instead of measuring execution time?

17

Page 18: 2011 ecoop

Counting messages

Wallet >> increaseByOne
	money := money + 1

Wallet >> addBonus
	self increaseByOne; increaseByOne; increaseByOne.

aWallet addBonus
=> 6 messages sent

18

Page 19: 2011 ecoop

Does this really work?

What about the program?

MyClass >> main
	self waitForUserClick

We took scenarios from unit tests, which do not rely on user input

19

Page 20: 2011 ecoop

Experiment A

The number of sent messages is linearly related to the average execution time over multiple executions

[Scatter plot: number of message sends (0 to 400 x 10^6) versus average execution time (0 to 40 000 ms), one point per profiled application.]

20

Page 21: 2011 ecoop

Experiment B

The number of sent messages is more stable than the execution time over multiple executions

Application    time taken (ms)   # sent messages   ctime%   cmessages%
Collections         32 317        334 359 691      16.67      1.05
Mondrian            33 719        292 140 717       5.54      1.44
Nile                29 264        236 817 521       7.24      0.22
Moose               25 021        210 384 157      24.56      2.47
SmallDude           13 942        150 301 007      23.93      0.99
Glamour             10 216         94 604 363       3.77      0.14
Magritte             2 485         37 979 149       2.08      0.85
PetitParser          1 642         31 574 383      46.99      0.52
Famix                1 014          6 385 091      18.30      0.06
DSM                  4 012          5 954 759      25.71      0.17
ProfStef               247          3 381 429       0.77      0.10
Network                128          2 340 805       6.06      0.44
AST                     37            677 439       1.26      0.46
XMLParser               36            675 205      32.94      0.46
Arki                    30            609 633       1.44      0.35
ShoutTests              19            282 313       5.98      0.11

Average                                            13.95      0.61

Table 2. Applications considered in our experiment (second and third columns are averages over 10 runs)

Estimating the sample regression line. For the sake of completeness and to provide easy-to-reproduce results, we give the necessary statistical material. Complementary information may easily be obtained from standard statistics books [11].

For the least squares regression line y = a + b x, we have the following formulas for estimating a sample regression line:

    b = SS_{xy} / SS_{xx},        a = \bar{y} - b\,\bar{x}

where \bar{y} and \bar{x} are the averages of all y values and x values, respectively. The y variable corresponds to the # sent messages column and x to the time taken (ms) column in the table given above.

    SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n},        SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}

where n is the number of samples (i.e., 16, the number of applications we have profiled). SS stands for "sum of squares." The standard deviation of error for the sample data is obtained from:

    s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}},    where    SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}

In the above formula, n - 2 represents the degrees of freedom for the regression model. Finally, the standard deviation of b is obtained with s_b = s_e / \sqrt{SS_{xx}}.

21
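As a complement, the same computation written as a minimal Pharo sketch. The class and selector (Report, regressionForTimes:counts:) are hypothetical; xs holds the time taken (ms) values and ys the # sent messages values from the table above.

Report >> regressionForTimes: xs counts: ys
	"Answer the pair {a. b} of the least squares line y = a + (b * x)."
	| n ssxy ssxx b a |
	n := xs size.
	ssxy := (xs with: ys collect: [ :x :y | x * y ]) sum - ((xs sum * ys sum) / n).
	ssxx := (xs collect: [ :x | x * x ]) sum - (xs sum squared / n).
	b := ssxy / ssxx.
	a := (ys sum / n) - (b * (xs sum / n)).
	^ { a. b }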

Page 22: 2011 ecoop

Experiment C

The number of sent messages is as useful as the execution time for identifying an execution bottleneck

[Scatter plot: number of method invocations (0 to 10 x 10^6) versus execution time (0 to 300 ms), one point per method.]

22

Page 23: 2011 ecoop

Compteur

23

CompteurMethod >> run: methodName with: args in: receiver
	| oldNumberOfCalls v |
	oldNumberOfCalls := self getNumberOfCalls.
	v := originalMethod valueWithReceiver: receiver arguments: args.
	"subtract the message sends attributable to the instrumentation itself (5 here)"
	numberOfCalls := (self getNumberOfCalls - oldNumberOfCalls) + numberOfCalls - 5.
	^ v

Page 24: 2011 ecoop

New primitive in the VM

24

CompteurMethod >> run: methodName with: args in: receiver
	| oldNumberOfCalls v |
	oldNumberOfCalls := self getNumberOfCalls.
	v := originalMethod valueWithReceiver: receiver arguments: args.
	numberOfCalls := (self getNumberOfCalls - oldNumberOfCalls) + numberOfCalls - 5.
	^ v
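For illustration, a minimal sketch of how the same VM counter could be used to count the sends triggered by an arbitrary block. The class-side selector countMessagesSentWhile: (and exposing getNumberOfCalls on the class side) is an assumption for this sketch, not part of Compteur as shown above.

Compteur class >> countMessagesSentWhile: aBlock
	"Answer the number of message sends performed while evaluating aBlock,
	by reading the VM send counter before and after."
	| before |
	before := self getNumberOfCalls.
	aBlock value.
	^ self getNumberOfCalls - before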

Page 25: 2011 ecoop

Cost of the instrumentation

25

[Two plots of the instrumentation overhead (%) against execution time (0 to 40 000 ms): (a) linear scale (0 to 3000%), (b) logarithmic scale (1 to 10 000%).]

Page 26: 2011 ecoop

Contrasting Execution Sampling with Message Counting

No need for sampling

Independent of the execution environment

Stable measurements

26

Page 27: 2011 ecoop

Application #1: Counting messages in unit testing

CollectionTest >> testInsertion
	self assert: [ Set new add: 1 ] fasterThan: [ Set new add: 1; add: 2 ]

27
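A possible implementation sketch of this assertion on top of message counting, reusing the hypothetical countMessagesSentWhile: hook sketched earlier; it compares the two blocks by their number of message sends rather than by wall-clock time.

TestCase >> assert: fastBlock fasterThan: slowBlock
	"Pass if fastBlock triggers strictly fewer message sends than slowBlock."
	self assert:
		(Compteur countMessagesSentWhile: fastBlock)
			< (Compteur countMessagesSentWhile: slowBlock)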

Page 28: 2011 ecoop

MondrianSpeedTest >> testLayout2
	| view1 view2 |
	view1 := MOViewRenderer new.
	view1 nodes: (Collection allSubclasses).
	view1 edgesFrom: #superclass.
	view1 treeLayout.

	view2 := MOViewRenderer new.
	view2 nodes: (Collection withAllSubclasses).
	view2 edgesFrom: #superclass.
	view2 treeLayout.

	self assertIs: [ view1 root applyLayout ] fasterThan: [ view2 root applyLayout ]

28

Application #1: Counting messages in unit testing

Page 29: 2011 ecoop

29

Application #2: Differencing profiling

Comparison of two successive versions of a software system

(not in the paper)

Page 30: 2011 ecoop

30

Application #2: Differencing profiling

Comparison of two successive versions of Mondrian

(not in the paper)
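A minimal sketch of the comparison step, assuming each version's profile is a Dictionary mapping method names to message-send counts; the class and selector (Report, diffProfile:with:) are hypothetical.

Report >> diffProfile: oldCounts with: newCounts
	"Answer the methods whose number of sends changed between the two
	versions, together with the signed difference."
	| diff |
	diff := Dictionary new.
	newCounts keysAndValuesDo: [ :method :count |
		| old |
		old := oldCounts at: method ifAbsent: [ 0 ].
		count = old ifFalse: [ diff at: method put: count - old ] ].
	^ diff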

Page 31: 2011 ecoop

More in the paper

Linear regression model

We replay some of the optimizations from our previous work

A methodology to evaluate profiler stability over multiple runs

All the material to reproduce the experiments

31

Page 32: 2011 ecoop

Summary

In Pharo, counting method invocations is a more advantageous profiling technique than execution sampling

Stable correlation between message sending and average execution time

32

Page 33: 2011 ecoop

Closing words

The same abstractions are used to profile applications written in C and in Java

Which objects are responsible for a slowdown?

Which arguments make a method call slow?

...

33

Page 34: 2011 ecoop

34

Counting Messages as a Proxy for Average Execution Time
Alexandre Bergel
http://bergel.eu

[Recap of earlier material: the Experiment A scatter plot (number of message sends, up to 400 x 10^6, versus average execution time in ms), the Experiment C scatter plot (number of method invocations, up to 10 x 10^6, versus execution time in ms), the CollectionTest>>testInsertion example, and the instrumentation overhead plots on linear and logarithmic scales.]