2011 ecoop

Counting Messages as a Proxy for Average Execution Time in Pharo

ECOOP 2011 - Lancaster

Alexandre Bergel

Pleiad lab, DCC, University of Chile

http://bergel.eu

2

www.pharo-project.org

The Mondrian Visualization Engine

3

“I like the cool new features of Mondrian, but in my setting, drawing a canvas takes 10 seconds, whereas it took only 7 yesterday. Please do something!”

-- A Mondrian user, 2009 --

4

“I like the cool new features of Mondrian, but in my setting, drawing my visualization takes 10 seconds, whereas it took only 7 yesterday. Please do something!”


54.8% {11501ms} MOCanvas>>drawOn:
  54.8% {11501ms} MORoot(MONode)>>displayOn:
    30.9% {6485ms} MONode>>displayOn:
    | 18.1% {3799ms} MOEdge>>displayOn:
    ...
    | 8.4% {1763ms} MOEdge>>displayOn:
    | | 8.0% {1679ms} MOStraightLineShape>>display:on:
    | | 2.6% {546ms} FormCanvas>>line:to:width:color:
    ...
    23.4% {4911ms} MOEdge>>displayOn:
    ...

Result of Pharo profiler

5

32.9% {6303ms} MOCanvas>>drawOn:
  32.9% {6303ms} MORoot(MONode)>>displayOn:
    24.4% {4485ms} MONode>>displayOn:
    | 12.5% {1899ms} MOEdge>>displayOn:
    ...
    | 4.2% {1033ms} MOEdge>>displayOn:
    | | 6.0% {1679ms} MOStraightLineShape>>display:on:
    | | 2.4% {546ms} FormCanvas>>line:to:width:color:
    ...
    8.5% {2112ms} MOEdge>>displayOn:
    ...

Yesterday's version

6

7

On my machine I find 11 and 6 seconds. What’s going on?



How profilers work

Sampling the method call stack every 10 ms

A counter is associated with each frame

A frame's counter is incremented each time the frame appears in a sample

8

How profilers work




Canvas drawOn: (1)

MORoot displayOn: (1)

MONode displayOn: (1)

Time = t

method call stack

9

How profilers work




Canvas drawOn: (2)

MORoot displayOn: (2)

MONode displayOn: (2)

Time = t + 10 ms

MOEdge displayOn: (1)

method call stack

10

How profilers work




Canvas drawOn: (3)

MORoot displayOn: (3)

MONode setCache (1)

Time = t + 20 ms

method call stack

11

How profilers work

The counter is used to estimate the amount of time spent

MONode setCache (1)

MOEdge displayOn: (1)

MONode displayOn: (2)

MORoot displayOn: (3)

Canvas drawOn: (3)

12

How profilers work

The counter is used to estimate the amount of time spent

MONode setCache (1) => 10 ms

MOEdge displayOn: (1) => 10 ms

MONode displayOn: (2) => 20 ms

MORoot displayOn: (3) => 30 ms

Canvas drawOn: (3) => 30 ms
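The bookkeeping walked through above can be replayed in a few lines. This is a minimal sketch in Python (not Pharo) that feeds the three call stacks from the slides into per-frame counters and converts each count into a time estimate. The names `samples`, `counters` and `estimated_ms` are illustrative, and the stacks are typed in by hand rather than captured from a running program.

```python
from collections import Counter

SAMPLING_PERIOD_MS = 10  # the slides sample the call stack every 10 ms

# Call stacks observed at t, t + 10 ms and t + 20 ms (outermost frame first).
samples = [
    ["Canvas drawOn:", "MORoot displayOn:", "MONode displayOn:"],
    ["Canvas drawOn:", "MORoot displayOn:", "MONode displayOn:", "MOEdge displayOn:"],
    ["Canvas drawOn:", "MORoot displayOn:", "MONode setCache"],
]

# One counter per frame, incremented each time the frame shows up in a sample.
counters = Counter()
for stack in samples:
    counters.update(stack)

# Each hit is assumed to stand for one full sampling period.
estimated_ms = {frame: hits * SAMPLING_PERIOD_MS for frame, hits in counters.items()}

for frame, ms in estimated_ms.items():
    print(f"{frame} ({counters[frame]}) => {ms} ms")
```

The estimate is only as good as the assumption that a frame seen in one sample was active for the whole period, which is where the problems listed next come from.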

13

Problem with execution sampling #1

Strongly dependent on the execution environment

CPU, memory management, threads, virtual machine, processes

Listening to an MP3 may perturb your profile

14


Non-determinism

Even using the same environment does not help

“30000 factorial” takes between 3 803 and 3 869 ms
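The same effect is easy to observe outside Pharo. The sketch below, in Python, times the same factorial ten times on one machine and reports the spread. `time_once_ms` is a hypothetical helper, and the absolute numbers will differ from the Pharo figures above because the language and runtime are different.

```python
import math
import time


def time_once_ms(n=30000):
    """Return the wall-clock time, in milliseconds, of one computation of n!."""
    start = time.perf_counter()
    math.factorial(n)
    return (time.perf_counter() - start) * 1000.0


# Same code, same machine, same process: the timings still vary run to run.
timings = [time_once_ms() for _ in range(10)]
print(f"min {min(timings):.3f} ms, max {max(timings):.3f} ms")
```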

15


Lack of portability

Profiles are not reusable across platforms

Buying a new laptop will invalidate the profile you made yesterday

16

Counting messages to the rescue

Pharo is a Smalltalk dialect

Relies intensively on message sending

The compiler is almost “optimization-free”

Why not count messages instead of measuring execution time?

17

Counting messages

Wallet >> increaseByOne
    money := money + 1

Wallet >> addBonus
    self increaseByOne; increaseByOne; increaseByOne.

aWallet addBonus => 6 messages sent
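To make the count of 6 explicit, here is a minimal Python sketch of the Wallet example, assuming every message goes through a hypothetical `MessageCounter.send` helper. In Pharo the virtual machine does the counting; `send` only stands in for it here.

```python
class MessageCounter:
    """Counts every message routed through send() (stand-in for the VM)."""

    def __init__(self):
        self.sent = 0

    def send(self, receiver, selector, *args):
        self.sent += 1
        return getattr(receiver, selector)(*args)


class Wallet:
    def __init__(self, counter):
        self.money = 0
        self.counter = counter

    def increase_by_one(self):
        # money := money + 1   (one message: +)
        self.money = self.counter.send(self.money, "__add__", 1)

    def add_bonus(self):
        # self increaseByOne; increaseByOne; increaseByOne   (three messages)
        for _ in range(3):
            self.counter.send(self, "increase_by_one")


counter = MessageCounter()
wallet = Wallet(counter)
wallet.add_bonus()  # the triggering call itself is not routed through send()
print(counter.sent, wallet.money)  # prints "6 3"
```

The three `increaseByOne` sends plus the three `+` sends give 6. Running the example again yields the same count, unlike a time measurement.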

18

Does this really work?

What about this program?

MyClass >> main
    self waitForUserClick

We took scenarios from unit tests, which do not rely on user input

19

Experiment A

The number of sent messages is related to the average execution time over multiple executions

[Figure: message sends (0 to 400 x 10^6) against time (ms, 0 to 40 000); legend: application]

20

Experiment B

The number of sent messages is more stable than the execution time over multiple executions

Application    time taken (ms)    # sent messages    ctime%    cmessages%
Collections             32 317        334 359 691     16.67          1.05
Mondrian                33 719        292 140 717      5.54          1.44
Nile                    29 264        236 817 521      7.24          0.22
Moose                   25 021        210 384 157     24.56          2.47
SmallDude               13 942        150 301 007     23.93          0.99
Glamour                 10 216         94 604 363      3.77          0.14
Magritte                 2 485         37 979 149      2.08          0.85
PetitParser              1 642         31 574 383     46.99          0.52
Famix                    1 014          6 385 091     18.30          0.06
DSM                      4 012          5 954 759     25.71          0.17
ProfStef                   247          3 381 429      0.77          0.10
Network                    128          2 340 805      6.06          0.44
AST                         37            677 439      1.26          0.46
XMLParser                   36            675 205     32.94          0.46
Arki                        30            609 633      1.44          0.35
ShoutTests                  19            282 313      5.98          0.11

Average                                               13.95          0.61

Table 2. Applications considered in our experiment (second and third columns are averages over 10 runs)

Estimating the sample regression line. For the sake of completeness and to provide easy-to-reproduce results, we provide the necessary statistical material. Complementary information may be easily obtained from standard statistical books [11].

For the least squares regression line $y = a + b x$, we have the following formulas for estimating a sample regression line:

$$b = \frac{SS_{xy}}{SS_{xx}} \qquad a = \bar{y} - b\,\bar{x}$$

where $\bar{y}$ and $\bar{x}$ are the averages of all y values and x values, respectively. The y variable corresponds to the # sent messages column and x to the time taken (ms) column in the table given above.

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \qquad SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

where n is the number of samples (i.e., 16, the number of applications we have profiled). SS stands for “sum of squares.” The standard deviation of error for the sample data is obtained from:

$$s_e = \sqrt{\frac{SS_{yy} - b\,SS_{xy}}{n - 2}} \quad \text{where} \quad SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$

In the above formula, $n - 2$ represents the degrees of freedom for the regression model. Finally, the standard deviation of b is obtained with

$$s_b = \frac{s_e}{\sqrt{SS_{xx}}}$$
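The formulas above can be applied directly to the sixteen (time, messages) pairs of Table 2. The following Python sketch does so. `TABLE_2` and `fit_line` are illustrative names, the data is copied from the table, and no particular resulting values are claimed here.

```python
from math import sqrt

# (time taken in ms, number of sent messages), one pair per application of Table 2
TABLE_2 = [
    (32317, 334359691),  # Collections
    (33719, 292140717),  # Mondrian
    (29264, 236817521),  # Nile
    (25021, 210384157),  # Moose
    (13942, 150301007),  # SmallDude
    (10216, 94604363),   # Glamour
    (2485, 37979149),    # Magritte
    (1642, 31574383),    # PetitParser
    (1014, 6385091),     # Famix
    (4012, 5954759),     # DSM
    (247, 3381429),      # ProfStef
    (128, 2340805),      # Network
    (37, 677439),        # AST
    (36, 675205),        # XMLParser
    (30, 609633),        # Arki
    (19, 282313),        # ShoutTests
]


def fit_line(points):
    """Least squares fit of y = a + b x; returns (a, b, s_e, s_b)."""
    n = len(points)
    sum_x = sum(x for x, _ in points)
    sum_y = sum(y for _, y in points)
    ss_xy = sum(x * y for x, y in points) - sum_x * sum_y / n
    ss_xx = sum(x * x for x, _ in points) - sum_x ** 2 / n
    ss_yy = sum(y * y for _, y in points) - sum_y ** 2 / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * sum_x / n
    s_e = sqrt((ss_yy - b * ss_xy) / (n - 2))  # n - 2 degrees of freedom
    s_b = s_e / sqrt(ss_xx)
    return a, b, s_e, s_b


a, b, s_e, s_b = fit_line(TABLE_2)
print(f"y = {a:.0f} + {b:.1f} x   (s_e = {s_e:.0f}, s_b = {s_b:.1f})")
```

The slope `b` reads as the number of messages sent per millisecond of execution on the machine used for the measurements.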

21

[Figure: number of method invocations (0 to 10.0 x 10^6) against time (ms, 0 to 300); legend: method]

Experiment C

The number of sent messages is as useful as the execution time to identify an execution bottleneck

22

Compteur

23

CompteurMethod>> run: methodName with: args in: receiver
    | oldNumberOfCalls v |
    oldNumberOfCalls := self getNumberOfCalls.
    v := originalMethod valueWithReceiver: receiver arguments: args.
    numberOfCalls := (self getNumberOfCalls - oldNumberOfCalls) + numberOfCalls - 5.
    ^ v

New primitive in the VM
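The wrapper idea can be sketched in Python under one loud assumption: `sys.setprofile` stands in for the VM primitive that maintains the global call count. `CallCounter` and `CountingMethod` are hypothetical names, not part of Compteur. As in `run:with:in:`, the wrapper reads the global count before and after the original method and subtracts a constant for what the instrumentation itself adds (here 1, for the second read of the counter).

```python
import sys


class CallCounter:
    """Global count of Python-level calls (stand-in for the VM primitive)."""

    def __init__(self):
        self.total = 0

    def hook(self, frame, event, arg):
        if event == "call":  # fired on entry to every Python function
            self.total += 1

    def get_number_of_calls(self):
        return self.total


class CountingMethod:
    OVERHEAD = 1  # the second get_number_of_calls() call is itself counted

    def __init__(self, original, counter):
        self.original = original
        self.counter = counter
        self.number_of_calls = 0

    def run(self, *args):
        old = self.counter.get_number_of_calls()
        value = self.original(*args)
        delta = self.counter.get_number_of_calls() - old
        self.number_of_calls += delta - self.OVERHEAD
        return value


def increase(wallet):
    wallet["money"] += 1


def add_bonus(wallet):
    increase(wallet)
    increase(wallet)
    increase(wallet)


counter = CallCounter()
wrapped = CountingMethod(add_bonus, counter)
wallet = {"money": 0}
sys.setprofile(counter.hook)
try:
    wrapped.run(wallet)
finally:
    sys.setprofile(None)
print(wrapped.number_of_calls)  # add_bonus itself plus its three calls to increase
```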

24

Cost of the instrumentation: the constant 5 subtracted in run:with:in: accounts for the messages sent by the instrumentation itself

25

[Figure: instrumentation overhead (%) against execution time (ms, 0 to 40 000). (a) Linear scale, overhead from 0 to 3 000 %. (b) Logarithmic scale, overhead from 1 to 10 000 %.]

Contrasting Execution Sampling with Message Counting

No need for sampling

Independent from the execution environment

Stable measurements

26

Application #1

Counting messages in unit testing

CollectionTest>>testInsertion
    self assert: [ Set new add: 1 ] fasterThan: [ Set new add: 1; add: 2 ]

27

MondrianSpeedTest>> testLayout2
    | view1 view2 |
    view1 := MOViewRenderer new.
    view1 nodes: (Collection allSubclasses).
    view1 edgesFrom: #superclass.
    view1 treeLayout.

    view2 := MOViewRenderer new.
    view2 nodes: (Collection withAllSubclasses).
    view2 edgesFrom: #superclass.
    view2 treeLayout.

    self assertIs: [ view1 root applyLayout ] fasterThan: [ view2 root applyLayout ]
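A "faster than" assertion built on counts rather than on a stopwatch could look like the Python sketch below. It assumes `sys.setprofile` as the counting mechanism. `count_calls`, `assert_fewer_calls` and the pure-Python `TinySet` are hypothetical illustrations, not the Pharo API.

```python
import sys


def count_calls(block):
    """Run block() and return how many Python-level function calls it triggered."""
    calls = 0

    def hook(frame, event, arg):
        nonlocal calls
        if event == "call":  # ignore returns and C-level calls
            calls += 1

    sys.setprofile(hook)
    try:
        block()
    finally:
        sys.setprofile(None)
    return calls


def assert_fewer_calls(fast_block, slow_block):
    """Fail unless fast_block triggers strictly fewer calls than slow_block."""
    fast, slow = count_calls(fast_block), count_calls(slow_block)
    assert fast < slow, f"expected fewer calls, got {fast} vs {slow}"


class TinySet:
    """A deliberately naive set, written in pure Python so its calls are counted."""

    def __init__(self):
        self.items = []

    def includes(self, item):
        return item in self.items

    def add(self, item):
        if not self.includes(item):
            self.items.append(item)
        return self


# Counterpart of: self assert: [ Set new add: 1 ] fasterThan: [ Set new add: 1; add: 2 ]
assert_fewer_calls(lambda: TinySet().add(1), lambda: TinySet().add(1).add(2))
```

Because the comparison is on counts, the test gives the same verdict on a loaded machine, which a timing-based assertion cannot promise.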

28

Application #1

Counting messages in unit testing

29

Application #2

Differencing profiling

Comparison of two successive versions of a software system

(not in the paper)

30

Application #2

Differencing profiling

Comparison of two successive versions of Mondrian

(not in the paper)

More in the paper

Linear regression model

We replay some optimizations from our previous work

A methodology to evaluate profiler stability over multiple runs

All the material to reproduce the experiments

31

Summary

Counting method invocations is a more advantageous profiling technique in Pharo

Stable correlation between message sending and average execution time

32

Closing words

The same abstractions are used to profile applications written in C and in Java

Which objects are responsible for a slowdown?

Which arguments make a method call slow?

...

33

34

Counting messages as a proxy for average execution time

Alexandre Bergel

http://bergel.eu
