Adaptive Optimization with On-Stack Replacement
Stephen J. Fink, IBM T.J. Watson Research Center
Feng Qian (presenter), Sable Research Group, McGill University
http://www.sable.mcgill.ca
Motivation
Modern VMs use adaptive recompilation strategies
The VM replaces the entry in the dispatch table with the newly compiled code
Switching to the new code can only happen at the next invocation
On-stack replacement (OSR) allows the transformation to happen in the middle of a method's execution
What is On-stack Replacement?
Transfer execution from compiled code m1 to compiled code m2 even while m1 runs on some thread’s stack
[Figure: a thread's stack before and after OSR. Before, the frame and PC belong to m1; after, they belong to m2.]
Why On-Stack Replacement (OSR)?
Debugging optimized code via dynamic de-optimization [SELF-93]
Deferred compilation of cold paths in a method [SELF-91, HotSpot, Whaley 2001]
Promotion of long-run activations [SELF-93]
Safe invalidation for speculative optimization [HotSpot, SELF-91]
Related Work
Hölzle, Chambers, and Ungar (SELF-91, SELF-93): deferred compilation, de-optimization for debugging, promotion of long-running loops, safe invalidation [OOPSLA’91, PLDI’92, OOPSLA’94]
HotSpot server compiler [JVM’01]
Partial method compilation [OOPSLA’01]
OSR Challenges
Engineering Complexity
• How to minimize disruption to VM code base?
• How to constrain optimizations?
Policies for applying OSR
• How to make rational decisions for applying OSR?
Effectiveness
• How does OSR improve/constrain dataflow optimizations?
• How effective are online OSR-based optimizations?
Outline
Motivation
OSR Mechanism
Applications
Experimental Results
Conclusion
OSR Mechanism Overview
Extract compiler-independent state from a suspended activation for m1
Generate specialized code m2 for the suspended activation
Compile and transfer execution to the new code m2
[Figure: the three steps applied to a thread's stack. (1) Compiler-independent state is extracted from the frame of m1; (2) m2 is generated from that state; (3) the frame and PC of m1 are replaced by those of m2.]
JVM Scope Descriptor
Compiler-independent state of a running activation
Based on Java Virtual Machine Architecture
Five components:
1) Thread running the activation
2) Reference to the activation's stack frame
3) Program Counter (as a bytecode index)
4) Value of each local variable
5) Value of each stack location
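As a rough illustration, the five components could be held in a small Java value class. The class and field names below are hypothetical and are not taken from Jikes RVM. Values are restricted to int for brevity, which is an assumption: a real descriptor must also carry long, float, double, and reference values.

```java
import java.util.Arrays;

/** Illustrative container for the five scope-descriptor components (names are hypothetical). */
final class ScopeDescriptor {
    final String threadName;  // 1) thread running the activation (identified by name here)
    final long framePointer;  // 2) reference to the activation's stack frame
    final int bytecodeIndex;  // 3) program counter, as a bytecode index
    final int[] locals;       // 4) value of each local variable (int-only: an assumption)
    final int[] stack;        // 5) value of each stack location, bottom of stack first

    ScopeDescriptor(String threadName, long framePointer, int bytecodeIndex,
                    int[] locals, int[] stack) {
        this.threadName = threadName;
        this.framePointer = framePointer;
        this.bytecodeIndex = bytecodeIndex;
        this.locals = locals.clone(); // copies decouple the snapshot from the caller's arrays
        this.stack = stack.clone();
    }

    @Override
    public String toString() {
        return threadName + " @bci " + bytecodeIndex
                + " locals=" + Arrays.toString(locals)
                + " stack=" + Arrays.toString(stack);
    }

    public static void main(String[] args) {
        // Values from the deck's sum(100) example; 0L is a placeholder frame address.
        ScopeDescriptor d = new ScopeDescriptor(
                "MainThread", 0L, 16, new int[] {100, 1225, 50}, new int[] {50, 100});
        System.out.println(d);
    }
}
```

Nothing in this class depends on which compiler produced the running code, which is the point of a compiler-independent descriptor.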
JVM Scope Descriptor Example

Source:
    class C {
      static int sum(int c) {
        int y = 0;
        for (int i = 0; i < c; i++) {
          y += i;
        }
        return y;
      }
    }

Bytecode:
     0 iconst_0
     1 istore_1
     2 iconst_0
     3 istore_2
     4 goto 14
     7 iload_1
     8 iload_2
     9 iadd
    10 istore_1
    11 iinc 2 1
    14 iload_2
    15 iload_0
    16 if_icmplt 7
    19 iload_1
    20 ireturn

Suspend after 50 loop iterations (i = 50)

JVM Scope Descriptor:
    Running thread: MainThread
    Frame Pointer: 0xSomeAddress
    Program Counter: 16
    Local variables: L0(c) = 100; L1(y) = 1225; L2(i) = 50
    Stack Expressions: S0 = 50; S1 = 100
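The local-variable values in the example can be checked by hand: after 50 iterations y holds 0 + 1 + ... + 49 = 49 * 50 / 2 = 1225. The short sketch below replays the loop of C.sum up to the suspension point; the class and method names are hypothetical helpers introduced only for this check.

```java
final class SuspendPoint {
    /**
     * Replays the loop of C.sum until i reaches suspendAt (or c),
     * then returns the locals in slot order {c, y, i}.
     */
    static int[] localsAtSuspend(int c, int suspendAt) {
        int y = 0;
        int i = 0;
        while (i < suspendAt && i < c) {
            y += i;
            i++;
        }
        return new int[] {c, y, i};
    }

    public static void main(String[] args) {
        int[] l = localsAtSuspend(100, 50);
        // prints: L0(c) = 100; L1(y) = 1225; L2(i) = 50
        System.out.println("L0(c) = " + l[0] + "; L1(y) = " + l[1] + "; L2(i) = " + l[2]);
    }
}
```

At bytecode index 16 the two operands of if_icmplt have already been pushed by iload_2 and iload_0, which is why the stack expressions are S0 = 50 (i) and S1 = 100 (c).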
Extracting JVM Scope Descriptor
Trivial from interpreter
Optimizing compiler:
• Insert OSR Point (safe-point) instructions in initial IR
• OSR Point uses stack, local state needed to recover scope descriptor
• OSR Point is treated as a call, transfers control to exit block
• Aggregate OSR points to an OSR map when generating machine instructions
[Figure: step 1, compiler-independent state is extracted from the frame of m1.]
Specialized Code Generation
Prepend a specialized prologue to original bytecode
Prologue will
• Save JVM Scope Descriptor values into local variables
• Push JVM Scope Descriptor values onto the stack
• Jump to the desired program counter
[Figure: step 2, m2 is generated from the compiler-independent state.]
Transition Example

JVM Scope Descriptor:
    Running thread: MainThread
    Frame Pointer: 0xSomeAddress
    Program Counter: 16
    Local variables: L0(c) = 100; L1(y) = 1225; L2(i) = 50
    Stack Expressions: S0 = 50; S1 = 100

Specialized Bytecode:
       ldc 100
       istore_0
       ldc 1225
       istore_1
       ldc 50
       istore_2
       ldc 50
       ldc 100
       goto 16
     0 iconst_0
       ...
    16 if_icmplt 7
       ...
    20 ireturn

Original Bytecode:
     0 iconst_0
     1 istore_1
     2 iconst_0
     3 istore_2
     4 goto 14
     7 iload_1
     8 iload_2
     9 iadd
    10 istore_1
    11 iinc 2 1
    14 iload_2
    15 iload_0
    16 if_icmplt 7
    19 iload_1
    20 ireturn
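The prologue in the transition example follows mechanically from the scope descriptor. The sketch below emits it as text, in the slide's own notation; it is an illustration only, with hypothetical class and method names, int-only values, and the goto target left in the original bytecode numbering (a real implementation has to re-index the code after prepending the prologue).

```java
import java.util.ArrayList;
import java.util.List;

final class PrologueSketch {
    /** Emits: store each local, push each stack value (bottom first), jump to pc. */
    static List<String> prologue(int[] locals, int[] stack, int pc) {
        List<String> out = new ArrayList<>();
        for (int n = 0; n < locals.length; n++) {
            out.add("ldc " + locals[n]);
            // The JVM has one-byte istore_<n> forms only for slots 0..3.
            out.add(n <= 3 ? "istore_" + n : "istore " + n);
        }
        for (int value : stack) {
            out.add("ldc " + value);
        }
        out.add("goto " + pc); // target expressed in the original bytecode numbering
        return out;
    }

    public static void main(String[] args) {
        for (String insn : prologue(new int[] {100, 1225, 50}, new int[] {50, 100}, 16)) {
            System.out.println(insn);
        }
    }
}
```

For the example descriptor this yields the same nine instructions shown in the specialized bytecode, ending in goto 16.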
Transfer Execution to the New Code
Compile m2 as a normal method
System unwinds the stack frame of m1
Reschedule the thread to execute m2
By construction, executing specialized m2 sets up the target stack frame and continues execution
[Figure: step 3, the frame and PC of m1 on the thread's stack are replaced by those of m2.]
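A source-level analogy may help show why running the specialized method preserves semantics. In the sketch below, sumResumed is a hypothetical stand-in for m2 in the sum example: it starts with the captured locals and enters at the loop test, and it returns the same result as running sum from the beginning. Java has no goto, so the jump into the loop is modeled by starting the loop with the captured values.

```java
final class TransitionSketch {
    /** The original method from the example. */
    static int sum(int c) {
        int y = 0;
        for (int i = 0; i < c; i++) {
            y += i;
        }
        return y;
    }

    /**
     * Stand-in for specialized m2: the parameters play the role of the prologue's
     * stores into L0, L1, L2, and execution resumes at the loop test.
     */
    static int sumResumed(int c, int y, int i) {
        while (i < c) {
            y += i;
            i++;
        }
        return y;
    }

    public static void main(String[] args) {
        // Both print 4950: 1225 from the first 50 iterations plus 3725 from the rest.
        System.out.println(sum(100));
        System.out.println(sumResumed(100, 1225, 50));
    }
}
```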
Recovering from Inlining
Suppose optimizer inlines A -> B -> C:
[Figure: the single optimized frame for A yields three JVM scope descriptors, one each for A, B, and C (step 1); specialized methods A', B', and C' are generated from them (step 2); the stack is rebuilt with separate frames for A', B', and C' (step 3).]
Inlining Example
Original code:
    void foo() {
      bar();
      A: ...
    }
    void bar() {
      ...
      B: ...
    }

Suspend at B: in A -> B
Wipe stack to caller C and call foo_prime

Specialized code:
    foo_prime() {
      <specialized foo prologue>
      call bar_prime()
      goto A;
      ...
      bar();
      A: ...
    }
    bar_prime() {
      <specialized bar prologue>
      goto B;
      ...
      B: ...
    }
[Figure: the rebuilt stack holds the frame of caller C, then a frame for foo', then a frame for bar'.]
Implementation Details
Target compiler unmodified, except for ...
New pseudo-bytecodes
• Load literals (to avoid inserting new constants in constant pool)
• Load an address/bytecode index: JSR return address on stack
Fix bytecode indices for GC maps, exception tables, line number tables
Pros and Cons
Advantages
• mostly compiler-independent
• avoids multiple entry points in compiled code
• target compiler can exploit run-time constants
Disadvantage
• must compile target method twice (once for transition, once for next invocation)
Two OSR Applications
Promotion (see the paper for details)
• recompile a long-running activation
Deferred Compilation
• don't compile uncommon paths
• saves compile-time
[Figure: control-flow graph in which "if (foo is currently final)" leads to "x = 1;" on the common path and "x = foo();" on the uncommon path, both reaching "return x;". With deferred compilation, the uncommon block "x = foo();" becomes "trap/OSR;".]
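The control flow of the deferred-compilation example can be mimicked in plain Java. This is only a simulation under stated assumptions: "x = 1" is taken to be the inlined body of foo, the trap is modeled as an ordinary method call that counts invocations, and all names are hypothetical. A real VM would invalidate the compiled code and transfer the activation through OSR at that point.

```java
import java.util.function.IntSupplier;

final class DeferredPathSketch {
    int osrTraps = 0; // how many times the deferred (uncompiled) path was reached

    /** Stand-in for optimized code that compiled only the common path. */
    int run(boolean fooIsCurrentlyFinal, IntSupplier foo) {
        if (fooIsCurrentlyFinal) {
            return 1;            // common path: x = 1; return x;
        }
        return trapOsr(foo);     // uncommon path: replaced by trap/OSR
    }

    /** Simulated trap: record it, then let fallback code perform x = foo(). */
    private int trapOsr(IntSupplier foo) {
        osrTraps++;
        return foo.getAsInt();
    }

    public static void main(String[] args) {
        DeferredPathSketch code = new DeferredPathSketch();
        System.out.println(code.run(true, () -> 1));  // common path, no trap
        System.out.println(code.run(false, () -> 7)); // deferred path, one trap
        System.out.println(code.osrTraps);
    }
}
```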
Deferred Compilation
What's "infrequent"?
• static heuristics
• profile data
Adaptive recompilation decision is modified to consider OSR factors
Feng Qian: Class initialization is called by a class loader; when do we need OSR for it?
Online Experiments
Eager: (default) no deferred compilation
OSR/static: deferred compilation for CHA-based inlining only
OSR/edge counts: deferred compilation with online profile data and CHA-based inlining
Adaptive System Performance
First Run
[Figure: bar chart of performance relative to Eager (y-axis from 0.8 to 1.2, with an arrow marking the "better" direction) for compress, jess, db, javac, mpegaudio, mtrt, jack, and the geometric mean; series: OSR/edge counts and OSR/static.]
Adaptive System Performance
Best Run of 10
[Figure: bar chart of performance relative to Eager (y-axis from 0.8 to 1.2, with an arrow marking the "better" direction) for compress, jess, db, javac, mpegaudio, mtrt, jack, and the geometric mean; series: OSR/edge counts and OSR/static.]
OSR Activities (SPECjvm98, size 100, First Run)

| Benchmark | Promotions | Invalidations |
|---|---|---|
| compress | 3 | 6 |
| jess | 0 | 0 |
| db | 0 | 1 |
| javac | 0 | 10 |
| mpegaudio | 0 | 1 |
| mtrt | 0 | 5 |
| jack | 0 | 1 |
| total | 3 | 24 |
Summary
A new on-stack replacement mechanism
Online profile-directed deferred compilation
Evaluation of OSR applications in Jikes RVM
Conclusion
Should a VM implement OSR?
+ Can be done with minimal intrusion to code base
Modest gains from deferred compilation
No benefit for class-hierarchy-based inlining
+ Debugging with dynamic de-optimization valuable
TODO: More advanced speculative optimizations
Implementation is publicly available in Jikes RVM under the CPL for Linux/x86, Linux/PPC, and AIX/PPC:
http://www-124.ibm.com/developerworks/oss/jikesrvm/
Backup Slides
[Figure: Compile Rate, Offline Profile]
[Figure: Machine Code Size, Offline Profile]
[Figure: Code Quality, Offline Profile; an arrow marks the "better" direction]
Jikes RVM Analytic Recompilation Model
Define
  cur: current optimization level for method m
  $T_j$: expected future execution time at level j
  $C_j$: compilation cost at opt level j
Choose j > cur that minimizes $T_j + C_j$
If $T_j + C_j < T_{cur}$, recompile at level j
Assumptions
  Method will execute for twice its current duration
  Compilation cost and speedup based on offline average
  Sample data determines how long a method has executed
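The decision rule of the analytic model is small enough to state as code. The sketch below takes the expected future times and compilation costs as given arrays indexed by optimization level; how those estimates are derived is outside the sketch, and the class name, method name, and numbers are hypothetical.

```java
final class RecompilationModel {
    /**
     * Picks the level j > cur with the smallest futureTime[j] + compileCost[j];
     * returns it only if that sum is below futureTime[cur], otherwise returns cur.
     */
    static int chooseLevel(int cur, double[] futureTime, double[] compileCost) {
        int best = -1;
        double bestTotal = Double.POSITIVE_INFINITY;
        for (int j = cur + 1; j < futureTime.length; j++) {
            double total = futureTime[j] + compileCost[j];
            if (total < bestTotal) {
                bestTotal = total;
                best = j;
            }
        }
        return (best >= 0 && bestTotal < futureTime[cur]) ? best : cur;
    }

    public static void main(String[] args) {
        double[] t = {100, 60, 40, 35}; // made-up expected future times per level
        double[] c = {0, 10, 25, 50};   // made-up compilation costs per level
        // Totals for levels 1..3 are 70, 65, 85; 65 < 100, so this prints 2.
        System.out.println(chooseLevel(0, t, c));
    }
}
```

With a short-running method (for example future times of 10, 6, 4, 3.5 and the same costs) every total exceeds the current future time, so the method stays at its current level.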
Jikes RVM OSR Promotion Model
Given: outdated activation A of method m
Define
  L: last optimization level for any compiled version of m
  cur: current optimization level for activation A
  $T_{cur}$: expected future execution time of A at level cur
  $C_L$: compilation cost for method m at opt level L
  $T_L$: expected future execution time of A at level L
If $T_L + C_L < T_{cur}$, specialize A at level L
Assumption
  Outdated activation will execute for twice its current duration
Jikes RVM Recompilation Model, with Profile-Driven Deferred Compilation
Define
  cur: current optimization level for method m
  $T_j$: expected future execution time at level j
  $C_j$: compilation cost at opt level j
  $P$: percentage of code in m that profile data indicates was reached
Choose j > cur that minimizes $T_j + P \cdot C_j$
If $T_j + P \cdot C_j < T_{cur}$, recompile at level j
Assumptions
  Method will execute for twice its current duration
  Compilation cost and speedup based on offline average
  Sample data determines how long a method has executed
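Scaling the compilation cost by P can change the outcome of the cost-benefit test. The sketch below evaluates the inequality for a single candidate level, with P given as a fraction between 0 and 1; the names and numbers are hypothetical.

```java
final class DeferredCostModel {
    /** True when T_j + P * C_j < T_cur, with p expressed as a fraction in [0, 1]. */
    static boolean shouldRecompile(double tCur, double tJ, double cJ, double p) {
        return tJ + p * cJ < tCur;
    }

    public static void main(String[] args) {
        // Full compile cost: 6 + 1.0 * 6 = 12, not below 10, so no recompilation.
        System.out.println(shouldRecompile(10, 6, 6, 1.0));
        // Only half the code was reached: 6 + 0.5 * 6 = 9, below 10, so recompile.
        System.out.println(shouldRecompile(10, 6, 6, 0.5));
    }
}
```

The smaller the reached fraction of a method, the cheaper the deferred compile is assumed to be, so recompilation is triggered earlier.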
Offline Profile experiments
Collect "perfect" profile data offline
Mark any block never reached as "uncommon"
Defer compilation of "uncommon" blocks
Four configurations
• Ideal: deferred compilation trap keeps no state live
• Ideal-OSR: deferred compilation trap is valid OSR point
• Static-OSR: no profile data; defer compilation for CHA-based inlining; trap is valid OSR point
• Eager: (default) no deferred compilation
OSR Challenges
Engineering Complexity
• How to minimize disruption to VM code base?
• How to constrain optimizations?
Policies for applying OSR
• How to make rational decisions for applying OSR?
Effectiveness
• How does OSR improve/constrain dataflow optimizations?
• How effective are online OSR-based optimizations?
Recompilation Activities (First Run)

| Benchmark | O0 (with OSR) | O1 (with OSR) | O2 (with OSR) | total (with OSR) | O0 (without OSR) | O1 (without OSR) | O2 (without OSR) | total (without OSR) |
|---|---|---|---|---|---|---|---|---|
| compress | 17 | 7 | 2 | 26 | 13 | 9 | 6 | 28 |
| jess | 49 | 20 | 1 | 70 | 39 | 17 | 4 | 60 |
| db | 8 | 4 | 2 | 14 | 8 | 4 | 5 | 17 |
| javac | 171 | 19 | 2 | 192 | 168 | 16 | 3 | 187 |
| mpegaudio | 68 | 32 | 7 | 107 | 66 | 29 | 6 | 101 |
| mtrt | 57 | 14 | 3 | 74 | 61 | 11 | 3 | 75 |
| jack | 59 | 25 | 8 | 92 | 54 | 26 | 5 | 85 |
| total | 429 | 121 | 25 | 575 | 409 | 112 | 32 | 553 |
Summary of Study (1)
Engineering Complexity
How to minimize disruption to VM code base?
° Compiler-independent specialized source code to manage transition transparently
How to constrain optimizations?
° Model OSR Points like CALLS in standard transformations
Policies for applying OSR
How to make rational decisions for applying OSR?
° Simple modifications to cost-benefit analytic model
Summary of Study (2)
Effectiveness (for an implementation of online profile-directed deferred compilation)
How does OSR improve/constrain dataflow optimizations?
° small ideal benefit from dataflow merges (0.5 - 2.2%)
° negligible benefit when constraining optimization for potential invalidation
° negligible benefit for just CHA-based inlining
  patch points + splitting + pre-existence good enough
How effective are online OSR-based optimizations?
° average performance improvement of 2.6% on first run SPECjvm98 s=100
° individual benchmarks range from +8% to -4%
° negligible impact on steady state performance (best of 10 iterations)
° adaptive recompilation model relatively insensitive, compiles 4% more methods
Experimental Details
SPECjvm98, size 100
Jikes RVM 2.1.1
• FastAdaptiveSemispace configuration
• one virtual processor
• 500 MB heap
• separate VM instance for each benchmark
IBM RS/6000 Model F80
• six 500 MHz PowerPC 630s
• AIX 4.3.3
• 4 GB memory
Specialized Code Generation
Generate specialized m2 that sets up new stack frame and continues execution, preserving semantics.
Express the transition to new stack frame in source code (bytecode)
[Figure: step 2, m2 is generated from the compiler-independent state.]
Deferred Compilation
Don't compile "infrequent" blocks
[Figure: two control-flow graphs for "if (foo is currently final)". In the eager version the branches are "x = 1;" and "x = foo();", both reaching "return x;". In the deferred version the infrequent block "x = foo();" is replaced by "trap/OSR;".]
Experimental Results
Online profile-directed deferred compilation
Evaluation
• How much do OSR points improve optimization by eliminating merges?
• How much do OSR points constrain optimization?
• How effective is online profile-directed deferred compilation?
Online Experiments
Before optimizing, collect intraprocedural edge counters
Defer compilation at blocks that profile data says not reached
If deferred block reached
• Trigger OSR and deoptimize
• Invalidate compiled code
Modify analytic recompilation model
• Promotion from baseline to optimized
• Compile-time cost estimate modified according to profile data
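The deferral decision from edge counters can be sketched as a simple filter: a block whose incoming edges were never taken is treated as not reached. This is a minimal sketch with hypothetical names and a made-up data layout (a map from block name to its incoming edge counts); it assumes the method's entry block, which has no incoming edges, is left out of the map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EdgeProfileSketch {
    /** Returns the blocks whose incoming edge counts sum to zero, in map order. */
    static List<String> blocksToDefer(Map<String, int[]> incomingEdgeCounts) {
        List<String> deferred = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : incomingEdgeCounts.entrySet()) {
            long total = 0;
            for (int count : entry.getValue()) {
                total += count;
            }
            if (total == 0) {
                deferred.add(entry.getKey()); // never reached according to the profile
            }
        }
        return deferred;
    }

    public static void main(String[] args) {
        Map<String, int[]> counts = new LinkedHashMap<>();
        counts.put("loopBody", new int[] {49, 1});
        counts.put("errorHandler", new int[] {0});
        counts.put("slowPath", new int[] {0, 0});
        System.out.println(blocksToDefer(counts)); // prints [errorHandler, slowPath]
    }
}
```

If a deferred block is later reached at run time, the slide's remaining steps apply: trigger OSR, deoptimize, and invalidate the compiled code.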