FRANZ INC.

Optimizing User Code in Allegro CL 5.0

by Duane Rettig

– Introduction
– Optimization-related lisp architecture
– Undocumented tools in Allegro CL
– Optimization methodology
– Speed optimizations
– Space optimizations
– Speed vs space tradeoffs
– Lisp heap management

Optimization-related Lisp Architecture

• Static vs dynamic - Function dispatch

• Closure structure

• Foreign functions - entry-vec struct

• Disassemble extensions

Architecture: Static Vs Dynamic

• Pure static: absolute

• Relocatable

• Shared libraries

• Dynamic shared libraries

• Dynamic functions

Static Programs

• Absolute addresses
• Fast startup
• Fast running
• Large
• Not reconfigurable

[Diagram labels: Code, Data, 0, sbrk]

Relocatable Programs

• Not tied to a base address
• Slightly longer startup times
• Fast running
• Large
• Not reconfigurable

[Diagram labels: Code, Data, Reloc]

Programs that Use Shared Libraries

• Usually need relocation
• Smaller than with non-shared libraries
• Faster startup times
• Medium speed; may start slow and gain speed after first use
• Not reconfigurable

[Diagram labels: Main, Lib 1, Lib 2, Lib 3]

Programs that Use Dynamic Shared Libraries

• May be absolute or relocatable
• May be very small
• Very fast startup
• Medium speed, amortized over lib loading
• Reconfigurable

[Diagram labels: Main, Lib 1, Lib 2, Lib 3]

Programs that Dynamically Define Functions

• May be absolute or relocatable
• May be very small
• Very fast startup
• Medium speed, amortized over function definitions
• Extremely reconfigurable

[Diagram labels: Main, Lisp lib, Heap, Lib 1, Lib 2, Functions]
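As a small illustration of this last category, portable Common Lisp can build and compile a function while the program is running. This is only a sketch; `make-scaler` and `*triple*` are hypothetical names chosen for the example, not part of Allegro CL.

```lisp
;; Hypothetical helper: construct a lambda expression from data at run
;; time and compile it inside the running image (no relink, no restart).
(defun make-scaler (factor)
  (compile nil `(lambda (x) (* ,factor x))))

;; The new function is an ordinary first-class object.
(defparameter *triple* (make-scaler 3))

(funcall *triple* 14)
```

The cost of the `compile` call is paid once per definition, which is the amortization the slide refers to.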

Lisp Data Availability

[Diagram labels, grouped as they appear:
 function: T, Flgs, Size, Start, Hash, Name, Code, Formals, Aux1, Aux2, Locals, Const 0 … Const N
 codevector: T, Size, Entry, Instructions…, Return
 Glob table: Symbols, Lvalues, c-values, Nil, R/S funcs
 pointers: nil, func, Glob table, function, codevector, pc]

C Data Availability

[Diagram labels: lib1 and lib2, each with a GOT and Func1, Func2, Func3; six "address / GOT" pairs]

Registers

        Alpha    HP       X86      Mips    Sparc      68K      RS6000
GOT     r29      r27/r19  ebx      r28     l7         static   r2
Args    r16-r21  r26-r23  eax,edx  r4-r11  i/o 0-4,5  (d0-d3)  r3-r10
NIL     r14      r18      edi      r23     g4         d6       r15
Tramp   r9       r9       edi      r21     g4         a4       r21
Func    r10      r17      esi      r20     o5 / i5    a5       r13
Name    r15      r8       ebx      r30     g2         a3       r20
Count   r4       r4       ecx      r3      g3         d7       r16

Calling Sequence: Lisp

• Caller:
– Store caller-saves registers
– Set up arguments and count
– Load name register
– Call trampoline *
– Restore caller-saves registers

• Callee:
– Establish stack
– Save function
– Execute body
– Restore stack
– Restore caller’s function
– Return

Calling Sequence: C

• Caller:
– Store caller-saves registers
– Set up args (no count)
– Store caller’s context
– Call function, function desc, or stub
– Restore caller’s context
– Restore caller-saves registers

• Callee:
– Set up callee’s context
– Establish stack
– Store callee-saves registers
– Execute body
– Restore callee-saves registers
– Restore stack
– Return

Lisp’s Symbol Trampoline

• Required:
– Get function register from name register
– Get start address from function
– Jump to start

• Optional:
– Save argument registers
– Check for stack overflow
– Jump to call-count code
– Jump to single-step code
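One visible consequence of calling through the name is late binding: a compiled caller picks up a new definition of its callee without being recompiled. The sketch below shows this in portable Common Lisp; `demo-callee` and `demo-caller` are hypothetical names, and the `notinline` declaration is there so that a compiler is not permitted to bypass the named call.

```lisp
;; Keep calls to DEMO-CALLEE going through its name.
(declaim (notinline demo-callee))

(defun demo-callee (x) (* 2 x))
(defun demo-caller (x) (list (demo-callee x)))
(compile 'demo-caller)

;; Result with the first definition of the callee.
(defparameter *before* (demo-caller 3))

;; Redefine the callee; the compiled caller is left untouched.
(defun demo-callee (x) (+ 100 x))

;; The same compiled caller now reaches the new definition.
(defparameter *after* (demo-caller 3))
```

The optional trampoline steps (call counting, single stepping) can hook in at the same point, since every named call passes through it.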

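Architecture: Closures

[Diagram labels, grouped as they appear:
 first closure: T, Flgs, Size, Start, Shared, Private; vector header: T, Size
 its function: T, Flgs, Size, Start, Hash, Name, Code, Formals, Aux1, Aux2, Locals, Const 0 … Const N
 second closure: T, Flgs, Size, Start, Shared, Private 0, Private 1 … Private N
 its function: T, Flgs, Size, Start, Hash, Name, Code, Formals, Aux1, Aux2, Locals, Const 0 … Const N
 captions: External vec, Internal vec]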

Architecture: Foreign Functions

• Entry-vec struct

(in-package :excl)

(defstruct (entry-vec (:type (vector excl::foreign (*)))
                      (:constructor make-entry-vec-boa ()))
  name             ; entry point name
  (address 0)      ; jump address for foreign code
  (handle 0)       ; shared-lib handle
  (flags 0)        ; ep-* flags
  (alt-address 0)  ; sometimes holds the real func addr
  )

Architecture: Foreign Functions

• Entry-vec flags

;; Entry-point constants:

(defconstant excl::ep-call-semidirect 1)       ; Real address stored in alt-address slot
(defconstant excl::ep-never-release 2)         ; Never release the heap
(defconstant excl::ep-always-release 4)        ; Always release the heap
(defconstant excl::ep-release-when-ok 8)       ; Release the heap unless without-interrupts

(defconstant excl::ep-tramp-calls #x70)        ; Make calls through special trampolines
(defconstant excl::ep-tramp-shift 4)

(defconstant excl::ep-variable-address #x100)  ; Entry-point contains address of C var
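To make the bit layout concrete, here is a minimal sketch that decodes a flags word using the values listed above. The `+ep-...+` constants and `decode-ep-flags` are hypothetical local names (the real constants live in the `excl` package), and reading the `#x70` mask as a 3-bit field shifted right by 4 is an assumption based on the mask and shift values, not a documented interpretation.

```lisp
;; Local copies of the values shown on the slide (hypothetical names).
(defconstant +ep-call-semidirect+  1)
(defconstant +ep-never-release+    2)
(defconstant +ep-always-release+   4)
(defconstant +ep-release-when-ok+  8)
(defconstant +ep-tramp-calls+      #x70)
(defconstant +ep-tramp-shift+      4)
(defconstant +ep-variable-address+ #x100)

(defun decode-ep-flags (flags)
  "Return a plist describing which entry-point bits are set in FLAGS."
  (list :call-semidirect  (logtest flags +ep-call-semidirect+)
        :never-release    (logtest flags +ep-never-release+)
        :always-release   (logtest flags +ep-always-release+)
        :release-when-ok  (logtest flags +ep-release-when-ok+)
        ;; Assumption: the masked bits form a small integer field.
        :tramp-field      (ash (logand flags +ep-tramp-calls+)
                               (- +ep-tramp-shift+))
        :variable-address (logtest flags +ep-variable-address+)))

;; Example: semidirect call, tramp field 2, C-variable address.
(decode-ep-flags #x121)
```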

Architecture: Foreign Functions

[Diagram labels: Entry vec (T, Size, Name, Address, Handle, Flags, Alt-address); “foo”; foo(); missing_entry_point; bind_and_call; call_semidirect]

Architecture: Foreign Functions

[Diagram labels: excl::.saved-entry-points. table; five entry vecs; names “foo”, “bar”, “bas”]

Architecture: Disassemble

– Extensions
• non-lisp names
• :absolute
• :addr-list
• :find-callee
• :find-pc
• :references-only
• :recurse
• :target-class

Disassembling non-lisp names

• A string representing a C entry point
– Allows for viewing of non-lisp assembler code
– Some instructions are interpreted automatically

(disassemble "qcons")
;; disassembly of #("qcons" 1074935746)

;; code start: #x401237c2:
   0: 8b 8f ff fd  movl ecx,[edi-513]   ; C_GSGC_NEWCONSLOC
      ff ff
   6: 3b 8f 03 fe  cmpl ecx,[edi-509]   ; C_GSGC_NEWCONSEND
      ff ff
  12: 0f 84 3c 1e  jz 7758              ; cons+0
      00 00
  18: 89 41 0f     movl [ecx+15],eax
  21: 89 c8        movl eax,ecx
  23: 89 50 13     movl [eax+19],edx
  26: 83 87 ff fd  addl [edi-513],$8    ; C_GSGC_NEWCONSLOC
      ff ff 08
  33: c3           ret

excl::*c-symbol-table* build:

• dirty (excl::*rebuild-c-symbol-table-p* is non-nil):
– at lisp start
– after load or unload of shared library

• rebuilt:
– for disassemble of a string
– for profiler analysis
– for “:zoom :all t :verbose t” invocation

(inspect excl::*c-symbol-table*)
A simple T vector (3538) @ #x2039c352
   0-> cstruct (2) = #("unidentified" 0)
   1-> cstruct (2) = #("_init" 134514576)
   2-> cstruct (2) = #("strcpy" 134514600)
   3-> cstruct (2) = #("dlerror" 134514616)
   4-> cstruct (2) = #("getenv" 134514632)
   5-> cstruct (2) = #("fgets" 134514648)
   6-> cstruct (2) = #("perror" 134514664)
   7-> cstruct (2) = #("readlink" 134514680)
   8-> cstruct (2) = #("malloc" 134514696)
   9-> cstruct (2) = #("malloc" 134514696)
  10-> cstruct (2) = #("_lxstat" 134514712)
  11-> cstruct (2) = #("isspace" 134514728)
  12-> cstruct (2) = #("_xstat" 134514744)
  13-> cstruct (2) = #("__libc_init" 134514760)
  14-> cstruct (2) = #("strrchr" 134514776)
  15-> cstruct (2) = #("fprintf" 134514792)
  16-> cstruct (2) = #("fprintf" 134514792)
  17-> cstruct (2) = #("strcat" 134514808)
  18-> cstruct (2) = #("chdir" 134514824)
  19-> cstruct (2) = #("strncmp" 134514840)
  ...
3537-> cstruct (2) = #("__bss_start" 1075102200)

USER(1): (defun foo (x) (list (bar x)))
FOO
USER(2): (compile 'foo)
Warning: While compiling these undefined functions were referenced: BAR.
FOO
NIL
NIL
USER(3):

• (simple function for next examples)

(disassemble 'foo)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x203dcddc:
   0: 55          pushl ebp
   1: 8b ec       movl ebp,esp
   3: 56          pushl esi
   4: 83 ec 24    subl esp,$36
   7: 83 f9 01    cmpl ecx,$1
  10: 74 02       jz 14
  12: cd 61       int $97             ; trap-argerr
  14: d0 7f a3    sarb [edi-93],$1    ; C_INTERRUPT
  17: 74 02       jz 21
  19: cd 64       int $100            ; trap-signal-hit
  21: 8b 5e 32    movl ebx,[esi+50]   ; BAR
  24: b1 01       movb cl,$1
  26: ff d7       call *edi
  28: 8b d7       movl edx,edi
  30: ff 57 2b    call *[edi+43]      ; QCONS
  33: c9          leave
  34: 8b 75 fc    movl esi,[ebp-4]
  37: c3          ret

Disassembling with absolute addresses

• :absolute
– Allows debug at absolute addresses
– Warning: addresses may not be in sync after gc, though per-disassemble consistency is maintained

(disassemble 'foo :absolute t)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR
204cb5a4: 55          pushl ebp
204cb5a5: 8b ec       movl ebp,esp
204cb5a7: 56          pushl esi
204cb5a8: 83 ec 24    subl esp,$36
204cb5ab: 83 f9 01    cmpl ecx,$1
204cb5ae: 74 02       jz 0x204cb5b2
204cb5b0: cd 61       int $97             ; trap-argerr
204cb5b2: d0 7f a3    sarb [edi-93],$1    ; C_INTERRUPT
204cb5b5: 74 02       jz 0x204cb5b9
204cb5b7: cd 64       int $100            ; trap-signal-hit
204cb5b9: 8b 5e 32    movl ebx,[esi+50]   ; BAR
204cb5bc: b1 01       movb cl,$1
204cb5be: ff d7       call *edi
204cb5c0: 8b d7       movl edx,edi
204cb5c2: ff 57 2b    call *[edi+43]      ; QCONS
204cb5c5: c9          leave
204cb5c6: 8b 75 fc    movl esi,[ebp-4]
204cb5c9: c3          ret

Disassemble support for the profiler

• :addr-list
– Marks a specific instruction
– Allows for exact profiler hits to be recorded

(disassemble 'foo :addr-list -10)
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x204cb5a4:
               0: 55          pushl ebp
               1: 8b ec       movl ebp,esp
               3: 56          pushl esi
               4: 83 ec 24    subl esp,$36
               7: 83 f9 01    cmpl ecx,$1
 stopped -->  10: 74 02       jz 14
              12: cd 61       int $97             ; trap-argerr
              14: d0 7f a3    sarb [edi-93],$1    ; C_INTERRUPT
              17: 74 02       jz 21
              19: cd 64       int $100            ; trap-signal-hit
              21: 8b 5e 32    movl ebx,[esi+50]   ; BAR
              24: b1 01       movb cl,$1
              26: ff d7       call *edi
              28: 8b d7       movl edx,edi
              30: ff 57 2b    call *[edi+43]      ; QCONS
              33: c9          leave
              34: 8b 75 fc    movl esi,[ebp-4]
              37: c3          ret

(disassemble 'foo :addr-list '(11 (#x204cb5ae . 4) (#x204cb5b9 . 4) (#x204cb5c5 . 3)))
;; disassembly of #<Function FOO>
;; formals: X
;; constant vector:
0: BAR

;; code start: #x204cb5a4:
            0: 55          pushl ebp
            1: 8b ec       movl ebp,esp
            3: 56          pushl esi
            4: 83 ec 24    subl esp,$36
            7: 83 f9 01    cmpl ecx,$1
 4 (36%)   10: 74 02       jz 14
           12: cd 61       int $97             ; trap-argerr
           14: d0 7f a3    sarb [edi-93],$1    ; C_INTERRUPT
           17: 74 02       jz 21
           19: cd 64       int $100            ; trap-signal-hit
 4 (36%)   21: 8b 5e 32    movl ebx,[esi+50]   ; BAR
           24: b1 01       movb cl,$1
           26: ff d7       call *edi
           28: 8b d7       movl edx,edi
           30: ff 57 2b    call *[edi+43]      ; QCONS
 3 (27%)   33: c9          leave
           34: 8b 75 fc    movl esi,[ebp-4]
           37: c3          ret

Disassemble support for the debugger

• :find-callee
– Returns information given a relative pc

• :find-pc
– Returns information about instruction sequencing, or prints an instruction

• :references-only
– Returns references from function or glob table

USER(22): (disassemble 'foo :find-callee 26)
BAR
:CONST
-1
USER(23): (disassemble 'foo :find-callee 28)
BAR
:CALL
0
USER(24): (disassemble 'foo)
;; disassembly of #<Function FOO>
 ...
  14: d0 7f a3    sarb [edi-93],$1    ; C_INTERRUPT
  17: 74 02       jz 21
  19: cd 64       int $100            ; trap-signal-hit
  21: 8b 5e 32    movl ebx,[esi+50]   ; BAR
  24: b1 01       movb cl,$1
  26: ff d7       call *edi
  28: 8b d7       movl edx,edi
  30: ff 57 2b    call *[edi+43]      ; QCONS
  33: c9          leave
  34: 8b 75 fc    movl esi,[ebp-4]
  37: c3          ret
USER(25):

USER(28): (disassemble 'foo :find-pc 14)
14
17
NIL
NIL
USER(29): (disassemble 'foo :find-pc 17)
17
19
21
:BCC
USER(30): (disassemble 'foo :find-pc '(:print 17))
  17: 74 02       jz 21
USER(31): (disassemble 'foo :find-pc '(:print 21))
  21: 8b 5e 32    movl ebx,[esi+50]   ; BAR
USER(32):

USER(26): (disassemble 'foo :references-only t)
(SYSTEM::QCONS BAR SYSTEM::C_INTERRUPT)
USER(27):

Miscellaneous Disassembler modes

• :recurse
– Useful to control the amount of output

• :target-class
– Used only in cross-porting

Undocumented Tools in Allegro CL

• excl::get-objects (att #42-1)

• excl::get-references (typo in your notes)

• excl::create-box/excl::box-value (att #42-2)

• excl::atomically
– Allows compiler to guarantee atomic body

• Autoloading facilities (described later)

Atomic forms

– Generally a form is atomic if it has
• no interrupt-checks
• no consing
• no non-atomic forms or calls

– Use excl::atomically like progn; if it compiles, the body is atomic

– Atomic primcalls:
• gsgc-setf-protect gsgc-set-protect fd-stack-real qcar qcdr

– Atomic calls:
• error excl::.error excl::eq-hash-fcn excl::eql-not-eq excl::get_2op-atomic excl::sxhash-if-fast excl::symbol-hash-fcn

Optimization Methodology

• Get it right first

• Profile it
– The time macro
– The Allegro CL profiler

• Hit the high cost items
– Implementations
– Algorithms
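For the measurement step, a rough sketch using only standard Common Lisp is shown below. `run-time-seconds` and `sum-to` are hypothetical names for this example; the `time` macro itself prints its report in an implementation-specific format, so it is shown only as a comment.

```lisp
(defun run-time-seconds (thunk)
  "Call THUNK and return the run time it consumed, in seconds."
  (let ((start (get-internal-run-time)))
    (funcall thunk)
    (/ (- (get-internal-run-time) start)
       internal-time-units-per-second)))

(defun sum-to (n)
  "Stand-in workload: add the integers below N."
  (let ((acc 0))
    (dotimes (i n acc)
      (incf acc i))))

;; Interactive use; the report format depends on the implementation:
;; (time (sum-to 1000000))

;; Programmatic use, e.g. to compare two candidate implementations:
(run-time-seconds (lambda () (sum-to 1000000)))
```

Numbers like these only say where to look; the profiler is the better tool for finding which callee is responsible.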

Speed Optimizations

– Profiling
– Efficient compilation
– Immediate compilation
– Foreign function optimizations
– Hash tables
– CLOS optimizations
– Miscellaneous optimizations

Speed Optimizations: Profiling

• Always compile top-level test functions

• Example profile run (att #48-1)

• Do not use time macro with profiler

• Avoid simultaneous time/call-count profiles

• When using time macro, beware of new closures

Time macro: extra closures

(defun test-driver (n)
  (time (dotimes (i n)
          (test-it))))

• This driver is not as simple as it looks!
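The slide does not spell out the problem. One reading, given its title, is that `time` may wrap its body in a closure, so the loop and the captured `n` end up in a freshly created closure rather than in the driver itself. Under that assumption, a possible workaround is to keep the loop in its own compiled function and time a plain call. `test-it`, `run-tests`, and `timed-driver` below are placeholders.

```lisp
(defun test-it ()
  ;; Stand-in for the code under test.
  (+ 1 2))

(defun run-tests (n)
  ;; The loop lives in an ordinary compiled function...
  (dotimes (i n n)
    (test-it)))

(defun timed-driver (n)
  ;; ...so TIME only has to wrap a single function call.
  (time (run-tests n)))

(compile 'test-it)
(compile 'run-tests)
(compile 'timed-driver)
```

Calling `(timed-driver 100000)` prints the timing report and returns the value of `run-tests`.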

Speed Optimizations: Efficient Compilation

• :explain

• excl::atomically

• excl:add-typep-transformer (att #50-1,2)
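As a sketch of how the first item is typically used: type and optimization declarations give the compiler what it needs to open-code arithmetic, and in Allegro CL an `:explain` declaration asks the compiler to report where it could not. The function `dot3` is hypothetical, and the `:explain` line is guarded so the example still loads in other implementations; the exact `:explain` options available in a given release should be checked against its documentation.

```lisp
(defun dot3 (a b)
  ;; Promise element types so float arithmetic can be open-coded.
  (declare (optimize (speed 3) (safety 1))
           (type (simple-array double-float (3)) a b))
  ;; Allegro-only: ask the compiler to explain calls and type inferences.
  #+allegro (declare (:explain :calls :types))
  (+ (* (aref a 0) (aref b 0))
     (* (aref a 1) (aref b 1))
     (* (aref a 2) (aref b 2))))

(defun make-vec3 (x y z)
  "Build a 3-element double-float vector suitable for DOT3."
  (make-array 3 :element-type 'double-float
                :initial-contents (list x y z)))

(dot3 (make-vec3 1d0 2d0 3d0) (make-vec3 1d0 2d0 3d0))
```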

Speed Optimizations: Immediate Compilation

• Inlining and unboxing

• Immediate-args

• defun-immediate (att #51-1,2,3)

Speed Optimizations: Foreign Functions

• Call-direct (att #52-1,2)

• comp:list-call-direct-possibilities

USER(2): (pprint (comp:list-call-direct-possibilities))

(("arg types:"
  (:FOREIGN-ADDRESS :LISP FIXNUM INTEGER SINGLE-FLOAT DOUBLE-FLOAT
   :SINGLE-FLOAT-NO-PROTO SIMPLE-STRING CHARACTER))
 ("[also any one-dimensional simple-array is ok as an arg]")
 ("return types:"
  (SINGLE-FLOAT DOUBLE-FLOAT :SINGLE-FLOAT-FROM-DOUBLE :FIXNUM FIXNUM
   :MACHINE-INTEGER :INTEGER INTEGER :UNSIGNED-INTEGER :FOREIGN-ADDRESS
   :CHARACTER CHARACTER :LISP :VOID :BOOLEAN BOOLEAN))
 ("unboxed machine-integer return types:"
  (FIXNUM INTEGER :FIXNUM :INTEGER :MACHINE-INTEGER))
 ("bad return types for unboxing [some because they are keywords]:"
  (:FIXNUM :INTEGER :UNSIGNED-INTEGER :FOREIGN-ADDRESS :CHARACTER)))

Speed Optimizations: Hash Tables

• Make-hash-table extensions

• Rehash issues

• excl::*default-rehash-size*

• excl::*allocate-large-hash-table-vectors-in-old-space*

• Convert-to-internal-fspec (example use of weak-key, sans value ht)
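On the rehash point, one portable way to limit rehashing is to size the table when it is created, using the standard `:size` argument, rather than letting it grow step by step. A minimal sketch, with `build-index` as a hypothetical example function:

```lisp
(defun build-index (keys)
  "Map each key to its position; the table is sized once, up front."
  (let ((table (make-hash-table :test #'eql
                                :size (max 1 (length keys)))))
    (loop for key in keys
          for position from 0
          do (setf (gethash key table) position))
    table))

(build-index '(alpha beta gamma))
```

How much growth a given `:size` actually avoids is implementation-dependent, so it is worth confirming with the profiler.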

Hash-table Architecture

[Diagram labels: T, Size; T, Size; T, F, …; Key…; Table]

Hash Table extensions

• :test extensions
• :values [t] (may be nil or :weak)
• :weak-keys [nil] (may be non-nil)
• :hash-function [nil] (or fboundp symbol)
– Must return 16-bit value
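A hedged sketch of how one of these keywords might be used: a table that only records which objects have been seen, with weak keys on Allegro CL so that the table alone does not keep objects alive. Only `:weak-keys` from the list above is used; `:values nil` is a further option not shown here. The function names are hypothetical, and the non-Allegro branch falls back to an ordinary `eq` table.

```lisp
(defun make-seen-table ()
  "A table used only to remember which objects have been seen."
  ;; Allegro-only keyword taken from the list above.
  #+allegro (make-hash-table :test #'eq :weak-keys t)
  ;; Portable fallback: an ordinary EQ table (strong keys).
  #-allegro (make-hash-table :test #'eq))

(defun mark-seen (object table)
  (setf (gethash object table) t)
  object)

(defun seen-p (object table)
  ;; The second value of GETHASH says whether the key is present,
  ;; independent of what is stored as the value.
  (nth-value 1 (gethash object table)))
```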

Speed Optimizations: Hash Tables

• Make-hash-table extensions
• Rehash issues
• excl::*default-rehash-size*
• excl::*allocate-large-hash-table-vectors-in-old-space*
• Convert-to-internal-fspec
– example of weak-key, sans value hash-table (att #57-1)

Speed Optimizations: CLOS Optimizations

• Discriminators (att #58-1)

• For accessors, stay with unique names

• Outside of methods, use generic functions; inside of methods which specialize on the class, use slot-value

• Stay with standard-method-combination

• Avoid methods on slot-value-using-class
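The accessor advice can be sketched as follows. The class and function names are hypothetical; the point is only the split between `slot-value` inside a method specialized on the class and accessor generic functions everywhere else, with accessor names that are unique to the class.

```lisp
(defclass demo-point ()
  ((x :initarg :x :accessor demo-point-x)   ; accessor names unique to this class
   (y :initarg :y :accessor demo-point-y)))

(defgeneric norm-squared (p))

(defmethod norm-squared ((p demo-point))
  ;; Inside a method specialized on DEMO-POINT: use SLOT-VALUE.
  (+ (expt (slot-value p 'x) 2)
     (expt (slot-value p 'y) 2)))

(defun shift-right (p dx)
  ;; Outside such methods: go through the accessor generic functions.
  (incf (demo-point-x p) dx)
  p)
```

Everything here uses standard method combination and defines no methods on `slot-value-using-class`, in line with the last two bullets.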

Speed Optimizations: Misc Optimizations

• Applyn
• New array dimensions optimizations
• Know what functions are optimized:
– compiler macros (use compiler-macro-function)
– compiler transforms (att #59-1)
– compiler inliners (att #59-2)
• Fixed-index (att #59-3,4)
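Checking for compiler macros uses only standard Common Lisp. The sketch below defines a small compiler macro and then queries it with `compiler-macro-function`; `demo-square` and `optimized-by-compiler-macro-p` are hypothetical names.

```lisp
(defun demo-square (x) (* x x))

(define-compiler-macro demo-square (&whole form x)
  ;; Fold the call at compile time when the argument is a literal number;
  ;; otherwise decline by returning the original form.
  (if (numberp x) (* x x) form))

(defun optimized-by-compiler-macro-p (name)
  "True if NAME has a compiler macro in the global environment."
  (and (compiler-macro-function name) t))

;; The expander can be called directly to see what it would produce.
(funcall (compiler-macro-function 'demo-square) '(demo-square 7) nil)
```

The same query works on system functions, which is the use the slide suggests.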

Space Optimization Vs Consing Reduction

• Space has to do with overall size of the program; consing has to do with how much allocation/deallocation is occurring
– The room function measures space
– The time macro measures consing

Space Optimizations

• Tools for space measurement

• Closures

• Presto

• Autoloading

• Foreign functions space considerations

• The .Pll file; strings and codevectors

• Generate-application

Tools for Space Measurement

• (room t) and excl:print-type-counts

• excl::get-objects

• excl::get-references

• inspect
– Use :raw mode for low-level inspection

Space Optimization: Using closures

• Use of closures is smaller than expanding macros

• Closures are small themselves

• Closures can be precompiled in a file; late-binding can occur without run-time compilation
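A small sketch of the contrast: a macro copies its expansion into every use site, while a closure factory shares one compiled body among many small closure objects that differ only in their captured values. All names here are hypothetical.

```lisp
;; Macro version: each use site receives its own copy of the expansion.
(defmacro clamp-inline (value low high)
  `(max ,low (min ,high ,value)))

;; Closure version: one compiled body; each closure holds only LOW and HIGH.
(defun make-clamper (low high)
  (lambda (value)
    (max low (min high value))))

;; The bounds are bound late, at run time, with no run-time compilation.
(defparameter *clamp-percent* (make-clamper 0 100))
(defparameter *clamp-octet*   (make-clamper 0 255))

(funcall *clamp-percent* 150)
```

Since `make-clamper` can be compiled with `compile-file`, the closures it returns are fully compiled even though their bounds are chosen when the program runs.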

Space Optimization: Presto

• Fully automatic
• Good for large (> 10 MB) programs
• Not useful for small (< 2 MB) programs
• Beware of delivery
– Use sys:presto-build-lib to build new bundle file
• Beware of use during development
– Compile-file/load sequence can cause corruption

Space Optimizations: Autoloading

• Function (macro/generic-function)
– excl::def-autoload-function
– excl::autoload-it
– excl::autoloadp

• Package
– excl::*autoload-package-name-alist*

• Class
– excl::*autoload-find-class-alist*

// (pprint excl::*autoload-package-name-alist*)
(("comp" . :COMPILER-S) ("compiler" . :COMPILER-S) ("cltl1" . :CLTL1)
 ("db" . :DEBUG) ("debug" . :DEBUG) ("debugger" . :DEBUG)
 ("ds" . :DEFSYS-S) ("defsys" . :DEFSYS-S) ("defsystem" . :DEFSYS-S)
 ("fla" . :FLAVORS) ("flavors" . :FLAVORS)
 ("ff" . :FOREIGN-S) ("foreign-functions" . :FOREIGN-S)
 ("inspect" . :INSPECT) ("lep" . :LEP)
 ("prof" . :PROF-S) ("profiler" . :PROF-S)
 ("acl-socket" . :SOCK-S) ("socket" . :SOCK-S) ("scm" . :SCM)
 ("multiprocessing" . :PROCESS-S) ("mp" . :PROCESS-S)
 ("xref" . :XREF-S) ("cross-reference" . :XREF-S))
NIL
// (pprint excl::*autoload-find-class-alist*)
((BROADCAST-STREAM . :STREAMA) (CONCATENATED-STREAM . :STREAMA)
 (ECHO-STREAM . :STREAMA) (SYNONYM-STREAM . :STREAMA)
 (TWO-WAY-STREAM . :STREAMA))
NIL
//

Space Optimizations: Foreign Functions

• Use ff:def-foreign-call!
• Avoid code that pulls in the compatibility packages
– ffcompat
– defctype
• Avoid requiring aclwffi (can’t do this yet on CG/IDE programs)

(pprint (macroexpand '(ff:def-foreign-call foo (x) :arg-checking nil)))

(PROGN
  (EVAL-WHEN (:COMPILE-TOPLEVEL)
    (EXCL::CHECK-LOCK-DEFINITIONS-COMPILE-TIME
      'FOO 'FUNCTION 'FOREIGN-FUNCTIONS:DEF-FOREIGN-CALL (FBOUNDP 'FOO))
    (PUSH 'FOO EXCL::.FUNCTIONS-DEFINED.))
  (EVAL-WHEN (COMPILE LOAD EVAL)
    (REMPROP 'FOO 'SYSTEM::DIRECT-FF-CALL))
  (SETF (FDEFINITION 'FOO)
        (EXCL::GET-FF-N-ARGS-CLOSURE
          (EXCL::DETERMINE-FOREIGN-ADDRESS '("foo" :LANGUAGE :C) 2 NIL)
          'INTEGER
          '(0)))
  (EXCL::.INV-FUNC_FORMALS (FBOUNDP 'FOO) '(X))
  (RECORD-SOURCE-FILE 'FOO)
  'FOO)

(pprint (macroexpand '(ff:def-foreign-call foo (x) :call-direct t)))

(PROGN
  (EVAL-WHEN (:COMPILE-TOPLEVEL)
    (EXCL::CHECK-LOCK-DEFINITIONS-COMPILE-TIME
      'FOO 'FUNCTION 'FOREIGN-FUNCTIONS:DEF-FOREIGN-CALL (FBOUNDP 'FOO))
    (PUSH 'FOO EXCL::.FUNCTIONS-DEFINED.))
  (EVAL-WHEN (COMPILE LOAD EVAL)
    (SETF (GET 'FOO 'SYSTEM::DIRECT-FF-CALL)
          (LIST '("foo" :LANGUAGE :C) T :C 'INTEGER '(INTEGER) '(:INT) 2)))
  (SETF (FDEFINITION 'FOO)
        (LET ((EXCL::F
               (NAMED-FUNCTION FOO
                 (LAMBDA (X)
                   (EXCL::CHECK-ARGS '(INTEGER) 'FOO X)
                   (SYSTEM::FF-FUNCALL
                     (LOAD-TIME-VALUE
                       (EXCL::DETERMINE-FOREIGN-ADDRESS '("foo" :LANGUAGE :C) 2 NIL))
                     0 X 'INTEGER)))))
          (EXCL::SET-FUNC_NAME EXCL::F 'FOO)
          EXCL::F))
  (RECORD-SOURCE-FILE 'FOO)
  'FOO)

USER(7): (arglist 'excl::determine-foreign-address)
(EXCL::NAME &OPTIONAL EXCL::FLAGS EXCL::METHOD-INDEX)
T
USER(8): (apropos "FF-PASS-")
FF::FF-PASS-TYPE-LISP value: 32
FF::FF-PASS-TYPE-SINGLE-FLOAT value: 4
FF::FF-PASS-TYPE-BY-REFERENCE value: 1
FF::FF-PASS-TYPE-PROTOTYPED value: 2

• Pass-type flags

Space Optimizations: The .Pll File

• Formerly known as the .lso file

• Establishes pure (read-only) shared space

• Currently supports codevectors and strings

• lso.h (att #72-1)

Space Optimizations: Generate-application

• Keywords to specify as non-nil:
– Keywords that start with “discard-”
– :pll-file (but preferably :pure-files or :purify)
– :presto (use only if helpful)

• Keywords to specify as nil:
– Keywords that start with “include-” or “load-” or “record-”
– :preserve-documentation-strings

Speed Vs Space Tradeoffs

• Inlining vs. Function calls

• comp:compile-format-strings-switch

• (comp:generate-inline-call-tests-switch)

• The typecase/case statement and the jump table

Typecase/case Jump Tables

• Typecase normally turns into cond

• If typecase or case has right conditions, turns into a jump table

• Jump table has constant execution time

• Jump table can get large (up to 256 entries)

• Example for x86 (att #76-1)
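The slide does not say which conditions are "right". A plausible candidate, shown here purely as an assumption, is a `case` over a dense range of small integer keys with the key's type declared. `opcode-name` is a hypothetical function; whether a jump table is actually generated can be checked with `disassemble`.

```lisp
(defun opcode-name (code)
  ;; Dense, small integer keys and a declared range: a shape that a
  ;; compiler could dispatch on with a single indexed jump.
  (declare (type (integer 0 7) code)
           (optimize (speed 3) (safety 1)))
  (case code
    (0 :nop)
    (1 :load)
    (2 :store)
    (3 :add)
    (4 :sub)
    (5 :jump)
    (6 :call)
    (7 :halt)))

(opcode-name 3)
```

If the keys were sparse or the clause list very long, the size of the table is the space cost the slide warns about.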

Lisp Heap Management

• New strategy for heaps in 5.0

• malloc/free vs aclmalloc/aclfree

• Garbage collection problems

• Cons Reduction

New Heap Strategy

• Dumplisp no longer fiddles with internal executable file formats

• Separate lisp-heap from “c-heap” (really aclmalloc space)

• Gaps are much less likely to occur

• Applications may use whatever malloc they choose

malloc/free vs aclmalloc/aclfree

• Warning! excl::malloc and excl::free are not links to malloc and free!

• Allocations and deallocations must not be mixed
– Some o/s-supplied frees will segv
– At best, memory leaks will occur

Common GC-related problems

• Too much paging
– Newspace may be too large

• Too many scavenges
– Newspace may be too small
– Too much consing

• Global-GC takes too long
– Be sure it is doing worthwhile work
– Maybe turn it off!

Cons Reduction

• Profile it - space profiler

• Identify unnecessary consing

• Replace consing operations with non-consing

Profile it - The Trouble with Space Profiling

[Diagram labels, in the order they appear:
 Time profile: main, app, libs
 Space profile: main, libs, app]

Identify unnecessary consing

• Application specific

• Sometimes characterized by many scavenges of extremely high efficiency

Replace consing operations with non-consing

• Sometimes caused by system functions
– Find alternates
– Complain to vendor!

• Use resourcing strategies

• Use stack-allocations where possible
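A resourcing strategy, in its simplest form, keeps released objects in a pool and hands them out again instead of allocating new ones. The sketch below is a minimal, single-threaded illustration in portable Common Lisp; every name in it is hypothetical, and it is not a description of any Allegro CL facility.

```lisp
(defvar *buffer-pool* '()
  "Buffers that have been released and can be reused.")

(defun allocate-buffer ()
  ;; Reuse a pooled buffer if one exists; otherwise allocate a new one.
  (or (pop *buffer-pool*)
      (make-array 1024 :element-type '(unsigned-byte 8))))

(defun release-buffer (buffer)
  ;; PUSH allocates one cons cell, far less than a fresh buffer.
  (push buffer *buffer-pool*)
  nil)

(defmacro with-pooled-buffer ((var) &body body)
  "Bind VAR to a pooled buffer for BODY and return it to the pool afterwards."
  `(let ((,var (allocate-buffer)))
     (unwind-protect (progn ,@body)
       (release-buffer ,var))))
```

A typical use is `(with-pooled-buffer (buf) (fill buf 0) ...)`; the buffer must not be retained after the body exits, since the next caller will receive the same object.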

Stack allocations

• <something> can be
– flet or labels
– list, list*, cons, and certain simple make-array forms

(let ((x <something>))
  (declare (dynamic-extent x))
  ...

or

(<something> ((x ...))
  (declare (dynamic-extent #'x))
  ...
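Both templates can be written out in standard Common Lisp as follows. The function names are hypothetical. The declaration is a promise that the object never outlives the form, which permits, but does not require, the compiler to allocate it on the stack.

```lisp
(defun sum-of-three (a b c)
  (let ((items (list a b c)))
    ;; ITEMS never escapes this function, so the list may live on the stack.
    (declare (dynamic-extent items))
    (reduce #'+ items)))

(defun count-matching (target sequence)
  (flet ((matches-p (element) (eql element target)))
    ;; For a local function the declaration names the function: #'MATCHES-P.
    (declare (dynamic-extent #'matches-p))
    (count-if #'matches-p sequence)))
```

Returning or storing a dynamic-extent object breaks the promise and leads to undefined behavior, so the declaration belongs only where the lifetime is obvious.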

Close

• Questions

• Kudos

• Tomatoes