1

Allegro CL 9.0 Internals

Duane Rettig

ILC 2012

2

Overview

• SMP Design Internals

• Fasl reader rewrite in Lisp

• Debugger Enhancements

3

SMP design goals

• Low Overhead for scalar operation

• Source Compatible

• Warnings for unsafe usage

• Tools for SMP-enhanced programming

4

SMP: Goal Realization

• Low Overhead
  • 9.0 (non-SMP) virtually the same as 8.2
  • SMP has <10% hit over non-SMP (sometimes faster)
  • Other Lisps have 15%-100% hit over 8.2
  • (YMMV)

5

SMP: Goal Realization

• Source Compatible: Success

• Exceptions:
  • hash-tables
  • threads

• 4 new SMP runtime-lisp files (compare 28 for new port)

6

SMP: Goal Realization

• Warnings for unsafe usage

• smp-macros module (8.1 patch, integral in 8.2)

• not aggressive in looking for problems

• setf races

• potential deadlocks

• alists (but plists/getf are protected by convention)

7

SMP: Goal Realization

• Tools for enhanced SMP programming

• synchronizing operators and locks

• Separate GC

8

SMP: Implementation

• Deprecation of SMP-unsafe Macros

• New macros

• Bindability and settability of specials

• synchronization operators

• GC

9

SMP: Implementation

• Deprecated smp-unsafe macros

• without-interrupts, without-scheduling

• excl::fast / excl::atomically

• excl::*warn-smp-usage*

10

SMP: Implementation

• New Macros

• fast-and-clean: replaces (excl::fast (excl::atomically ...))

• with-pinned-objects: replaces (excl::fast (excl::atomically ...))

• with-delayed-interrupts: replaces without-{interrupts,scheduling}

• defvar-nonbindable

11

SMP: Implementation

• Bindability and Settability
  • bindable, settable: defvar/defparameter
  • not bindable, settable: defvar-nonbindable
  • bindable, not settable: excl::defvar-nonsettable
  • not bindable, not settable: defconstant

12

SMP: Implementation (old 8.1 wide binding)

[Diagram: memory layout of wide binding in 8.1. Symbol: header, value, hash, function, name, plist. sv-vector (tag=type=#xb): header, size, global value, thread 1 locative, thread 2 locative, thread 3 locative, thread 4 locative. Symbol locatives (two shown): header, size, value, waste. Other type labels in the figure: tag=2/type=#x70 (twice).]

13

SMP: Implementation (new wide binding)

[Diagram: memory layout of the new wide binding. Symbol: header, value, hash, function, name, plist, sv-vector, lock-index. sv-vector (tag=type=#xb): header, size, symbol, thread 1 locative, thread 2 locative, thread 3 locative, thread 4 locative. Symbol locatives (two shown): header, value, symbol, function. Other type labels in the figure: tag=#xb/type=#x8b, tag=2/type=#x85.]

14

SMP: Implementation

• Synchronization operators
  • push-atomic/pop-atomic
  • incf-atomic/decf-atomic
  • update-atomic
  • atomic-conditional-setf (implementor for above)
  • atomic-conditional-setq (special operator)
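The slide describes atomic-conditional-setf as the implementor for the higher-level operators. The following is a minimal C11 sketch of that layering, not Allegro CL's implementation: the function names are hypothetical C analogues, and a plain `long` cell stands in for a Lisp place.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical analogue of atomic-conditional-setf: store new_value only if
   the cell still holds expected, and report whether the store happened. */
bool atomic_conditional_set(_Atomic long *cell, long expected, long new_value)
{
    assert(cell != NULL);
    return atomic_compare_exchange_strong(cell, &expected, new_value);
}

/* Hypothetical analogue of update-atomic: recompute and retry until no other
   thread changed the cell between the read and the conditional store. */
long update_atomic(_Atomic long *cell, long (*fn)(long))
{
    assert(cell != NULL && fn != NULL);
    for (;;) {
        long old = atomic_load(cell);
        long updated = fn(old);
        if (atomic_conditional_set(cell, old, updated))
            return updated;
    }
}

long add_one(long x) { return x + 1; }

/* Hypothetical analogue of incf-atomic, built on update_atomic. */
long incf_atomic(_Atomic long *cell)
{
    return update_atomic(cell, add_one);
}
```

The conditional store is the only primitive that touches the cell; everything above it is a retry loop, which is why a single compare-and-swap form can serve as the implementor for the whole family.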

15

SMP: Implementation

• Lower level smp operators

• get-atomic-modify-expansion

• excl::atomic-modify-form (may change!)

• excl::*force-csw-opcodes* (may change!)

• excl::defsetf-conditional (may change!)

16

SMP: Implementation

• excl::defsetf-conditional built-in operator conversions

• excl::.inv-structure-ref

• excl::.inv-svref

• excl::.inv-car, excl::.inv-cdr

• excl::.inv-symbol-plist

• excl::.inv-global-symbol-value

17

SMP: Implementation

• Lowest level operators

• gc-setf-protect-atomic

• ll :cas low-level instruction form

18

SMP: Implementation

• with-locked-object

• good on any lockable object

• lighter weight than process-lock

• use with care to avoid deadlocks

• special versions for structs and streams
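As an illustration of a lock that lives in the object itself, and of the "use with care to avoid deadlocks" warning, here is a small C11 sketch. It is only an analogy for with-locked-object: the `account` type, the spin lock, and the address-ordering rule are all assumptions made for the example, not Allegro CL internals.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lockable object: the lock flag is a field of the object,
   so no separate lock structure has to be allocated or looked up. */
typedef struct {
    atomic_flag locked;
    long balance;
} account;

void object_lock(account *obj)
{
    assert(obj != NULL);
    /* Spin until the flag was previously clear, i.e. this thread owns it. */
    while (atomic_flag_test_and_set_explicit(&obj->locked, memory_order_acquire))
        ;
}

void object_unlock(account *obj)
{
    assert(obj != NULL);
    atomic_flag_clear_explicit(&obj->locked, memory_order_release);
}

/* Holding two object locks at once can deadlock if two threads take them in
   opposite orders. Always locking the lower address first is one common way
   to rule that out. */
void transfer(account *from, account *to, long amount)
{
    assert(from != NULL && to != NULL && from != to);
    int from_first = (uintptr_t)from < (uintptr_t)to;
    account *first = from_first ? from : to;
    account *second = from_first ? to : from;

    object_lock(first);
    object_lock(second);
    from->balance -= amount;
    to->balance += amount;
    object_unlock(second);
    object_unlock(first);
}
```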

19

SMP: Implementation

• Other higher-level synchronizing tools

• sharable-locks

• barriers

• queues (uses process-lock; with-locked-object is lighter weight)

• condition-variables

20

SMP Implementation: GC

• Runs in separate (non-Lisp) thread

• Able to provide per-thread object allocation

• Currently implemented:
  • conses: everywhere
  • floats: on x86-64

• Synchronizes with all threads

• Still written in C
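To make "per-thread object allocation" concrete, here is a minimal C11 sketch of the general idea: each thread bumps a pointer in its own chunk, so allocating a cons needs no lock. The chunk size, the `cons` layout, and `alloc_cons` are assumptions for illustration and say nothing about how the Allegro CL allocator is actually written.

```c
#include <assert.h>
#include <stddef.h>

typedef struct cons {
    void *car;
    void *cdr;
} cons;

#define CHUNK_CONSES 4 /* deliberately tiny so exhaustion is easy to see */

/* Each thread gets its own chunk and its own fill pointer. */
static _Thread_local cons chunk[CHUNK_CONSES];
static _Thread_local size_t next_free = 0;

/* Allocate from the calling thread's chunk without any synchronization.
   Returns NULL when the chunk is full; a real allocator would fetch a fresh
   chunk from the shared heap at that point (the only step needing a lock). */
cons *alloc_cons(void *car, void *cdr)
{
    assert(next_free <= CHUNK_CONSES);
    if (next_free == CHUNK_CONSES)
        return NULL;
    cons *cell = &chunk[next_free++];
    cell->car = car;
    cell->cdr = cdr;
    return cell;
}
```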

21

GC states

• 0: Lisping: thread is running Lisp code; GC can’t happen

• 1: Foreign: thread is running foreign code; GC can happen

• 2: Blocking GC: thread is running Lisp code; GC wants it to pause

• 3: Blocked by GC: thread is trying to get from foreign to Lisp; GC is running

• 4: Beside GC: thread is running foreign code and GC is happening
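The five states can be written down as an enumeration with two predicates read directly off the descriptions above. This is a sketch under one assumption: a thread in state 3 is waiting rather than executing Lisp code, so only states 0 and 2 hold the GC back. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    GC_LISPING  = 0, /* running Lisp code; GC can't happen */
    GC_FOREIGN  = 1, /* running foreign code; GC can happen */
    GC_BLOCKING = 2, /* running Lisp code; GC wants it to pause */
    GC_BLOCKED  = 3, /* waiting to re-enter Lisp while GC runs */
    GC_BESIDE   = 4  /* running foreign code while GC runs */
} gc_state;

/* True for the states in which the thread is executing Lisp code. */
bool runs_lisp_code(gc_state s)
{
    return s == GC_LISPING || s == GC_BLOCKING;
}

/* True for the states that exist only while a collection is under way. */
bool gc_in_progress(gc_state s)
{
    return s == GC_BLOCKED || s == GC_BESIDE;
}

/* The collector may proceed only when no thread is executing Lisp code. */
bool gc_may_proceed(const gc_state *threads, size_t n)
{
    assert(threads != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (runs_lisp_code(threads[i]))
            return false;
    return true;
}
```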

22

GC state diagram

[State diagram. Nodes: Lisping (0), Foreign (1), Blocking GC (2), Blocked by GC (3), Beside GC (4). Edge labels: thread goes lisping; thread goes foreign; GC Starting; GC Starting (signal thread: GC waits); GC Done; GC Done (post thread); thread goes lisping (thread waits); thread invokes GC (twice). Additional edge annotations: (thread waits), (post gc).]

23

Fasl Reader

• Rewritten in Lisp runtime code

• 50% faster than C version

• fewer transitions to/from C

• Started in 8.2; used only to load source debug info

• Used exclusively in 9.0

24

Fasl coding example

C:

    case ff_complex: /* tos = imag, tos-1 = real */
        if (Building_BOTH) {
            LispVal comp = new_lisp_obj(TYPEcomplex, 0, 0);
            /* complex object is new */
            *(nat *)((nat)comp + c_imag_adj/PtrScale) = (nat)f_pop();
            *(nat *)((nat)comp + c_real_adj/PtrScale) = (nat)f_pop();
            f_push(comp);
        }
        break;

Lisp:

    (#.ff_complex
     (when (building-both)
       (let* ((imag (fasl-pop thread))
              (real (fasl-pop thread))
              (complex (q-qint-call sys::make-complex real imag)))
         (fasl-push complex thread))))
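Both versions rely on the same stack discipline: the imaginary part is on top of the stack and the real part just below it, so the first pop yields imag and the second yields real. The following self-contained C sketch shows only that ordering. The `fasl_stack` type and the signatures of `f_push` and `f_pop` are simplified stand-ins, not the runtime's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_SLOTS 16

/* Simplified stand-in for the fasl reader's value stack. */
typedef struct {
    double slots[STACK_SLOTS];
    size_t depth;
} fasl_stack;

typedef struct {
    double real;
    double imag;
} complex_val;

void f_push(fasl_stack *s, double v)
{
    assert(s != NULL && s->depth < STACK_SLOTS);
    s->slots[s->depth++] = v;
}

double f_pop(fasl_stack *s)
{
    assert(s != NULL && s->depth > 0);
    return s->slots[--s->depth];
}

/* tos = imag, tos-1 = real: pop in that order, then build the object. */
complex_val build_complex(fasl_stack *s)
{
    complex_val c;
    c.imag = f_pop(s);
    c.real = f_pop(s);
    return c;
}
```

A reader that had pushed 1.5 and then -2.0 would get back a value with real part 1.5 and imaginary part -2.0.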

25

Debugger Enhancements

• frame descriptor caching and validation

• source level debugger

26

Debugger: Reimplementation of API

• db:next-newer-frame hard to implement

• Always moving!

• either cached frames become invalid or have to start over each time

• :zoom on overflowed stack takes much too long

27

Frame descriptors

• Frame-descriptors are now larger

• newer, older, chain slots

• validated via interlock with their real frames.

• frame descriptor has argcount shadow slot
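One way to picture cached descriptors with newer/older links is the sketch below. It is a guess at the shape, not the Allegro CL data layout: the validation rule (the shadow argcount must still match the real frame's argcount) is an assumption chosen to show why a shadow slot is useful, and every name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified real stack frame. */
typedef struct stack_frame {
    struct stack_frame *link; /* caller's frame */
    void *return_address;
    void *function;
    long argcount;
} stack_frame;

/* Cached descriptor that points at a real frame and at its neighbours. */
typedef struct frame_descriptor {
    stack_frame *fp;
    struct frame_descriptor *newer;
    struct frame_descriptor *older;
    long argcount_shadow; /* copy taken when the descriptor was created */
} frame_descriptor;

/* Assumed validation rule: the descriptor is stale if the frame it points
   at no longer carries the argcount recorded in the shadow slot. */
bool descriptor_valid(const frame_descriptor *d)
{
    return d != NULL && d->fp != NULL && d->fp->argcount == d->argcount_shadow;
}

/* Follow the cached link instead of rescanning the stack from the top;
   return NULL if the neighbour is missing or no longer valid. */
frame_descriptor *next_newer_frame(const frame_descriptor *d)
{
    assert(d != NULL);
    return descriptor_valid(d->newer) ? d->newer : NULL;
}
```

With links cached in the descriptors, moving to the next newer frame is a pointer hop plus a validity check rather than a fresh walk of the whole stack.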

28

Frames

[Diagram: a stack frame shown beside its frame descriptor. Stack frame: (link), return address, function, argcount, ... Frame descriptor: header, class, fp, newer, older, chain, (argcount), others ...]

29

Frame descriptors

• db:next-newer-frame is no longer programming suicide!

30

source level debugger

• Demo