design and implementation of an embedded python run-time … · rice computer architecture group...
TRANSCRIPT
rice computer architecture group
Design and Implementation of an Embedded Python
Run-Time SystemThomas W. Barr, Rebecca Smith, Scott Rixner
Rice University, Department of Computer Science USENIX Annual Technical Conference, June 2012
!1
rice computer architecture group
The computing landscape
!2
rice computer architecture group
Microcontrollers: They’re everywhere.
!3
~1000s~1bil~10bilUncountable?
rice computer architecture group
Microcontrollers: They’re everywhere.
- Virtually no run-time support (yet) - Storage? - Memory allocation? - Interrupt handlers? - Console?
!4
rice computer architecture group
Microcontrollers: They’re everywhere.
- 32-bit ARM Cortex-M3 - 64-128KB RAM - Up to 512KB Flash - 50-100MHz
- Interpreters on microcontrollers? - Java Card - eLua - Python-on-a-chip
!5
rice computer architecture group
Owl: A Development Environment for Microcontrollers
- Run-time and toolchain - Open-source
- Based on p14p
- Interactive prompt
- Profilers - Programmers - A lot more
- It works. Today.
- Focus on two parts today - Store Python objects - Call C functions
!6
rice computer architecture group
Owl: Toolchain and run-time
!7
rice computer architecture group
Representing programs
- Python source code
!8
def foo(): a = 'foo' b = (42, 'bar') c = 27 + 3
rice computer architecture group
Representing programs
- Compiler builds code objects - Bytecode to represent user program
!9
Bytecodes: LOAD_CONST 0 STORE_NAME 0 ...
def foo(): a = 'foo' b = (42, 'bar') c = 27 + 3
rice computer architecture group
Representing programs
- Must include data along with code - Constants, names, etc. - .pyc file, .class file
!10
Bytecodes: LOAD_CONST 0 STORE_NAME 0 ...
Constants: 0: 'foo' 1: (42, 'bar') ...
rice computer architecture group
Loading programs
!11
File: 0: 'foo' 1: (42, 'bar') ...
Memory
rice computer architecture group
Loading programs
!12
File: 0: 'foo' 1: (42, 'bar') ...
Memory
str:
rice computer architecture group
Loading programs
!13
File: 0: 'foo' 1: (42, 'bar') ...
Memory
str: 'foo'
rice computer architecture group
Loading programs
!14
File: 0: 'foo' 1: (42, 'bar') ... str: 'foo'
Memory
str: 'bar'
tup: _, _
int: 42
rice computer architecture group
Loading programs
- Intermediate files - Python .pyc file, Java .class file, etc. - Loaded at runtime, then executed
- Load before execution - Copy and transform
- Link compound objects with references
- Makes sense on x86 - RAM is plentiful
- Doesn’t on microcontroller - Can't afford to copy anything
!15
rice computer architecture group
Loading programs
!16
str: 'foo'
Memory image
str: 'bar'
tup: _, _
int: 42- Owl memory image - Unique on interpreted
embedded systems - Store objects in form that
is used by the VM - Object headers - Byte-order
- How do we store compound objects?
packed tuple:
rice computer architecture group!17
str: 'bar'int: 42
- Packed tuples - Store objects inside of nested containers - Transplantable
- Computer → Controller
- Controller → Controller
- No pointers to fix up
Loading programs
rice computer architecture group
Memory images
- p14p, eLua use dynamic loaders - Objects copied from flash to RAM
- Owl uses memory images - Can stay in flash, not SRAM - 512KB flash vs. 96KB SRAM
- Results - 4x reduced SRAM footprint
!18
rice computer architecture group
Calling C library functions
- C is a necessary evil - Peripheral I/O libraries
- Generally vendor provided
- Must provide a way for Python to call C - Two techniques compared - Automatically expose functions from .h file - Trade-offs in space and speed
!19
rice computer architecture group
Calling C library functions
- User cannot directly control execution - In an interpreter loop - Need some way to let
programmer break out of loop
!20
bytecode = *IP; switch bytecode: case BINARY_ADD: // add two Python ints // together IP++; case JUMP: // change IP case CALL_FUNCTION: // create new frame // change IP to new frame
rice computer architecture group
Calling C library functions
!21
bytecode = *IP; switch bytecode: case BINARY_ADD: // add two Python ints together IP++; case JUMP: // change IP case CALL_FUNCTION: // create new frame // change IP to new frame
rice computer architecture group
Calling C library functions
!22
bytecode = *IP; switch bytecode: case BINARY_ADD: // add two Python ints together IP++; case JUMP: // change IP case CALL_FUNCTION: if (func_name == "uart_init") { call_uart_init(); } else { // create new frame // change IP to new frame }
rice computer architecture group
Calling C library functions
!23
bytecode = *IP; switch bytecode: case BINARY_ADD: // add two Python ints together IP++; case JUMP: // change IP case CALL_FUNCTION: if (func_name == "uart_init") { call_uart_init(); } else if (func_name == "uart_send") { call_uart_send(); } else { // create new frame // change IP to new frame }
rice computer architecture group
call_uart_init()
- Wrappers for functions - Interpreter calls
wrapper - Wrapper converts
arguments - Calls function - Wrapper converts result - Returns result on stack
- Call C as you call Python
6/5/12 Pygments — Python syntax highlighter
1/2pygments.org/demo/45439/?style=perldoc
Usethisstyle: perldoc Go
Home FAQ Languages Demo Whousesit? DownloadDocumentation Contribute
Entry45439out.c
SubmittedbyanonymousonJune6,2012at1:34a.m.Language:C.Codesize:680bytes.
Important:Thisentryisnotyetstoredinthedatabase.
Saveitpermanently or Deleteitnow
/* Variable declarations */PmReturn_t retval = PM_RET_OK;pPmObj_t p0;uint32_t peripheral;
/* If wrong number of arguments, raise TypeError */if (NATIVE_GET_NUM_ARGS() != 1) { PM_RAISE(retval, PM_RET_EX_TYPE); return retval;}
/* Get Python argument */p0 = NATIVE_GET_LOCAL(0);
/* If wrong argument type, raise TypeError */if (OBJ_GET_TYPE(p0) != OBJ_TYPE_INT) { PM_RAISE(retval, PM_RET_EX_TYPE); return retval;}
/* Convert Python argument to C argument */peripheral = ((pPmInt_t)p0)->val;
/* Actual call to native function */SysCtlPeripheralEnable(peripheral);
/* Return Python object */NATIVE_SET_TOS(PM_NONE);return retval;
Thissnippettook0.00secondstohighlight.
BacktotheEntryListorHome.!24
rice computer architecture group
Calling C library functionsimport gpio class Output: def __init__(self, portpin): self.port = PORTS[portpin[0]] self.pin = PINS[portpin[1]] # turn on GPIO module init_port(portpin[0]) # configure pin for output gpio.GPIOPinTypeGPIOOutput(self.port, self.pin) def write(self, value): if value: # turn pin on gpio.GPIOPinWrite(self.port, self.pin, self.pin) else: # turn pin off gpio.GPIOPinWrite(self.port, self.pin, 0)
!25
rice computer architecture group
Calling C library functionsimport gpio class Output: def __init__(self, portpin): self.port = PORTS[portpin[0]] self.pin = PINS[portpin[1]] # turn on GPIO module init_port(portpin[0]) # configure pin for output gpio.GPIOPinTypeGPIOOutput(self.port, self.pin) def write(self, value): if value: # turn pin on gpio.GPIOPinWrite(self.port, self.pin, self.pin) else: # turn pin off gpio.GPIOPinWrite(self.port, self.pin, 0)
!26
rice computer architecture group
Calling C library functions
- Wrapped functions - Manually write
wrappers - p14p, eLua
- Autowrapping - Similar to SWIG
- Simple - Any compiler - Any architecture - Lots of duplicated code
6/5/12 Pygments — Python syntax highlighter
1/2pygments.org/demo/45439/?style=perldoc
Usethisstyle: perldoc Go
Home FAQ Languages Demo Whousesit? DownloadDocumentation Contribute
Entry45439out.c
SubmittedbyanonymousonJune6,2012at1:34a.m.Language:C.Codesize:680bytes.
Important:Thisentryisnotyetstoredinthedatabase.
Saveitpermanently or Deleteitnow
/* Variable declarations */PmReturn_t retval = PM_RET_OK;pPmObj_t p0;uint32_t peripheral;
/* If wrong number of arguments, raise TypeError */if (NATIVE_GET_NUM_ARGS() != 1) { PM_RAISE(retval, PM_RET_EX_TYPE); return retval;}
/* Get Python argument */p0 = NATIVE_GET_LOCAL(0);
/* If wrong argument type, raise TypeError */if (OBJ_GET_TYPE(p0) != OBJ_TYPE_INT) { PM_RAISE(retval, PM_RET_EX_TYPE); return retval;}
/* Convert Python argument to C argument */peripheral = ((pPmInt_t)p0)->val;
/* Actual call to native function */SysCtlPeripheralEnable(peripheral);
/* Return Python object */NATIVE_SET_TOS(PM_NONE);return retval;
Thissnippettook0.00secondstohighlight.
BacktotheEntryListorHome.!27
rice computer architecture group
Foreign Function Interface
- Based on libffi - Calls C functions with arbitrary signature - Implements calling convention
- Custom port to Cortex-M3
- eFFI - Functions statically linked into program - Function pointer tables instead of DWARF symbol
headers - Compatible with statically linked embedded system
- No duplicated code
!28
rice computer architecture group
Comparison
- eFFI smaller - VM 2KB larger - Eliminates 21KB of wrappers
- Wrapper functions simpler - 2x faster, largely irrelevant in practice
0kb 10kb 20kb 30kb 40kb 50kb 60kb 70kb 80kb 90kb 100kb 110kb 120kb 130kb 140kb 150kb 160kb 170kb 180kb 190kb
interp.c - 13kb
Virtual machine34.9kb
Lib (C)9.8kb
(libm) - 10kb
ieee754-df.S - 6kb
ieee754-sf.S - 5kb
Math20.8kb
uip.c - 4kb
IP7.8kb
(usblib) - 7kb
(driverlib) - 6kb
Platform15.2kb
[bytecodes] - 5kb
[debug] - 5kb
[strings] - 9kb
[other] - 6kb
Lib (Py)25.0kb
adc - 4kb
i2c - 4kb
uart - 3kb
I/O (C)21.1kb
[bytecodes] - 5kb
[strings] - 14kb
[other] - 5kb
I/O (Py)26.5kb
0kb 10kb 20kb 30kb 40kb 50kb 60kb 70kb 80kb 90kb 100kb 110kb 120kb 130kb 140kb 150kb 160kb 170kb 180kb 190kb
interp.c - 13kb
Virtual machine36.6kb
Lib (C)10.2kb
(libm) - 10kb
ieee754-df.S - 6kb
ieee754-sf.S - 5kb
Math20.8kb
uip.c - 4kb
IP7.8kb
(usblib) - 7kb
Platform12.2kb
[bytecodes] - 5kb
[debug] - 5kb
[strings] - 9kb
[other] - 6kb
Lib (Py)25.4kb
[bytecodes] - 5kb
[strings] - 15kb
[other] - 7kb
I/O (Py)29.1kb
0kb 10kb 20kb 30kb 40kb 50kb 60kb 70kb 80kb 90kb 100kb 110kb 120kb 130kb 140kb 150kb 160kb 170kb 180kb 190kb
src8.8kb
(libc, libm) - 64kb
(other)65.9kb
lparser.c - 6kb
lvm.c - 5kb
lstrlib.c - 4kb
lbaselib.c - 4kb
lcode.c - 4kb
lapi.c - 4kb
lua63.5kb
luarpc.c - 6kb
modules20.6kb
platform.c - 4kb
uip.c - 4kb
ieee754-df.S - 7kb
arm10.2kb
0kb 10kb 20kb 30kb 40kb 50kb 60kb 70kb 80kb 90kb 100kb 110kb 120kb 130kb 140kb 150kb 160kb 170kb 180kb 190kb
tasks.c - 6kb
Source9.2kb
IntQueue.c - 4kb
Minimal13.2kb
(driverlib) - 7kb
(other)10.0kb
uip.c - 10kb
uip14.8kb
webserver13.0kb
bitmap.h - 5kb
RTOSDemo14.2kb
Figure 5: Static binary analysis, from top to bottom, of the Owl virtual machine using wrapped functions, using theforeign function interface, of the eLua virtual machine and of an example program using SafeRTOS.
a floating-point benchmark that simulates the motion ofthe Sun and the four planets of the outer solar system.
7 Results
This section presents an analysis of the Owl system. Theoverhead of including the virtual machine in flash canbe quite small, as low as 32 KB compared to 22 KB fora simple RTOS. We show that for the embedded work-loads, garbage collection has almost no impact on run-time performance. Finally, we show that using a loader-less architecture uses four times less SRAM than a tradi-tional system.
7.1 Static binary analysis
Figure 5 shows the output of the static binary analyzer.The four rows in the figure are the Owl run-time sys-tem using autowrapping, the Owl run-time system usingeFFI, the eLua interpreter, and the SafeRTOS demonstra-tion program. SafeRTOS is an open-source real-time op-erating system that is representative of the types of run-time systems in use on microcontrollers today.
Consider the Owl run-time system using autowrap-ping. The virtual machine section includes the interpreterand support code to create and manage Python objects.The math section includes support for software floatingpoint and mathematical functions, while the IP sectionprovides support for Ethernet networking. The platform
section is the StellarisWare peripheral and USB driver li-braries. The lib sections (C and Python) are the Pythonstandard libraries for the Owl run-time system. Finally,the I/O sections (C and Python) are the calls to the pe-ripheral library, wrapped by the autowrapper tool.
The Python standard libraries consume a significantfraction of the total flash memory required for the Owlrun-time system. The capabilities that these libraries pro-vide are mostly optional, and therefore can be removedto save space. However, they provide many useful andconvenient functionalities beyond the basic Python byte-codes, such as string manipulation. These sections alsoinclude optional debugging information (5 KB).
With eFFI, the binary is roughly 19 KB smaller, illus-trating the advantage of using a foreign function inter-face. While the code required to manually create stackframes and call functions marginally increases the size ofthe virtual machine and stores some additional informa-tion in the Python code objects, it completely eliminatesthe need for C wrappers.
The Owl virtual machine itself is actually quite small,approximately 35 KB. It contains all of the code nec-essary for manipulating objects, interpreting bytecodes,managing threads, and calling foreign functions. Thisis significantly smaller than eLua’s core, which takes up63 KB, and not much larger than the so-called “lightweight” SafeRTOS, which requires 22 KB (the Sourceand Minimal sections). Note also that the supposed com-pactness of SafeRTOS is deceptive, as it is statically
!29
rice computer architecture group
Owl: A Development Environment for Microcontrollers
!30
rice computer architecture group
Owl: A Development Environment for Microcontrollers
!31
- Yes, you can run a managed run-time on a microcontroller.
- It works, right now. - Autonomous car - Toys - Courseware
- ENGI128 at Rice
- Cookware - ...
rice computer architecture group
Owl: A Development Environment for Microcontrollers
- Yes, you can run a managed run-time on a microcontroller.
- It works, right now. - Autonomous car - Toys - Courseware
- ENGI128 at Rice
- Cookware - ...
!32
rice computer architecture group
Owl: A Development Environment for Microcontrollers
!33
- Yes, you can run a managed run-time on a microcontroller.
- It works, right now. - Autonomous car - Toys - Courseware
- ENGI128 at Rice
- Cookware - ...
rice computer architecture group
Owl: A Development Environment for Microcontrollers
- Interesting area for research - Two innovations presented here - More in paper - Far more in the future?
- Compare to C? - To MATLAB?
- Vital area for research - Microcontrollers are getting more common - Larger peripheral set is making programming harder
- We can and must reverse this trend
!34
rice computer architecture group
Owl: A Development Environment for Microcontrollers
!35