One for (Almost) All: Using a Modern Programmable Programming Language in Robotics
or
A Roboticist in Language Wonderland

Berthold Bäuml
Autonomous Learning Robots Lab
Institute of Robotics and Mechatronics
German Aerospace Center (DLR)


Page 1: One for (Almost) All: Using a Modern Programmable ...robotics.unibg.it/tcsoft/sdir2013/slides/baeuml.pdf · Demands of Modern Software Systems • execution performance • interpreters


Trend in Software Development

• use of modern programming languages
• moving away from dynamic scripting languages like Python or Ruby
[Figure: company and language logos with captions; the logos are lost. Surviving text in slide order: Twitter; backend; Haskell; Deutsche Bank; tools for innovative trading group; backend of chat service; high speed trading, “used for everything”; backend]
• all functional and (almost all) with a strong static type system

Demands of Modern Software Systems

• execution performance
  • interpreters are intrinsically slow!
  • modern compilation and JIT techniques to come close to C/C++

Computer Language Benchmarks Game

[Figure: benchmark plot; only the language labels survive: SBCL Lisp, Scheme/Racket, Clojure, Scala, Haskell, OCaml, Erlang, Java, gcc, g++, Fortran, Java 7, Ruby, Python]

Demands of Modern Software Systems (continued)

• maintenance and debugging
  • dynamic scripting languages are good for prototyping
  • but for a large code base and many developers, more language support for maintainability is needed
  • modern strong and static type system + type inference uncovers many application logic errors as type errors at compile time
  • support for functional programming: encourages/enforces immutability -> flow of state explicit -> easier to reason about
• productivity in developing complex algorithms and application logic
  • function combination rather than object composition
  • highly efficient functional data structures
• parallel execution
  • support for multi-core/multi-CPU has to be built into the language
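A minimal sketch of the immutability point, written in Racket (the Scheme variant the talk settles on later). The state keys and the `close-gripper` helper are invented for illustration; only `hash`, `hash-set` and `hash-ref` are standard Racket.

```racket
#lang racket

;; Hypothetical robot state as an immutable hash; the keys and values
;; are made up for this sketch.
(define state0 (hash 'gripper 'open 'battery 0.9))

;; "Updating" returns a new hash; state0 itself is never modified,
;; so the flow of state through the program stays explicit.
(define (close-gripper state)
  (hash-set state 'gripper 'closed))

(define state1 (close-gripper state0))

(hash-ref state0 'gripper) ; still 'open
(hash-ref state1 'gripper) ; 'closed
(hash-ref state1 'battery) ; 0.9, carried over unchanged
```

Because the old value stays valid, any component still holding `state0` sees a consistent snapshot, which is what makes such code easier to reason about.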

Functional Programming -- Functional Programming Languages

• from Matthias Felleisen, “Functional Programming is Easy, and Good for You”
http://www.ccs.neu.edu/home/matthias/Presentations/11GS/gs.pdf

Functional programming is about clear, concise communication between programmers.

A good transition needs training, but training pays off.

Functional programming languages keep you honest about being functional.

                    Haskell           Scala                 OCaml            Erlang            Clojure                 Scheme (Gambit / Racket)
functional          +                 +                     +                +                 +                       +
typed (static)      +                 +                     +                -                 -                       - / (+)
mutation            -                 +                     +                -                 +                       +
strictness          lazy              strict (& lazy)       strict (& lazy)  strict            strict                  strict / strict (lazy)
parallel            ++                +                     JoCaml           ++                ++                      Termite / +
compiled            native            JVM                   native           VM (native HiPE)  JVM                     C-code / VM-JIT


Demands of Advanced Robotic Systems

• what robots we are talking about -> advanced complex humanoids

• different to, e.g., fleets of quadrocopters

• with respect to computing power necessary and available!

• esp. in our case: platform for fundamental research -> flexibility for developing new solutions from ground up more important than integrating many existing “classical” solutions

• ...

Demands of Advanced Robotic Systems

sensing
• stereo cameras (2MPixel/25Hz)
  RGB-D sensor (0.5MPixel/33Hz)
• torque sensor (all DOF, 1kHz)
  tactile skin on hands (3000 taxels/750Hz)
• IMU (6D, 500Hz)

acting
• 53 DOF = 8 (platform) + 19 (torso) + 26 (hands)
• torque control over all DOF
• 1kHz, <3ms latency, <100us jitter

computing
• 4x Core i7 Quadcore (onboard)
• CPU cluster with 64 cores
• GPGPU cluster 16 NVidia K20

[Figure: system block diagram. Surviving labels:
components: hand control (1kHz); arm/torso/head control (19DOF, 1kHz); platform control (60Hz/50ms); state machine/communicator/view control (1kHz); pose estimator (512Hz/0.5ms); circle detector (25Hz/25ms), two instances; MHT/UKF (25Hz/10ms); ball tracker (25Hz/35ms); planner (25Hz/60ms); SQP optimizer (60ms), four instances; user interaction; GUI/3D viewer; logic planner
hosts: Linux/QuadCore; QNX/32 Cores; QNX/2x DualCore; Linux (two further hosts)
links: GigE, 1394, Sercos, SpaceWire, CAN, USB, TD]

Demands of Advanced Robotic Systems

• easy interfacing/interaction with C/C++ for low-level, hard realtime, highly performant code (control algorithms, image processing)
• multi-platform support: robotic systems are often heterogeneous (realtime OS, drivers only for certain OS, ...)
• actor model and continuations:
  • for concurrent, parallel and distributed computing
  • to build complex synchronization and execution patterns for orchestration of interwoven task/behavior nets
• not purely functional: to easily work with changing states of robots and world
• Domain Specific Languages (DSLs)
  • robotic systems span a wide range of different tasks
  • DSLs can respect different abstractions also syntactically, e.g.,
    • kinematic/dynamic/geometrical description
    • complex and concurrent state machines
    • data types of the communication packets
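The actor-model point can be sketched in Racket, where every thread has a built-in mailbox (`thread-send` / `thread-receive`). This is only an illustration: the counter actor and its message shapes (`'add`, `'get`, `'stop`) are invented here, and the continuation part of the bullet is not shown.

```racket
#lang racket

;; A tiny actor: a thread that owns its state and reacts to messages
;; arriving in its mailbox. The message shapes are hypothetical.
(define (make-counter-actor)
  (thread
   (lambda ()
     (let loop ([count 0])
       (match (thread-receive)
         [(list 'add n) (loop (+ count n))]
         [(list 'get reply-ch)
          (channel-put reply-ch count)
          (loop count)]
         ['stop (void)])))))

;; Ask for the current value and wait for the reply on a fresh channel.
(define (ask-count actor)
  (define reply (make-channel))
  (thread-send actor (list 'get reply))
  (channel-get reply))

(define counter (make-counter-actor))
(thread-send counter (list 'add 2))
(thread-send counter (list 'add 3))

(define total (ask-count counter)) ; 5: messages are handled in order
(thread-send counter 'stop)
```

The state (`count`) is never shared; other threads can only influence it by sending messages, which is the property that makes actors attractive for concurrent and distributed designs.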

Domain Specific Languages -- An Important Concept in Modern Software Design

[Animation frames; surviving text: “DSLs are cool!”, “Domain specific languages are”]

• a little story ... by courtesy of Matthew Flatt, University of Utah
(to see the animation, run in Racket: (require (planet mflatt/princess:1:2/play-movie)))



• abstractions for different tasks/fields/domains are often best expressed in a specific language (with optimized syntax and semantics)
• embedded domain specific languages (DSELs) use the infrastructure of the implementing language and extend it: languages as libraries
• popular approaches for full-fledged DSELs (including control structures)
  • lazy functional languages (Haskell, Scala): functions and combinators
  • meta-programming == syntax rewriting/manipulating the AST
    • Lisps, Clojure, Scheme: macro systems “directly” manipulate S-expressions
    • Template Haskell, MetaOCaml, Scala Macros (since 2.10)
• if the implementing language is a compiled language, the DSL is also a compiled (and efficient) language!
• even full-fledged language extension possible ...
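As a rough illustration of a macro-based DSEL, the following Racket sketch defines a small state-machine notation. The `state-machine` form, the `grasp-step` function and all state and event names are assumptions made up for this example; they are not taken from the talk or from aRDx.

```racket
#lang racket

;; (state-machine [state [event next-state] ...] ...) expands at
;; compile time into an ordinary transition function built from
;; nested `case` forms, so the DSL costs nothing extra at run time.
(define-syntax-rule (state-machine [state [event next] ...] ...)
  (lambda (current ev)
    (case current
      [(state) (case ev
                 [(event) 'next] ...
                 [else current])]
      ...
      [else (error 'state-machine "unknown state: ~a" current)])))

;; Hypothetical grasping behaviour written in the new notation.
(define grasp-step
  (state-machine
   [idle     [object-seen reaching]]
   [reaching [contact grasping] [lost idle]]
   [grasping [released idle]]))

(grasp-step 'idle 'object-seen)  ; 'reaching
(grasp-step 'reaching 'contact)  ; 'grasping
(grasp-step 'grasping 'contact)  ; event not listed: stays 'grasping
```

Because the macro expands into plain `case` expressions, the state machine is compiled together with the rest of the program, in line with the point that a DSL embedded in a compiled language is itself compiled.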

Macro Systems

• A macro extends a language by specifying how to compile a new feature into existing features
• The macro is itself implemented in the programming language, not an external tool.
• more on macros (taken from a talk of Robby Findler)...
• “history” of Scheme macros
  • text replacing
  • syntax replacing macros
  • hygienic macros (obey lexical scoping)
  • advanced macro systems
    • with syntax objects containing source location -> precise error messages
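What "hygienic" means in practice can be shown with the classic `swap!` macro in Racket. The macro and the `swap-demo` function are illustrative names chosen for this sketch.

```racket
#lang racket

;; swap! introduces a temporary named `tmp` in its expansion.
(define-syntax-rule (swap! a b)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

;; The caller also has a variable named `tmp`. With plain text or
;; S-expression replacement the two names would collide; a hygienic
;; expander keeps the macro's `tmp` distinct from the caller's.
(define (swap-demo)
  (define tmp 1)
  (define other 2)
  (swap! tmp other)
  (list tmp other))

(swap-demo) ; '(2 1)
```

With a purely textual macro the expansion would read `(let ([tmp tmp]) (set! tmp other) (set! other tmp))` and leave the caller's variables unswapped; lexical scoping of macro-introduced names avoids that.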

Java-like Class System as “DSEL”

• implemented with macros in Scheme (Racket)
• performance ~ interface-based Java calls
• M. Flatt, R. B. Findler, M. Felleisen “Scheme with Classes, Mixins, and Traits”

(class object%
  (init size)
  (define current-size size)
  (super-new)
  (define/public (get-size)
    current-size)
  (define/public (grow amt)
    (set! current-size (+ amt current-size)))
  (define/public (eat other-fish)
    (grow (send other-fish get-size))))

(define fish% (class object% (init size) ....))
(define charlie (new fish% [size 10]))

> (send charlie grow 6)
> (send charlie get-size)
16


                    Haskell           Scala                 OCaml            Erlang            Clojure                 Scheme (Gambit / Racket)
functional          +                 +                     +                +                 +                       +
typed (static)      +                 +                     +                -                 -                       - / (+)
mutation            -                 +                     +                -                 +                       +
strictness          lazy              strict (& lazy)       strict (& lazy)  strict            strict                  strict / strict (lazy)
parallel            ++                +                     JoCaml           ++                ++                      Termite / +
compiled            native            JVM                   native           VM (native HiPE)  JVM                     C-code / VM-JIT
distributed         Cloud Haskell     actors                JoCaml           actors!!          no native, e.g., Akka   Termite / distrib. places
FFI (w/o glue)      +                 JNA                   experimental     +                 JNA                     + / +
platforms           Lin/Mac/Win       Lin/Mac/Win           Lin/Mac/Win      Lin/Mac/Win       Lin/Mac/Win             gcc / Lin/Mac/Win
DSELs (functional)  ++                ++                    +                +                 +                       + / +
meta programming    Template Haskell  macros (experiment.)  MetaOCaml        simple macros     lisp macros             macros / adv. macros

Radical System Architecture: Use One Language for (Almost) All

• modern higher level functional languages
  • are performant
  • have “batteries included”: GUI, networking, FFI, ...
  • meet (almost) all challenges in robotics
• radical design
  • one language for all
    • except for small C/C++ snippets (highest performance/determinism)
  • language built-in parallel and distributed execution/communication
  • efficient DSELs for the various data description or execution logic tasks
• benefits
  • general higher productivity with higher level language
  • homogeneity drastically reduces system complexity
    • developers have to/can learn one language in depth
    • no conceptual or practical “frictional loss” due to language coupling
    • same higher level concepts in all components (type system, data structures, closures, continuations, ...)
    • distributed communication of higher level concepts -- in contrast to “least common denominator” of multi-language-compatible middleware (aRD: static packets, ROS: dynamic arrays, ...)
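The FFI point, and the idea of keeping only small C snippets outside the main language, can be illustrated with Racket's `ffi/unsafe` library. This is a generic sketch, not the aRDx integration mechanism: it binds the C standard library function `abs`, and it assumes a Linux or macOS system where passing `#f` as the library looks the symbol up among the libraries already loaded into the process.

```racket
#lang racket
(require ffi/unsafe)

;; Bind C's `abs` without writing any glue code in C.
;; Assumption: #f as the library finds libc symbols on Linux/macOS;
;; other platforms may need an explicit (ffi-lib ...) path instead.
(define c-abs
  (get-ffi-obj "abs" #f (_fun _int -> _int)))

(c-abs -42) ; 42
```

The type description `(_fun _int -> _int)` is all the binding needs; argument and result conversion between Racket and C values is handled by the FFI.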

[Figure sequence over six slides; surviving labels: GPU, state machine, planner, vision, driver, GUI, Simulink]

The aRDx Framework (aRD Next Generation)

• follows this radical “one language” philosophy
• we chose the Scheme variant Racket as base language
• only for hard realtime (controllers) and high performance (image processing, GPU) do we additionally need C/C++ and Matlab/Simulink models
• aRDx allows seamless integration of C/C++ and Matlab/Simulink with Racket, up to module loading with auto-compilation
• aRDx provides a highly performant and hard realtime capable communication layer for raw data transport for C/C++ -- the communication logic is already set up in Racket

[Figure: layer diagram; surviving labels: application, Racket, aRDx, aRDxRT, C/C++, OS]

• first prototype (many things still missing) successfully works on Agile Justin


T. Hammer, B. Bäuml, “Raw Performance of Robotic Software Middleware: A Comparison and aRDx’s New Realtime Communication Layer”

[Fig. 5: six log-log plots, one row each for 1 and 20 clients and one column each for the process, host and distributed domains; x-axis: packet size [byte]; y-axis: round-trip time [s]; legend: aRDx, aRD, Orocos, ROS, ROS (fixed), YARP; one inset plots round-trip time over pause [s]]

Fig. 5. Results of the stress test benchmark for 1 (top) and 20 (bottom) clients and for the three domains (columns). Each plot shows the mean round-trip time (averaged over some 100 runs) over the packet size for the various frameworks. Please be aware of the log-log scaling of the plots. The performance of aRDx is almost always the best, most dramatically for the host domain, where no other framework can provide zero-copy semantics. Only for small packet sizes (up to 1KB), where the transfer time is dominated by the constant overhead of a framework, is aRDx beaten by aRD’s minimalistic implementation, and in the 1-client case with large packets YARP is about 10% faster, presumably due to a slightly more clever configuration of the TCP sockets. In the 20-client case aRDx beats all other frameworks in the distributed domain by a factor of 2 because it has to transfer the packets sent from the master to the remote client only once and, hence, in each round of the test only 2+20 packets instead of 20+20 packets have to be transmitted over the GigE network. The increased constant overhead of aRDx for the host compared to the process domain is about 5x and, hence, close to the theoretically expected 6x due to the indirect communication through the daemon. Interestingly, although aRDx needs quite complex logic to provide zero-copy semantics in the host domain, its constant overhead is still 4x smaller than that of all other frameworks (except aRD). In what follows we discuss some features and quirks of the other frameworks that we came across. All these frameworks scale very well and roughly linearly with the number of clients. For the process domain YARP can provide zero-copy semantics. In this domain ROS with its nodelets was also expected to show constant transfer times but could do so only after we fixed the implementation (labeled ROS fixed): standard ROS (labeled ROS) completely initializes the memory of newly constructed packets; hence, the transfer time has to scale with the packet size. For the host and the distributed domain YARP and ROS perform very similarly, as both communicate over TCP sockets (side note: for YARP, because of instabilities, we could not use the potentially more efficient multicast and shared memory modes). In case of the host domain and large packets (> 1MB) they even almost reach the performance of the shared memory based transport of aRD, showing that the Linux loopback sockets are very efficient. In all tests the performance of Orocos was worst, although we always tried the optimal parameters. We suspect that this is due to the additional abstraction layer with ACE/TAO in its communication stack. For ROS we found another severe quirk in the host and distributed domain for packet sizes of 10KB to 100KB: there the round-trip time increases dramatically, by 100x. A further analysis showed that this effect disappears completely when adding a pause of at least 100ms between each round of the test (see the inset in the 1-client plot depicting the round-trip time over the pause time for 1KB packets). This means ROS is not really stress resistant.

with 100 clients running in the kHz range. Even for the distributed domain the worst-case round-trip latencies are no longer than 500µs.

IV. CONCLUSIONS

We presented the design considerations and implementation details of the new highly performant, realtime capable, minimalistic and simple communication layer of our aRDx software framework. In an in-depth benchmarking on Linux of the raw communication performance of aRDx and the popular robotic software frameworks ROS, YARP, Orocos and aRD it was shown that aRDx performs excellently in both extreme performance aspects, namely latency and bandwidth, and in part dramatically outperforms the other frameworks. In addition, due to the “stress” character of our tests we could uncover a number of severe quirks in all other frameworks.

Running on QNX, aRDx provides hard realtime performance even for distributed applications.

aRDx is already successfully in use on our advanced and complex humanoid robot Agile Justin. In future publications we will describe its other, high level parts, like the dynamic and flexible but less performant communication layer or the advanced mechanisms for startup and shutdown of large distributed applications.


Conclusions

• modern high level languages (beyond Python or Ruby) have much to offer

• functional programming, advanced static type systems, performance, DSELs ...

• allows radical new system architecture with One Language for (Almost) All

• our new aRDx framework successfully follows this philosophy

• many interesting candidate languages for robotics: Haskell, Scala, Erlang, Clojure, Schemes (Gambit, Racket, ...), OCaml, ...

• Tip: B. A. Tate, “Seven Languages in Seven Weeks”, The Pragmatic Bookshelf, 2010.

Roboticists, go to language wonderland!