University of Hamburg
MIN Faculty
Department of Informatics
11. July 2013
Smart Camera Systems in the Context of Mobile Service Robots - Latest Results
TAMS Oberseminar SoSe 2013
Hannes Bistry
University of Hamburg
Faculty of Mathematics, Informatics and Natural Sciences
Department of Informatics
Technical Aspects of Multimodal Systems
Outline
1. Introduction and Motivation
2. ROS integration
3. Overhead of modular software frameworks
4. OpenCL for future types of smart cameras
5. Conclusions
1 Introduction and Motivation
My research work on one slide
Overall Goal: Evaluate smart camera systems for use on service robots
- create a software architecture for integration
- implement image processing functions and scenarios
- use an existing smart camera as a testing platform
- evaluate the performance and identify the benefits and drawbacks
- draw conclusions in terms of:
  - assessing the usefulness of the current camera system
  - defining requirements for future types of intelligent camera systems
Enhancing Image Acquisition with Smart Cameras
Definition of a smart camera:
- digital camera with integrated computing capabilities
- integration of CPU, DSP or programmable hardware
- standard Ethernet interface, no need for special hardware
Advantages:
- image processing directly on the camera
- transmit image information / regions of interest instead of raw image data
- reduce the amount of data
- reduce the load on the control PC of the robot
Modular Software Architecture
- each processing step is implemented as one element (back-end: GStreamer)
- efficient development and testing of processing strategies
  - plugging together elements instead of low-level programming
  - exchange of single elements possible (portability)
  - elements can be reused in different contexts
- timing analysis (all systems synchronized by NTP)
- TCP elements allow splitting pipelines across the network
- system can be applied to
  - smart camera systems
  - object detection / robot grasping
  - cloud computing
[Figure: pipeline source_element (src pad) -> filter (sink and src pads) -> sink_element (sink pad)]
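The element concept can be illustrated with a minimal sketch in plain Python. This is not GStreamer code: the classes `Element`, `Invert` and `CollectSink` and the helper `run_pipeline` are hypothetical names that only mirror the idea of linking reusable elements and measuring latency from a timestamp carried with each buffer.

```python
import time


class Element:
    """One processing step; pushes its output to the next linked element."""

    def __init__(self, name):
        self.name = name
        self.downstream = None

    def link(self, other):
        # Connect this element's output to the next element's input.
        self.downstream = other
        return other  # returning the target allows chained link() calls

    def process(self, buf):
        return buf  # default behaviour: pass the buffer through unchanged

    def push(self, buf):
        out = self.process(buf)
        if self.downstream is not None and out is not None:
            self.downstream.push(out)


class Invert(Element):
    """Example filter: inverts 8-bit grayscale pixel values."""

    def process(self, buf):
        buf["data"] = bytes(255 - p for p in buf["data"])
        return buf


class CollectSink(Element):
    """Example sink: stores buffers and records the observed latency."""

    def __init__(self, name):
        super().__init__(name)
        self.received = []

    def process(self, buf):
        buf["latency_s"] = time.monotonic() - buf["timestamp"]
        self.received.append(buf)
        return None  # end of the pipeline


def run_pipeline(frames):
    source = Element("source_element")
    sink = CollectSink("sink_element")
    source.link(Invert("filter")).link(sink)
    for data in frames:
        source.push({"data": data, "timestamp": time.monotonic()})
    return sink.received


if __name__ == "__main__":
    for buf in run_pipeline([bytes([0, 128, 255])]):
        print(list(buf["data"]), buf["latency_s"])
```

Replacing `Invert` with another `Element` subclass changes the processing without touching the source or the sink, which is the portability argument made on the slide.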
Results of prior research
Benchmarks on Basler eXcite
- slower than a desktop PC
- an appropriate distribution of functions can nevertheless improve latency and CPU load on the host system
ROI transmission based on face localization
- smart camera is used to scale down images
- extracting ROIs on the smart camera reduces network load and latency
Distributed object detection
- feature vector (about 50 kB) can be sent to many systems in the working environment of a robot
- speedup by searching for many objects in parallel
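The ROI transmission idea can be sketched as a crop on a row-major grayscale buffer. The functions `crop_roi` and `saved_fraction` and the image and ROI sizes in the demo are hypothetical; the source does not state the resolutions used in the experiments.

```python
def crop_roi(image, width, height, x, y, roi_w, roi_h):
    """Cut a rectangular region out of a row-major 8-bit grayscale image."""
    if len(image) != width * height:
        raise ValueError("image size does not match width * height")
    if x < 0 or y < 0 or x + roi_w > width or y + roi_h > height:
        raise ValueError("ROI lies outside the image")
    rows = []
    for row in range(y, y + roi_h):
        start = row * width + x
        rows.append(image[start:start + roi_w])
    return b"".join(rows)


def saved_fraction(width, height, roi_w, roi_h):
    """Fraction of the payload that no longer has to be transmitted."""
    return 1.0 - (roi_w * roi_h) / (width * height)


if __name__ == "__main__":
    # Hypothetical numbers: a VGA frame and a 128 x 128 face region.
    frame = bytes(640 * 480)
    roi = crop_roi(frame, 640, 480, x=200, y=100, roi_w=128, roi_h=128)
    print(len(frame), len(roi), round(saved_fraction(640, 480, 128, 128), 3))
```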
2 ROS integration
Why a ROS GStreamer interface is necessary
- use cameras with GStreamer drivers in ROS
- use features of GStreamer (compression, network data transfer)
For this work, it is especially important to:
- integrate smart cameras in ROS
- use the implemented object detection algorithms in ROS
- compare the efficiency of the developed method to that of ROS
ROS Integration
- existing ROS-GStreamer interface "gscam"
- implemented as a standalone ROS node
- the GStreamer pipeline is configured in the environment variable "GSCAM_CONFIG"
Drawbacks of the current integration:
- restricted to RGB, 24 bpp format
- fixed to one ROS topic
- inefficient due to unnecessary copy operations
- no timestamps exchanged
- supports only one direction: GStreamer → ROS
- run-time control of parameters not possible
Thus the ROS-GStreamer integration was redesigned from scratch.
New ROS Integration - Concept
ROS support is implemented as a set of GStreamer plugins:
- ROSSink: publishes arbitrary GStreamer video streams in ROS
- ROSSrc: subscribes to a ROS topic and "publishes" images inside GStreamer
- ROSParam: exports parameters of a GStreamer pipeline to the ROS parameter server
- ROSSiftfolder: SIFT-based object detection, publishes transformations in ROS
Wherever possible, memory buffers are shared between GStreamer and ROS (zero-copy).
ROSSink
Publish arbitrary GStreamer video streams in ROS.
gst-launch videotestsrc ! capsfilter caps="video/x-raw-rgb,bpp=24" ! rossink topic=testvideo
Will generate the following topics:
/testvideo/camera_info
/testvideo/image_raw
/testvideo/image_raw/compressed
/testvideo/image_raw/compressed/parameter_descriptions
/testvideo/image_raw/compressed/parameter_updates
/testvideo/image_raw/compressedDepth
/testvideo/image_raw/compressedDepth/parameter_descriptions
/testvideo/image_raw/compressedDepth/parameter_updates
/testvideo/image_raw/theora
/testvideo/image_raw/theora/parameter_descriptions
/testvideo/image_raw/theora/parameter_updates
ROSSink (cont.)
Use cases:
- use arbitrary GStreamer-compliant cameras in ROS
- integration of smart cameras
- publish preprocessed video
- use a camera on a remote system, using encoders from GStreamer (h264)
Format support: RGB, YUV, gray, JPEG
Timestamp and metadata conversion
ROSSrc
Subscribe to a ROS topic and "publish" images inside GStreamer:
gst-launch rossrc topic=/testvideo/image_raw ! ffmpegcolorspace ! ximagesink sync=0
Use cases:
- use ROS-compliant cameras in GStreamer (including virtual cameras)
- record ROS image topics (with compression)
- drop-in replacement for Image View (ROS)
  - displaying VGA at 60 Hz on a Core i5:
    - GStreamer: 19 % CPU load
    - Image View: 57 % CPU load
ROSSiftfolder
SIFT-based Object detection, publishes transformations in ROS
- all objects given as image files in a folder are detected
- input: feature vector
- currently, one dimension is encoded in the filename (like box100.jpg)
- planned feature: database access
- if more than 4 matches are found, the 3D pose is calculated
Published topics:
- ROS visualization markers
- tf messages from camera frame to object frame
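The filename convention and the match threshold can be sketched as follows. The parsing rule is an assumption: the slide only gives `box100.jpg` as an example, so the function below treats the trailing digits of the base name as the encoded dimension, with no unit implied. `parse_model_filename` and `pose_can_be_estimated` are hypothetical helper names, not part of ROSSiftfolder.

```python
import os
import re

MIN_MATCHES = 4  # the slide states that a pose is computed for more than 4 matches


def parse_model_filename(path):
    """Split a name such as 'box100.jpg' into ('box', 100).

    Assumption: the trailing digits of the base name are the encoded
    dimension; the unit is not given in the source.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    match = re.fullmatch(r"(.*?)(\d+)", stem)
    if match is None:
        raise ValueError("no dimension encoded in %r" % path)
    return match.group(1), int(match.group(2))


def pose_can_be_estimated(num_matches):
    """True if enough feature matches were found to attempt a pose estimate."""
    return num_matches > MIN_MATCHES


if __name__ == "__main__":
    print(parse_model_filename("models/box100.jpg"), pose_can_be_estimated(5))
```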
Example: Displaying Objects with RViz
3 Overhead of modular software frameworks
Overhead of modular software frameworks
Aim of the following experiments:
- test the GStreamer-ROS interface
- compare the concept of integrating smart cameras with the "state of the art" methods of implementing interacting software modules
- answer the question whether it makes sense to switch over completely from GStreamer to ROS (especially on x86-based cameras)
Background - Communication between software modules
If different modules exchange data, we must distinguish between different modes:
1. Intra-thread communication (fastest, only 1 to 1)
2. Intra-process inter-thread communication (fast, 1 to N, only single process, only local)
3. Inter-process communication (1 to N, only local)
4. Inter-process communication using network protocols (1 to N, distributed systems)
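Mode 2 can be sketched with Python threads and queues: inside one process, a producer can hand the same buffer to N consumers by reference, so no pixel data is copied. The function `fan_out` is a hypothetical illustration and not how GStreamer implements its multi-threaded elements.

```python
import queue
import threading


def fan_out(frame, num_consumers):
    """Hand one buffer to several consumer threads without copying it."""
    queues = [queue.Queue() for _ in range(num_consumers)]
    seen_ids = []
    lock = threading.Lock()

    def consumer(q):
        buf = q.get()  # blocks until the producer has queued the reference
        with lock:
            seen_ids.append(id(buf))

    threads = [threading.Thread(target=consumer, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for q in queues:
        q.put(frame)  # only the reference is queued, not the pixel data
    for t in threads:
        t.join()
    return seen_ids


if __name__ == "__main__":
    frame = bytearray(6 * 1024 * 1024)  # roughly one 6 MB FullHD image
    ids = fan_out(frame, 3)
    print(all(i == id(frame) for i in ids))
```

Modes 3 and 4 cannot share a reference this way because each process has its own address space, so the data has to be copied, serialized or placed in shared memory.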
Benchmarking different communication modes
Background - Communication between software modules
Hardware: Intel Core i5-3570
First test: intra-thread communication, FullHD image (6 MB)
Latency = 0.001 ms
Thus:
- The overhead of intra-thread communication in GStreamer can be neglected in further tests.
- The different methods can be tested "inside" the pipelines shown on the previous slide.
In the next test: ROS, ROS (shared memory), GStreamer TCP, GStreamer multi-thread
Benchmarking different communication modes
[Bar chart: latency in ms by message size]
          4 Bytes   1 MB    6 MB
ROS       0.31      2.96    14.83
ROS(s)    0.31      2.63    12.6
TCP       0.2       0.74    3.55
MT        0.05      0.05    0.06
Benchmarks of CPU load follow later.
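To put the latency figures of this slide into perspective, they can be converted into an effective transfer rate (message size divided by latency). The sketch below assumes the chart values are listed series by series (ROS, ROS(s), TCP, MT) and treats 1 MB as 1.0 and 6 MB as 6.0; `throughput_mb_per_s` is a hypothetical helper.

```python
# Latency in ms per transport and message size, read from the chart on this slide.
LATENCY_MS = {
    "ROS": {"1 MB": 2.96, "6 MB": 14.83},
    "ROS(s)": {"1 MB": 2.63, "6 MB": 12.6},
    "TCP": {"1 MB": 0.74, "6 MB": 3.55},
    "MT": {"1 MB": 0.05, "6 MB": 0.06},
}
SIZE_MB = {"1 MB": 1.0, "6 MB": 6.0}


def throughput_mb_per_s(transport, size_label):
    """Effective rate in MB/s: message size divided by measured latency."""
    latency_s = LATENCY_MS[transport][size_label] / 1000.0
    return SIZE_MB[size_label] / latency_s


if __name__ == "__main__":
    for name in LATENCY_MS:
        print(name, round(throughput_mb_per_s(name, "6 MB")))
```

For the multi-threaded case the latency hardly depends on the message size, so the computed rate is not a real bandwidth; it only reflects that no data is moved.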
Benchmarks - Comparison to results from another party
Einhorn et al., MIRA - Middleware for Robotic Applications, IROS 2012
- similar results for ROS on a Core i7 test system
- GStreamer TCP protocol can probably outperform the state-of-the-art frameworks
- GStreamer multi-threading is extremely fast
Benchmarks - Results
- state-of-the-art frameworks induce overhead
- where possible, multi-threading should be used
- problems:
  - starting and stopping "nodes" not possible by default
  - only local communication
  - (imho: debugging a standalone executable is easier than debugging a part of a big system)
Question: Is there a more efficient way of inter-process communication?
Proof of Concept:
Inter-process Communication using shared memory:
- background: each process has its own address space
- shared memory is a mechanism of modern operating systems
- one memory region can be mapped into several processes
- synchronization by mutexes and semaphores
Implemented GStreamer elements: SHMSrc + SHMSink
- "downstream buffer allocation"
- the previous element in the pipeline writes directly into shared memory
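The shared memory mechanism can be tried out with Python's standard `multiprocessing.shared_memory` module (Python 3.8 or newer). This is only a sketch of the operating system feature, not the SHMSrc/SHMSink implementation: `create_block` and `read_frame` are hypothetical helpers, the reader would normally run in a second process, and the mutex or semaphore synchronization mentioned on the slide is left out.

```python
from multiprocessing import shared_memory


def create_block(payload):
    """Create a shared memory block and write the payload directly into it."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def read_frame(name, size):
    """Attach to an existing block by name and copy out `size` bytes."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        # The block may be rounded up to a page size, so slice explicitly.
        return bytes(shm.buf[:size])
    finally:
        shm.close()


if __name__ == "__main__":
    frame = bytes(range(256)) * 4
    block = create_block(frame)
    try:
        print(read_frame(block.name, len(frame)) == frame)
    finally:
        block.close()
        block.unlink()  # release the block once no process needs it
```

Only the block name and the payload size have to be communicated to the receiver; the image data itself stays in place.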
Benchmarking different communication modes
[Bar chart: latency in ms by message size]
          4 Bytes   1 MB    6 MB
ROS       0.31      2.96    14.83
ROS(s)    0.31      2.63    12.6
TCP       0.2       0.74    3.55
SHM       0.17      1       5.72
SHM(s)    0.12      0.18    0.44
MT        0.05      0.05    0.06
Benchmarks - CPU-load publisher (base load 25%)
[Bar chart: CPU load of the publisher in % by message size]
          4 Bytes   1 MB    6 MB
ROS       1.3       11.7    50.5
ROS(s)    1         11.1    45.6
TCP       0.5       5.7     32.4
SHM       0.7       8.3     47
SHM(s)    0.7       6.8     39.5
MT        0.4       5.1     25.7
Benchmarks - CPU-load receiver
[Bar chart: CPU load of the receiver in % by message size]
          4 Bytes   1 MB    6 MB
ROS       1.2       4.5     16.6
ROS(s)    0.8       4.4     16.3
TCP       0.4       1.3     10.2
SHM       0.6       0.7     0.7
SHM(s)    0.6       0.7     0.6
Additional Test: Intel Atom
Test on Intel Atom N270 platform:
Sending a 6 MB FullHD image inside ROS: 132 ms
This leads to unacceptable delays and CPU loads.
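A short calculation shows why 132 ms per image is a problem. It assumes that transfers are not overlapped, so the transfer time alone bounds the achievable frame rate; `max_frame_rate_hz` is a hypothetical helper and the Core i5 value is the ROS latency for 6 MB from the earlier benchmark.

```python
ATOM_ROS_6MB_MS = 132.0  # Intel Atom N270, from this slide
I5_ROS_6MB_MS = 14.83    # Intel Core i5, ROS latency for 6 MB from the benchmark


def max_frame_rate_hz(transfer_ms):
    """Upper bound on frames per second if every frame needs `transfer_ms`."""
    return 1000.0 / transfer_ms


if __name__ == "__main__":
    print(round(max_frame_rate_hz(ATOM_ROS_6MB_MS), 1))
    print(round(max_frame_rate_hz(I5_ROS_6MB_MS), 1))
```

Under this assumption the Atom platform stays below 8 frames per second for FullHD images before any image processing has been done.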
Alternatives inside the ROS-Universe
ROS also supports shared memory:
- by defining a new datatype
- no compatibility with existing nodes
ROS introduced so-called "nodelets":
- zero-copy, pointer-based operation
- started within one "node"
- no standalone operation, no inter-process communication, no compatibility with existing nodes
In GStreamer, only the element for data transport needs to be exchanged to support new methods.
Consequences
- ROS offers great functionality / support for robot hardware
- drawbacks are not that critical for low-bandwidth data
No reason to propose "yet another robot framework".
Suggestion:
- process high-bandwidth data outside of ROS
- publish the results in ROS
This way, the advantages of both frameworks can be combined.
4 OpenCL for future types of smart cameras
GPU - Processing
Prior results show that current smart camera systems are too slow to accomplish complex image processing tasks.
Drawbacks:
- embedded processors are slower than high-performance desktop PCs
- limited size / heat dissipation capabilities
One way of solving these problems would be to integrate parallel computing architectures into smart cameras.
Can GPU-based computing be integrated into the developed modular concept?
GPU - Processing
Can GPU-based processing have advantages over CPU-based processing?
- high-performance computing
  - many shaders work in parallel
  - extremely high memory bandwidth
- different ways of programming GPU hardware
  - GLSL (low-level programming of shaders)
  - CUDA (Nvidia)
  - OpenCL (Apple, Intel, Nvidia, AMD)
- OpenCL is chosen as it is an open standard and widely supported
GPU - Processing
General drawback of GPU-based programming:
- data needs to be uploaded / downloaded:
  - host memory ↔ video board memory
  - but the bandwidth is quite high
  - download could be skipped if the data is displayed
- only suitable for algorithms that can be parallelized without a lot of data dependencies
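The transfer overhead can be estimated with a simple additive model: upload time plus kernel time plus, if the result is needed on the host, download time. All numbers in the demo (bandwidth, kernel time, image size) are hypothetical placeholders rather than measurements from the talk, and the model assumes the downloaded result is as large as the uploaded image.

```python
def gpu_element_time_ms(num_bytes, kernel_ms, bandwidth_gb_s, download=True):
    """Estimate upload + kernel + optional download time for one image.

    Assumption: upload and download move the same number of bytes at the
    same bandwidth and do not overlap with the kernel execution.
    """
    transfer_ms = num_bytes / (bandwidth_gb_s * 1e9) * 1000.0
    total = transfer_ms + kernel_ms
    if download:
        total += transfer_ms
    return total


if __name__ == "__main__":
    # Hypothetical values: 6 MB image, 2 ms kernel, 4 GB/s host-to-GPU link.
    print(gpu_element_time_ms(6_000_000, 2.0, 4.0))
    print(gpu_element_time_ms(6_000_000, 2.0, 4.0, download=False))
```

Skipping the download, as in the display case mentioned on the slide, removes one of the two transfer terms.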
GPU - Processing
Integration into the GStreamer framework:
- upload and download within each element, no option to pass data to the next element
- code for the GPU is loaded from an external .cl file
  - new algorithms can be used without installation
- elements:
  - GstCL (arbitrary functions on gray-scale images)
  - Clrgbafilter (arbitrary functions on color images)
  - clcolorspace (colorspace conversion from YUV to RGBA)
  - clundistort (correction of distortion, including YUV to RGBA conversion)
  - clremap4yuv (panorama stitching)
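Operations such as distortion correction and remapping fit the GPU well because every output pixel is computed independently from a lookup position in the source image. The sketch below shows that structure in plain Python instead of OpenCL. The functions `build_shift_map` and `remap` are hypothetical, and the shift map is only a stand-in for a lookup table that would normally come from a camera calibration.

```python
def build_shift_map(width, height, dx, dy):
    """Stand-in lookup table: output pixel (x, y) reads source (x + dx, y + dy)."""
    return [[(x + dx, y + dy) for x in range(width)] for y in range(height)]


def remap(src, width, height, lookup, fill=0):
    """Nearest-neighbour remap of a row-major 8-bit image.

    Every output pixel depends on exactly one source pixel, so the two loops
    could be replaced by one GPU work-item per pixel.
    """
    out = bytearray(width * height)
    for y in range(height):
        for x in range(width):
            sx, sy = lookup[y][x]
            if 0 <= sx < width and 0 <= sy < height:
                out[y * width + x] = src[sy * width + sx]
            else:
                out[y * width + x] = fill
    return bytes(out)


if __name__ == "__main__":
    image = bytes([1, 2, 3, 4, 5, 6])  # 3 x 2 test image
    print(list(remap(image, 3, 2, build_shift_map(3, 2, 1, 0))))
```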
Ximea CURRERA G
- AMD x86 APU (2 CPU cores at 1.6 GHz, GPU part with 80 shaders at 500 MHz)
Ximea CURRERA G (cont.)
Features:
- Windows Embedded + Linux
- 2 GB RAM
- sensors from WVGA up to 5 MPixel
- Gigabit Ethernet + other interface types
- available mid 2013
- OpenCL compliant
Ximea CURRERA G (cont.)
Unfortunately, no sample is available. Therefore this camera is simulated in the tests using an AMD GPU:
- Radeon 5450
- 80 shaders @ 400 MHz
This allows a rough estimation of the processing power to expect from the Ximea CURRERA G.
GPU - Processing
Results for correction of lens distortion, 1384x1032:
[Bar chart: processing time in ms by input format]
              RGB     YUV
CPU i5        9       13.1
AMD 5450      30.9    28.9
NVS295        14.9    34.6
Quadro 600    6.7     8.9
Quadro 2000   5.3     5.9
GTX 670       3.2     3
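The processing times on this slide can be turned into frame rates and speedups relative to the CPU. The assignment of values to devices is an assumption: the extracted chart lists the numbers without labels, and the sketch reads them device by device in legend order with the RGB value first. `frames_per_second` and `speedup_over_cpu` are hypothetical helpers.

```python
# Processing time in ms as (RGB, YUV), read from the chart on this slide.
TIMES_MS = {
    "CPU i5": (9.0, 13.1),
    "AMD 5450": (30.9, 28.9),
    "NVS295": (14.9, 34.6),
    "Quadro 600": (6.7, 8.9),
    "Quadro 2000": (5.3, 5.9),
    "GTX 670": (3.2, 3.0),
}
RGB, YUV = 0, 1


def frames_per_second(device, fmt):
    """Frame rate if the device only performs this one operation per frame."""
    return 1000.0 / TIMES_MS[device][fmt]


def speedup_over_cpu(device, fmt):
    """Values above 1 mean the device is faster than the Core i5."""
    return TIMES_MS["CPU i5"][fmt] / TIMES_MS[device][fmt]


if __name__ == "__main__":
    for name in TIMES_MS:
        print(name, round(frames_per_second(name, RGB), 1),
              round(speedup_over_cpu(name, RGB), 2))
```

Under this reading, the Radeon 5450 used as a stand-in for the CURRERA G is slower than the CPU but still exceeds 30 frames per second for this operation.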
Results of GPU Computing
- a mid-range GPU can outperform a fast CPU
- CURRERA G will be slower than state-of-the-art hardware
- but it will provide a usable frame rate
- for comparison: Basler eXcite needs >600 ms for this operation
- therefore it is usable in additional scenarios
- other tests (Sobel, Laplace) confirm these results
5 Conclusions
Summary of results
- advanced ROS integration
  - more features
  - better efficiency
- benchmarks on inter-process communication
  - high overhead in ROS
  - provided methods to solve this problem
- benchmarks on OpenCL-based image processing
  - still slower than desktop hardware
  - usable in scenarios where Basler eXcite is too slow