
Live Demonstration: Deep Learning-Based Visual Tracking of Multiple Objects on a Low-Power Embedded System

Beatriz Blanco-Filgueira∗, Daniel García-Lesta†, Mauro Fernández-Sanjurjo‡, Víctor M. Brea§ and Paula López¶
Centro Singular de Investigación en Tecnoloxías da Información, Universidade de Santiago de Compostela
Rúa de Jenaro de la Fuente Domínguez, 15782 Spain
Email: ∗[email protected], †[email protected], ‡[email protected], §[email protected], ¶[email protected]

Abstract—Multiple object visual tracking of objects detected in real time using a low-power embedded solution is shown. The proposal is implemented on an NVIDIA Jetson TX2 development kit, demonstrating the feasibility of deep learning techniques for IoT and mobile edge computing applications.

I. FUNDAMENTALS

A proposal for multiple object tracking was developed by combining the Hardware-Oriented PBAS (HO-PBAS) algorithm [1], which detects moving objects, with the GOTURN CNN-based tracker [2]. Instead of more robust techniques for extending single-object tracking to multiple objects, which sometimes suffer from low speed [3], a more straightforward approach was used to build an end-to-end solution with real-time performance on a low-power embedded platform [4].
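To make the detection-plus-tracking scheme concrete, the sketch below pairs a background-subtraction detector with one single-object GOTURN tracker per detected object. Since HO-PBAS [1] is a custom hardware-oriented algorithm with no public package, OpenCV's MOG2 subtractor stands in for the detection stage here, and GOTURN is taken from OpenCV's tracking module (opencv-contrib-python, which needs the goturn.prototxt and goturn.caffemodel files in the working directory). This is a minimal sketch of the general scheme, not the authors' implementation [4].

```python
# Minimal sketch of the detection-plus-tracking scheme, assuming:
#  - OpenCV's MOG2 background subtractor stands in for HO-PBAS [1],
#  - GOTURN [2] comes from opencv-contrib-python with its model files
#    (goturn.prototxt, goturn.caffemodel) in the working directory.
import cv2

bg_subtractor = cv2.createBackgroundSubtractorMOG2()
trackers = []  # one single-object GOTURN tracker per detected object

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Detection stage: foreground mask -> candidate bounding boxes.
    mask = bg_subtractor.apply(frame)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    detections = [cv2.boundingRect(c) for c in contours
                  if cv2.contourArea(c) > 500]  # illustrative area threshold

    # Tracking stage: update every live tracker on the new frame.
    for tracker in trackers:
        ok, box = tracker.update(frame)
        if ok:
            x, y, w, h = (int(v) for v in box)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Spawn a tracker per new detection (overlap test against existing
    # tracks omitted for brevity; a real system must deduplicate).
    for box in detections:
        tracker = cv2.TrackerGOTURN_create()
        tracker.init(frame, box)
        trackers.append(tracker)

    cv2.imshow("multiple object tracking", frame)
    if cv2.waitKey(1) == 27:  # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```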

II. DEMONSTRATION SETUP

The experimental set-up during execution is shown in Fig. 1. A 3S LiPo battery powers a Jetson TX2 development kit [5]. The platform is remotely controlled by a tablet over a WiFi connection, which is also used to visualize live performance. The NVIDIA Jetson TX2 is an AI SoC for inference at the edge, offered both as an 85 g, 50 × 87 mm single-board computer and as a ready-to-use development kit. The latter includes a 5 MP CSI camera, WLAN and Bluetooth connectivity, and several peripherals, so end-to-end development can be carried out from an early stage. It is powered by the NVIDIA Pascal GPU architecture and consumes between 5 and 15 W, far below discrete GPUs, which range between 150 and 250 W. Consequently, the Jetson TX2 was chosen for its power efficiency, small dimensions, and high throughput.
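As a point of reference for reproducing such a setup, the on-board CSI camera on Jetson boards is usually accessed through a GStreamer pipeline rather than a plain V4L2 device. The snippet below shows a common capture pattern built around JetPack's nvarguscamerasrc element; the element choice, resolution, and frame rate are assumptions about a typical configuration, not details taken from the demo.

```python
# Hypothetical capture path for the development kit's on-board CSI
# camera; nvarguscamerasrc is JetPack's camera source element, and the
# resolution/frame rate below are illustrative, not the demo's values.
import cv2

GST_PIPELINE = (
    "nvarguscamerasrc ! "
    "video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink"
)

cap = cv2.VideoCapture(GST_PIPELINE, cv2.CAP_GSTREAMER)
if not cap.isOpened():
    raise RuntimeError("failed to open the CSI camera pipeline")

ok, frame = cap.read()  # frames from here feed the tracking loop above
```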

III. VISITOR EXPERIENCE

The camera of the Jetson TX2 development kit captures a scene while the multiple object visual tracking algorithm runs on the platform. The camera output and the tracking result are displayed on a tablet, which the visitor can use to monitor the live processing capability. Visitors can thus experience the complete end-node performance, which demonstrates the feasibility of deep learning techniques for real-time decision making based on visual tasks in an embedded solution that consumes less than 15 W.
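The paper does not specify how the frames reach the tablet beyond the WiFi link. One simple possibility, sketched below purely as an assumption, is to serve the annotated frames as an MJPEG stream over HTTP with Flask, so any browser on the tablet can display them; Flask, the port, and the /stream route are illustrative choices, not the authors' method.

```python
# One hypothetical way to show the annotated output on the tablet:
# serve frames as an MJPEG stream over HTTP with Flask. Flask, the
# port, and the /stream route are illustrative, not the demo's design.
import cv2
from flask import Flask, Response

app = Flask(__name__)
cap = cv2.VideoCapture(0)  # or the GStreamer pipeline shown earlier

def mjpeg():
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # In the demo this frame would carry the drawn tracking boxes.
        ok, jpg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        yield (b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"
               + jpg.tobytes() + b"\r\n")

@app.route("/stream")
def stream():
    return Response(mjpeg(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)  # reachable from the tablet over WiFi
```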

Fig. 1. Experimental set-up: live visual tracking is displayed on a tablet, which also controls the battery-powered Jetson TX2 development kit used to capture and process the scene.


ACKNOWLEDGMENT

This work has been partially supported by the Spanish government project RTI2018-097088-B-C32 MICINN (FEDER), the Consellería de Cultura, Educación e Ordenación Universitaria (accreditation 2016-2019, ED431G/08, and reference competitive group 2017-2020, ED431C 2017/69) and the European Regional Development Fund (ERDF).

REFERENCES

[1] D. García-Lesta, P. López, V. M. Brea, and D. Cabello, "In-pixel analog memories for a pixel-based background subtraction algorithm on CMOS vision sensors," International Journal of Circuit Theory and Applications, vol. 46, no. 9, pp. 1631–1647, 2018.

[2] D. Held, S. Thrun, and S. Savarese, "Learning to Track at 100 FPS with Deep Regression Networks," in European Conference on Computer Vision (ECCV), 2016, pp. 749–765.

[3] M. Keuper, S. Tang, Y. Zhongjie, B. Andres, T. Brox, and B. Schiele, "A multi-cut formulation for joint segmentation and tracking of multiple objects," arXiv:1607.06317, 2016.

[4] B. Blanco-Filgueira, D. García-Lesta, M. Fernández-Sanjurjo, V. M. Brea, and P. López, "Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications," arXiv:1808.01356, 2018.

[5] "NVIDIA Jetson. The embedded platform for autonomous everything." [Online]. Available: https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/