

Deep Learning to Identify and Predict Objects in the Environment

1 School of Arts, Media and Engineering (AME), Arizona State University
2 School of ECEE, Arizona State University

Sensor Signal and Information Processing Center, https://sensip.asu.edu

This program is funded by NSF CISE award 1659871.

MOTIVATION

❑ Use advances in deep learning to assist with the everyday task of remembering where objects were placed

❑ Employ advanced power-saving techniques to substantially improve feasibility on low-power devices

PROJECT AIM

❑ This project focuses on the application of these models, the feasibility and implementation of deep learning tasks on low-power devices (Microsoft HoloLens), and future applications of this system.

PROBLEM STATEMENT

❑ To create a platform that helps individuals locate objects in their immediate environment via computer vision and mobile systems

CHALLENGES

❑ Detecting objects in the environment quickly

❑ Doing so efficiently and reliably on both the client and the server

❑ Leveraging available hardware to accomplish the task

METHOD

❑ YOLOv3: model for real-time object detection

❑ Flask: client/server integration and secure data transfer

❑ NVIDIA CUDA: server-side GPU acceleration

SERVER RESULTS

[Figure: "Time to Complete Standard Test Image", YOLOv3. x-axis: object detection framework (OpenCV + DNN, Darknet (CPU), Darknet (CPU)); y-axis: time to frame completion (seconds), 0 to 20. Data labels: 1.067643, 0.152976, 17.192.]

QUALITATIVE RESULTS

❑ We found that YOLOv3 fit the project's needs best. Tiny YOLO may be more feasible for serverless solutions, but accuracy remains an issue.

❑ The client software used very few hardware resources; future implementations could use this spare capacity to expand existing capabilities.

REAL WORLD OBJECT DETECTION

REFERENCES

[1] M. Buckler, P. Bedoukian, S. Jayasuriya, and A. Sampson, “EVA2: Exploiting temporal redundancy in live computer vision,” in 2018 ACM/IEEE 45th Annual Int. Sym. on Comp. Architecture, 2018, pp. 533–546.

[2] Y. Zhu, A. Samajdar, M. Mattina, and P. Whatmough, “Euphrates: Algorithm-SoC co-design for low-power mobile continuous vision,” arXiv:1803.11232, 2018.

[3] G. Evans, J. Miller, M. I. Pena, et al., “Evaluating the Microsoft HoloLens through an augmented reality assembly application,” in Degraded Environments: Sensing, Processing, and Display, vol. 10197.

[4] X. Chen and A. Gupta, “An implementation of Faster RCNN with study for region sampling,” arXiv:1702, 2017.

[5] F. N. Iandola, S. Han, M. W. Moskewicz, et al., “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv:1602.07360, 2016.

[Figure: "Time to Complete Standard Test Image". x-axis: object detection framework (OpenCV + DNN, Darknet (CPU), Darknet (GPU)); y-axis: time to frame completion (seconds), 0 to 1.2; series: YOLOv3, Tiny YOLO, Faster RCNN. Data labels: 0.67, 0.07, 0.88.]
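The charts report "time to frame completion" for one standard test image. A minimal sketch of how such a number might be measured is shown here; it is not the poster's benchmark code. The function name `time_to_frame_completion`, the warm-up step, and the `fake_detector` stand-in are all assumptions made for illustration, and a real run would pass a function that invokes the actual detection framework.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_to_frame_completion(detect: Callable[[bytes], object],
                             frame: bytes,
                             runs: int = 5,
                             warmup: int = 1) -> Dict[str, float]:
    """Time `detect(frame)` and return summary statistics in seconds."""
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls absorb one-time costs (model load, caches) so they
    # do not distort the measured runs.
    for _ in range(warmup):
        detect(frame)

    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        detect(frame)
        samples.append(time.perf_counter() - start)

    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def fake_detector(frame: bytes) -> list:
    """Hypothetical stand-in for a real detector; it only sleeps briefly."""
    time.sleep(0.01)
    return [("cup", 0.9)]


if __name__ == "__main__":
    stats = time_to_frame_completion(fake_detector, b"\x00" * 16, runs=3)
    print({name: round(value, 3) for name, value in stats.items()})
```

Reporting the median alongside the minimum and maximum is one way to keep a single slow run from dominating the comparison between frameworks.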

[Figure: "Time to Complete Standard Test Image". x-axis: object detection framework; y-axis: 0 to 1.2; series: Accuracy, Time to Frame Completion (seconds). Data labels, read in series order:]

Model         Accuracy   Time to frame completion (s)
YOLOv3        0.96       0.67
Tiny YOLO     0.68       0.07
Faster RCNN   0.97       0.89
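The chart's data labels (0.96, 0.68, 0.97 for accuracy and 0.67, 0.07, 0.89 for time) describe a trade-off between accuracy and speed. The sketch below shows one way that trade-off could be turned into a model choice under a time budget. The pairing of labels to models assumes the labels appear in the same order as the model names, and `ModelStats`, `CHART_LABELS`, and `pick_model` are hypothetical names, not part of the project.

```python
from typing import Dict, NamedTuple, Optional


class ModelStats(NamedTuple):
    accuracy: float  # accuracy data label from the chart
    seconds: float   # time to frame completion, in seconds


# Assumed pairing of chart labels to models (label order = model order).
CHART_LABELS: Dict[str, ModelStats] = {
    "YOLOv3": ModelStats(accuracy=0.96, seconds=0.67),
    "Tiny YOLO": ModelStats(accuracy=0.68, seconds=0.07),
    "Faster RCNN": ModelStats(accuracy=0.97, seconds=0.89),
}


def pick_model(stats: Dict[str, ModelStats],
               max_seconds: float,
               min_accuracy: float = 0.0) -> Optional[str]:
    """Return the most accurate model within the time budget, or None."""
    candidates = [
        (s.accuracy, -s.seconds, name)
        for name, s in stats.items()
        if s.seconds <= max_seconds and s.accuracy >= min_accuracy
    ]
    if not candidates:
        return None
    # max() on the tuples prefers higher accuracy, then lower time.
    return max(candidates)[2]


if __name__ == "__main__":
    for budget in (0.1, 0.7, 1.0):
        print(budget, pick_model(CHART_LABELS, max_seconds=budget))
```

Under these assumed figures, a tight budget leaves only Tiny YOLO, while a budget near 0.7 seconds admits YOLOv3 at almost the accuracy of Faster RCNN, which is consistent with the poster's preference for YOLOv3.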
