TRANSCRIPT
“System support for efficient multi-resolution visual computing on low power embedded systems”
Robert LiKamWa [Phoenix Area Group] - September 10, 2020
tinyML Talks Sponsors
Additional Sponsorships available – contact [email protected] for info
Confidential Presentation ©2020 Deeplite, All Rights Reserved
VISIT bit.ly/Deeplite FOR MORE INFO
WE USE AI TO MAKE OTHER AI FASTER, SMALLER AND MORE POWER EFFICIENT
Automatically compress SOTA models like MobileNet to <200KB with little to no drop in accuracy for inference on resource-limited MCUs
Reduce model optimization trial & error from weeks to days using Deeplite's design space exploration
Deploy more models to your device without sacrificing performance or battery life with our easy-to-use software
Copyright © EdgeImpulse Inc.
TinyML for all developers
Get your free account at http://edgeimpulse.com
[Diagram: Edge Impulse workflow - Dataset → Impulse → Edge Device]
- Acquire valuable training data securely
- Enrich data and train ML algorithms
- Test impulse with real-time device data flows
- Embedded and edge compute deployment options
Real sensors in real time. Open source SDK.
Maxim Integrated: Enabling Edge Intelligence
Sensors and Signal Conditioning
Health sensors measure PPG and ECG signals critical to understanding vital signs. Signal chain products enable measuring even the most sensitive signals.
Low Power Cortex M4 Micros
The biggest (3MB flash and 1MB SRAM) and the smallest (256KB flash and 96KB SRAM) Cortex M4 microcontrollers enable algorithms and neural networks to run at wearable power levels
Advanced AI Acceleration
AI inferences at a cost and power point that makes sense for the edge. Computation capability to give vision to the IoT, without the power cables. Coming soon!
Wide range of ML methods: GBM, XGBoost, Random
Forest, Logistic Regression, Decision Tree, SVM, CNN, RNN,
CRNN, ANN, Local Outlier Factor, and Isolation Forest
Easy-to-use interface for labeling, recording, validating, and
visualizing time-series sensor data
On-device inference optimized for low latency, low power
consumption, and a small memory footprint
Supports Arm® Cortex™-M0 to M4 class MCUs
Automates complex and labor-intensive processes of a
typical ML workflow – no coding or ML expertise required!
QEEXO AUTOML: END-TO-END MACHINE LEARNING PLATFORM
Qeexo AutoML for Embedded AI: Automated Machine Learning Platform that builds tinyML solutions for the Edge using sensor data
Target Markets/Applications: Industrial Predictive Maintenance, Smart Home, Wearables, Automotive, Mobile, IoT
For a limited time, sign up to use Qeexo AutoML at automl.qeexo.com for FREE to bring intelligence to your devices!
Extensive, highly-optimized feature spaces
Super-compact code for MCUs and Gateways
Sensor selection and placement analysis
AI-driven component specs
Automated data quality checks
Data collection, augmentation & labeling services
No open source - clean licensing
Next-Generation AI Tools for
Product Development
Get started w/ a special tinyML Talks offer for corporate customers: https://reality.ai/get-started
SynSense (formerly known as aiCTX) builds ultra-low-power (sub-mW) sensing and inference hardware for embedded, mobile and edge devices. We design systems for real-time always-on smart sensing, for audio, vision, bio-signals and more.
https://SynSense.ai
Next tinyML Talks
Tuesday, September 15:
- Hiroshi Doyu (Senior Researcher, Ericsson Research): “TinyML as-a-Service - Bringing ML inference to the deepest IoT Edge”
- Vikrant Tomar (Founder & CTO, Fluent.ai Inc.) and Sam Myer (Lead ML Developer, Fluent.ai Inc.): “Speech Recognition on low power devices”
Webcast start time is 8 am Pacific time. Each presentation is approximately 30 minutes in length.
Please contact [email protected] if you are interested in presenting
Robert LiKamWa is an assistant professor at Arizona State University, appointed in the School of Arts, Media and Engineering (AME) and the School of Electrical, Computer and Energy Engineering (ECEE). At ASU, LiKamWa directs Meteor Studio (http://meteor.ame.asu.edu), which explores the research and design of software and hardware for mobile Augmented Reality, Virtual Reality, Mixed Reality, and visual computing systems, and their ability to help people tell their stories. Prior to coming to ASU, LiKamWa completed his bachelor's, master's and doctoral degrees at Rice University in the Department of Electrical and Computer Engineering. LiKamWa has received an NSF CAREER Award, a Google Faculty Research Award, and a Best Paper Award at MobiSys 2013.
System support for efficient multi-resolution
visual computing on mobile systems (TinyML, 9/10/2020)
System support for efficient multi-resolution
visual computing on mobile systems
Robert LiKamWa
http://meteor.ame.asu.edu
Arizona State University
Slide 12
Vision is power hungry
especially at high resolutions
Slide 13: Vision doesn’t always need high-resolution images
0.1 MP suffices for object recognition (e.g., GoogLeNet)
[Figure: translational error vs. resolution]
Slide 14: We can exploit this if image sensing is energy-proportional
[Figure: ideal energy-per-frame vs. resolution, with (0.1 MP, 3 fps) far cheaper than (1 MP, 3 fps)]
Slide 15
But it’s not: measured power is 320 mW at 1 MP and 280 mW at 0.1 MP, far from energy-proportional.
Energy characterization and optimization of image sensing toward continuous mobile vision [MobiSys ’13]
Slide 16
[Measurement setup: camera module driven by a programmable clock over I2C, with power-rail sense resistors read by an NI DAQ device]
Slide 17
Image sensor power breakdown
[Figure: analog, digital, and PLL power during the active period (pixel readout) and idle period (frame spacing)]
Slide 18
Energy per Frame
[Figure: energy per frame split between the active readout period and the idle frame spacing]
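The active/idle split above gives a simple energy-per-frame model. The sketch below is illustrative, not taken from the paper; the power and timing numbers fed into it are assumptions.

```c
/* Illustrative energy-per-frame model: the sensor draws a high active
 * power during pixel readout and a lower (but non-zero) idle power
 * during the frame spacing. All numbers passed in are assumed values,
 * not measurements from the MobiSys '13 study. */
#include <assert.h>
#include <math.h>

/* Energy (joules) spent on one frame, given the frame interval and how
 * long the sensor sits in its active (readout) period. */
double energy_per_frame_j(double p_active_w, double p_idle_w,
                          double t_active_s, double t_interval_s)
{
    double t_idle_s = t_interval_s - t_active_s; /* frame spacing */
    return p_active_w * t_active_s + p_idle_w * t_idle_s;
}
```

With an assumed 320 mW active readout lasting 50 ms and 250 mW idle power at 3 fps, the idle period contributes most of the frame's energy, which is why idle power ends up limiting energy-proportionality in the slides that follow.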
Slide 19
[Figure: measured power traces at low vs. high resolution]
Slide 20: Idle power limits energy-proportionality
[Figure: measured average power vs. resolution: 320 mW at (1 MP, 3 fps) vs. 280 mW at (0.1 MP, 3 fps)]
Slide 21
Driver-based power optimization: (1) Aggressive power management
[Figure: power (mW) vs. time (s) traces; after optimization the sensor stays at high power only for the exposure/readout window]
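The effect of aggressive power management can be sketched as an average-power calculation: instead of leaving the sensor in its streaming-idle state between frames, the driver drops it into a deeper low-power state. The power levels below are illustrative assumptions, not measured figures.

```c
/* Average sensor power over a frame interval, where the driver chooses
 * which low-power state the sensor occupies between frames. The inputs
 * are illustrative assumptions, not measurements. */
#include <assert.h>
#include <math.h>

double average_power_w(double p_active_w, double p_between_w,
                       double t_active_s, double t_interval_s)
{
    return (p_active_w * t_active_s
            + p_between_w * (t_interval_s - t_active_s)) / t_interval_s;
}
```

For example, with an assumed 320 mW active window of 50 ms at 3 fps, staying in a 250 mW streaming-idle state between frames averages roughly 260 mW, while dropping to an assumed 10 mW standby state averages roughly 57 mW.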
Slide 22
Driver-based power optimization:
(2) Pixel clock frequency optimization
Faster clock
Slower clock Optimal
frequency Pixel count
Exposure
Time
Energy characterization and optimization of
image sensing toward continuous mobile vision [MobiSys ’13]
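The "optimal frequency" intuition can be reproduced with a toy model. Assume the active period is exposure time plus readout time (pixel count divided by clock rate), and that active power above idle rises linearly with clock frequency; both coefficients below are illustrative, not taken from the paper. A slower clock stretches the readout, while a faster clock burns extra power during the fixed exposure window, so a sweep finds a minimum in between.

```c
#include <assert.h>
#include <math.h>

/* Toy per-frame energy model for the active period:
 *   T_active(f) = T_exposure + N_pixels / f
 *   P_extra(f)  = a + b * f   (power above idle, linear in clock rate)
 * Energy above the idle baseline is P_extra(f) * T_active(f). */
static double active_energy_j(double f_hz, double n_pixels,
                              double t_exposure_s,
                              double a_w, double b_w_per_hz)
{
    return (a_w + b_w_per_hz * f_hz) * (t_exposure_s + n_pixels / f_hz);
}

/* Sweep clock frequencies and return the one minimizing frame energy.
 * Analytically the minimum sits at sqrt(a * N / (b * T_exposure)),
 * i.e., it shifts with both pixel count and exposure time. */
double optimal_clock_hz(double n_pixels, double t_exposure_s,
                        double a_w, double b_w_per_hz)
{
    double best_f = 10e6, best_e = INFINITY;
    for (double f = 10e6; f <= 500e6; f += 1e6) {
        double e = active_energy_j(f, n_pixels, t_exposure_s,
                                   a_w, b_w_per_hz);
        if (e < best_e) { best_e = e; best_f = f; }
    }
    return best_f;
}
```

With assumed coefficients a = 100 mW and b = 1 mW per MHz, one million pixels, and a 10 ms exposure, the sweep lands at 100 MHz, matching the closed-form optimum; quadrupling the pixel count doubles the optimal clock, mirroring the slide's point that the optimum depends on pixel count and exposure time.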
Slide 23: Energy-proportionality before driver-based management
[Figure: power (mW) vs. frame rate (FPS) and pixel count (x10^5)]
Slide 24: Energy-proportionality after driver-based management
[Figure: power (mW) vs. frame rate (FPS) and pixel count (x10^5), now scaling with both]
Slide 25
Energy-proportionality after driver-based management
[Figure: at (0.1 MP, 3 FPS), power drops from 280 mW before management to 20 mW after]
Slide 26
Vision is power hungry, especially at high resolutions
[Figure: 70% - energy consumption of augmented reality marker detection on a Jetson Tegra X2 board]
Slide 27
However, resolution reconfiguration incurs a latency penalty
[Video: note the visible gap during reconfiguration on Linux V4L2 (Video4Linux2) + Jetson TX2, Android 8.0 + Nexus 5X, and iOS 12 + iPhone X]
Banner: An Image Sensor Reconfiguration Framework for Seamless Resolution-based Tradeoffs [MobiSys ’19]
Slide 28
Where does resolution reconfiguration latency come from?
Hardware? Operating system?
Slide 29
Hardware is not the culprit
• Hardware register values take effect by the next frame [From AR0330 datasheet]
Slide 30: In the operating system, resolution reconfiguration follows a sequential procedure inside the media framework, which requires the application to invoke several expensive system calls.
[Flowchart: Open device → Set sensor format → Request/map buffers → Start streaming → Process image → (on a resolution request) Stop streaming → Release buffers → back to Set sensor format]
1. Application sends a resolution request
2. ioctl(VIDIOC_STREAMOFF) turns off the current streams
3. munmap() and free the buffers
4. ioctl(VIDIOC_S_FMT) sets the sensor output format
5. ioctl(VIDIOC_REQBUFS) and mmap() a new set of buffers
6. ioctl(VIDIOC_STREAMON) starts the new streams
Slide 31
Aspirations for a reconfigurable media framework:
1. Preserve the pipeline of existing frames
2. Change resolution immediately, effective in the next capture
3. Minimize format synchronizations across the video system stack
Slide 32
We introduce the Banner media framework
With Banner, applications can reconfigure sensor resolution through only one ioctl() call.
[Diagram: legacy stack: applications → media framework → legacy camera host driver → image sensor → video buffer; Banner stack: applications → Banner framework → Banner camera host driver → image sensor → video buffer]
Slide 33: Banner avoids the repeated reconfiguration procedure
[Flowchart, reconfiguration in legacy V4L2: Open device → Set sensor format → Request/map buffers → Start streaming → Process image → (on a resolution request) Stop streaming → Release buffers → back to Set sensor format]
[Flowchart, reconfiguration in Banner: Open device → Set sensor format → Request/map buffers → Start streaming → Process image → (on a resolution request) Set sensor format only, then keep processing images]
Key techniques: parallel reconfiguration and format-oblivious memory management
Slide 34
Parallel reconfiguration
T_budget = T_interval - T_capture
[Diagram: a Banner reconfiguration thread executes sensor reconfiguration concurrently with capture]
• Enact thread-level concurrency
• Time the request to happen within the reconfiguration timing budget
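The budget equation can be made concrete: Banner's reconfiguration thread must finish its sensor register writes inside the slack between consecutive captures. A minimal sketch, with all timings supplied by the caller as assumptions:

```c
/* Timing-budget check for parallel reconfiguration:
 *   T_budget = T_interval - T_capture
 * A reconfiguration fits when it completes within the slack left in
 * the frame interval, so the next capture already uses the new
 * resolution with no dropped frame. */
#include <assert.h>
#include <math.h>
#include <stdbool.h>

double reconfig_budget_s(double t_interval_s, double t_capture_s)
{
    return t_interval_s - t_capture_s;
}

bool reconfig_fits(double t_reconfig_s, double t_interval_s,
                   double t_capture_s)
{
    return t_reconfig_s <= reconfig_budget_s(t_interval_s, t_capture_s);
}
```

At 30 fps with an assumed 10 ms capture, the budget is about 23 ms, so a reconfiguration needing a few milliseconds of register writes fits comfortably; one needing 30 ms would not.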
Slide 35
Format-oblivious memory management
• One-time buffer allocation
• Format-oblivious frame delivery
[Diagram: format-aware legacy V4L2 reallocates 1080p buffers as 480p buffers on a resolution request; format-oblivious Banner delivers both 1080p and 480p frames into the same buffers]
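Format-oblivious memory management can be sketched as a buffer pool sized once for the largest supported format; smaller frames are then delivered into the same buffers, so a resolution change never triggers reallocation. The sizes below are illustrative, not Banner's actual layout.

```c
/* A video buffer allocated once at the maximum supported frame size.
 * Any frame no larger than the capacity can be delivered into it,
 * regardless of format, so resolution changes need no reallocation
 * or remapping. Illustrative sketch, not Banner's implementation. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
    unsigned char *data;
    size_t capacity;   /* bytes, sized for the largest format */
    size_t used;       /* bytes of the frame currently held   */
} frame_buffer;

bool buffer_init(frame_buffer *b, size_t max_frame_bytes)
{
    b->data = malloc(max_frame_bytes);   /* one-time allocation */
    b->capacity = max_frame_bytes;
    b->used = 0;
    return b->data != NULL;
}

/* Deliver a frame of any size up to the capacity; returns false
 * (instead of reallocating) if the frame would not fit. */
bool buffer_deliver(frame_buffer *b, size_t frame_bytes)
{
    if (frame_bytes > b->capacity)
        return false;
    b->used = frame_bytes;
    return true;
}
```

A pool sized for 1080p frames accepts 480p frames after a downswitch without any remapping, which mirrors how format-oblivious delivery lets Banner skip the legacy release/request/mmap cycle.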
Slide 36: Banner avoids the repeated reconfiguration procedure
[Recap of the Slide 33 comparison: legacy V4L2 stops streaming, releases buffers, and rebuilds the pipeline on every resolution request, while Banner only sets the sensor format, using parallel reconfiguration and format-oblivious memory management]
Slide 37: Banner realizes seamless resolution reconfiguration
[Video: side-by-side capture streams, Banner vs. legacy V4L2]
Slide 38: Summary
• Driver-based management for energy-proportional image capture [surface plot: power (mW) vs. frame rate (FPS) and pixel count (x10^5)]
• Banner media framework for seamless resolution reconfiguration [flowchart: on a resolution request, only set the sensor format, using parallel reconfiguration and format-oblivious memory management]
Slide 39
Ongoing efforts in multi-resolution visual computing systems
Reconfigurable resolution framework support
Fine-grained reconfigurability
• Explore multi-resolution not just across frames, but within frames
• Explore variable temporal resolution
Contextual use cases
• Augmented Reality/Mixed Reality: foveated rendering, foveated sensing
• Adaptive neural networks to work on variable-size image streams
Software-defined imaging systems
• Variable bit depth
• Region of interest
• Sensor control loops
• Cloud integration (5G)
http://meteor.ame.asu.edu
Copyright Notice
The presentation in this publication was presented as a tinyML® Talks webcast. The content reflects the opinion of the author(s) and their respective companies. The inclusion of presentations in this publication does not constitute an endorsement by the tinyML Foundation or the sponsors.
This publication claims no copyright protection. However, each presentation is the work of the authors and their respective companies and may contain copyrighted material. As such, any use should properly acknowledge the appropriate source. Any questions regarding the use of any materials presented should be directed to the author(s) or their companies.
tinyML is a registered trademark of the tinyML Foundation.
www.tinyML.org