TRANSCRIPT
Copyright © 2014 CogniVue Corporation 1
Simon Morris /Tom Wilson CogniVue
May 29, 2014
Evolving Algorithmic Requirements for
Recognition and Classification in
Augmented Reality
Outline
• The challenge of “always-on” vision for augmented reality in mobile devices: power for performance
• Marker-based augmented reality algorithm flow
• Computational loading examples for marker-based AR
• “Marker-less” algorithm flow
• Computational loading estimates for Natural Feature Tracking (NFT)
• Architectural implications of always-on feature detection, extraction, and tracking on mobile apps processors
Challenge of Always-On Vision
• IoT devices and wearables are full of various sensors, but the most valuable sensor is missing: the ability to see
• Why? Heat dissipation, battery life, and cost effectiveness
Challenge of Always-On Vision
• Why mobile? Everything but the kitchen sink
• There is a gap between always-on vision and existing processing technology:
1. Power
• 30 to at most 60 minutes on a 2000 mAh battery
• Eyewear such as Google Glass offers <1 hr of video and no vision processing
2. Performance & cost
• Expensive use of apps-processor resources: needs a quad-core Cortex CPU, GPU, DSP, and video codec
• Not efficient: vision operations are divided among cores
• Best case today is stereo VGA; stereo 1080p at 60 fps needs ~30x that throughput
High Level AR Processing Flow
Taken from Fernandez, Orduna, Morillo (2011)
Acquisition → Detection → Rendering (~50 ms budget end-to-end; <10 ms for detection)

Latency (ms): Acquisition / Detection / Rendering
iPhone4: 17.66 / 23.3 / —
Marker Design
• Markers are simply a special case of a feature
• Well defined to assist rapid pose calculation
• High contrast enables easier detection
• Most markers are simple black-and-white squares
• Four known points are important to allow for subsequent distortion correction and marker decoding
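The four known corner points are what make the rest tractable: they pin down the planar homography that undoes the marker’s perspective distortion. A minimal numpy sketch of the Direct Linear Transform over four correspondences, with hypothetical corner coordinates (illustrative only, not any particular toolkit’s implementation):

```python
import numpy as np

def homography_from_corners(src, dst):
    """Estimate the 3x3 homography mapping the four detected marker
    corners (src) onto the canonical marker square (dst) via the
    Direct Linear Transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of A (smallest singular value).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def warp_point(H, pt):
    """Apply a homography to a single 2-D point."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Hypothetical corners of a skewed marker as seen by the camera,
# mapped back to a 100x100 canonical (undistorted) square.
detected = [(32, 41), (210, 55), (225, 230), (20, 215)]
canonical = [(0, 0), (100, 0), (100, 100), (0, 100)]
H = homography_from_corners(detected, canonical)
```

With the marker warped back to its canonical square, the interior pattern can be decoded; the same four correspondences also feed camera pose estimation.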
Marker-Based Camera Pose
• With marker-based AR, only low-processor-demand CV functions are needed for camera pose estimation
• Toolkits are available to build these applications (e.g. for iOS see http://www.packtpub.com/article/marker-based-augmented-reality-on-iPhone-or-iPad)

Pipeline: Grayscale → Binarization → Contours → Candidates → Distortion Correction
Images from En-Co Software Ltd
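The binarization step in the pipeline above is cheap precisely because markers are high contrast. A minimal sketch of a global Otsu threshold, one common choice for that step (function name and layout are illustrative):

```python
import numpy as np

def otsu_binarize(gray):
    """Global Otsu threshold: pick the gray level that maximizes
    between-class variance, then binarize. High-contrast black/white
    markers separate cleanly under a global threshold like this."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = gray.size
    sum_all = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    w_b = sum_b = 0.0
    for t in range(256):
        w_b += hist[t]                  # background pixel count
        if w_b == 0 or w_b == total:
            continue
        sum_b += t * hist[t]
        w_f = total - w_b               # foreground pixel count
        m_b = sum_b / w_b               # background mean
        m_f = (sum_all - sum_b) / w_f   # foreground mean
        var_between = w_b * w_f * (m_b - m_f) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return (gray > best_t).astype(np.uint8)
```

In practice an adaptive (local) threshold is often preferred under uneven lighting; the global version keeps the sketch short.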
Marker-Based Processor Loading Examples
• Detection & tracking on device must be real-time: <10 ms
• Example computational loading (Fernandez, Orduna, Morillo 2011)
• Higher-performance chips are needed to achieve real-time performance while also keeping power significantly lower
• CogniVue G2-APEX at 600 MHz can process in <2 ms at 1 MP (real time) for mWatts
Why Markerless?
• Marker-based AR is unsuitable in many scenarios (e.g. outdoor tracking)
• Markerless tracking depends on natural features rather than fiducial markers
• The AR system needs to use some other tracking method:
• Sensors for tracking (e.g. GPS); or
• A visual tracking method to estimate the camera’s pose (camera-based tracking, optical tracking, or natural feature tracking); or
• Hybrid, e.g. GPS and a MEMS gyroscope for position plus visual tracking for orientation
• Tracking and registration become more complex with Natural Feature Tracking (NFT)
• Markerless AR apps with NFT will be widely adopted
How is NFT Different?
• Different approaches to pose estimation in a markerless application
• Requires feature detection, extraction, and matching

Pipeline: Grayscale → Keypoint Detection → Feature Extraction → Feature Matching → Pose Estimation
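For binary descriptors (e.g. ORB-style bit strings), the feature-matching stage of the pipeline reduces to Hamming distance between packed bit vectors. A brute-force numpy sketch (illustrative only; the cutoff value is an assumption, and real matchers add ratio tests and faster indexing):

```python
import numpy as np

def hamming_match(desc_a, desc_b, max_dist=40):
    """Brute-force matching of binary descriptors (packed as uint8
    rows) by Hamming distance, with a distance cutoff.
    Returns (index_in_a, index_in_b, distance) tuples."""
    # XOR then popcount gives the Hamming distance between bit strings;
    # broadcasting builds the full len(a) x len(b) distance matrix.
    dist = np.unpackbits(
        desc_a[:, None, :] ^ desc_b[None, :, :], axis=2).sum(axis=2)
    matches = []
    for i, row in enumerate(dist):
        j = int(np.argmin(row))         # nearest neighbour in desc_b
        if row[j] <= max_dist:
            matches.append((i, j, int(row[j])))
    return matches
```

The surviving matches then feed pose estimation, typically with an outlier-rejection loop such as RANSAC.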
Feature Detection and Extraction
• Interest-point detectors:
• Before tracking, features or key points must be detected
• Examples: Harris corner detection, GFTT, FAST
• FAST has been preferred for mobile apps because it requires less processor performance, but it is not necessarily best for accuracy and precision
• Feature descriptors for matching: SIFT, SURF, ORB, HIP
• Mobile AR also involves pose estimation (e.g. with RANSAC)
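FAST’s low processor demand comes from its simple segment test on a 16-pixel Bresenham circle. A straightforward, unoptimized sketch of that test (real implementations add an early-exit pre-test and non-maximum suppression):

```python
import numpy as np

# Offsets of the 16-pixel Bresenham circle (radius 3) used by FAST.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2),
          (1, 3), (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1),
          (-2, -2), (-1, -3)]

def is_fast_corner(img, r, c, t=20, n=12):
    """FAST segment test: pixel (r, c) is a corner if at least `n`
    contiguous pixels on the circle are all brighter than p + t or
    all darker than p - t."""
    p = int(img[r, c])
    ring = [int(img[r + dy, c + dx]) for dx, dy in CIRCLE]
    brighter = [v > p + t for v in ring]
    darker = [v < p - t for v in ring]
    for flags in (brighter, darker):
        # Duplicate the ring so a contiguous run can wrap around.
        run, best = 0, 0
        for f in flags + flags:
            run = run + 1 if f else 0
            best = max(best, run)
        if best >= n:
            return True
    return False
```

Each candidate costs at most a few dozen comparisons with no floating point, which is why FAST suits mobile power budgets better than Harris or GFTT.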
Mobile Performance: SIFT Detection/Extraction
• SIFT is a good “acid test” for detection/extraction performance
• A CPU alone experiences long processing latency with SIFT
• An example of GPU acceleration shows about a 5x to 10x improvement
• The CogniVue G2-APEX core shows ~50x improvement and is >100x in terms of performance per power
• G3-APEX will provide a further 4x-8x

Device | SIFT (ms)
CPU*
iPad Air | 877
iPad 4 | 1379
iPad 3 | 3518
iPad Mini 1 | 3586
iPad 2 | 3515
iPhone 4s | 4320
iPhone 5 | 1474
iPhone 5S | 1028
GPU + CPU**
Snapdragon S4 | 404
Nexus 7 | 472
Galaxy Note II | 528
Tegra 250 | 508
CogniVue ICP***
G2-APEX ICP | 8.6
G3-APEX ICP | ~2

*CPU figures adapted from Hudelist, Corarzan, Schoeffman, 2014 (corrected to VGA)
**GPU + CPU figures adapted from Rister, Wang, Wu and Cavallaro, 2013 (corrected to VGA)
***Estimate with APEX-1284 configuration @ 600 MHz
Impact of Optimization: HOG Example
• The HOG feature descriptor is very similar to SIFT (HOG was inspired by SIFT)
• Similar computational complexity to SIFT as a feature descriptor
• UncannyVision has shown an 8x improvement in HOG performance after optimizing OpenCV code
• G2-APEX has a significant performance-per-power advantage even over a highly optimized implementation

Core and Implementation | HOG + SVM, VGA (ms)
Cortex A9 1.2 GHz, OpenCV | 2320
Cortex A9 1.2 GHz, optimized | 340
Cortex A15 1.2 GHz, OpenCV | 1265
Cortex A15 1.2 GHz, optimized | 135
G2-APEX ICP 600 MHz | 10.5
G3-APEX ICP 600 MHz | ~2.5
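The core of HOG’s per-frame cost is the gradient-orientation histogram computed for every cell. A minimal numpy sketch of one cell’s histogram (illustrative; real HOG adds bilinear bin interpolation and block normalization, which is where the cited optimizations pay off):

```python
import numpy as np

def hog_cell_histogram(cell, nbins=9):
    """Gradient-orientation histogram for one HOG cell (e.g. 8x8
    pixels): finite-difference gradients, unsigned orientation
    binned over 0-180 degrees, weighted by gradient magnitude."""
    cell = cell.astype(float)
    gx = np.zeros_like(cell)
    gy = np.zeros_like(cell)
    # Central differences; borders are left at zero gradient.
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist = np.zeros(nbins)
    bin_idx = (ang / (180.0 / nbins)).astype(int) % nbins
    for b, m in zip(bin_idx.ravel(), mag.ravel()):
        hist[b] += m
    return hist
```

A VGA frame holds thousands of such cells per scale, so vectorizing or parallelizing this inner loop is exactly the kind of work an ICP-style array processor absorbs.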
Architectural Implications
• Real-time performance is ~50 ms from image acquisition to display (glasses); therefore feature detection, tracking, and matching must complete in <10 ms, and in <5 ms for low-power operation
• NFT processing for “always-on” mobile AR needs a >100x improvement in performance per power at >1 MP
• Wearable AR applications need this performance for power-efficient, always-on AR and vision applications
• G2-APEX ICP technology offers the necessary acceleration, and G3-APEX ICP cores bring an additional 4x-8x