© ANAFOCUS
2007
Gaussian Pyramid: Comparative Analysis of
Hardware Architectures
F. D. V. R. Oliveira¹, J. G. R. C. Gomes¹, J. Fernández-Berni², R. Carmona-Galán², R. del Río², Á. Rodríguez-Vázquez²
¹Universidade Federal do Rio de Janeiro, Brazil
²Instituto de Microelectrónica de Sevilla (IMSE-CNM), CSIC-Universidad de Sevilla, Spain
Workshop on Architecture of Smart Cameras, Córdoba, Spain
Embedded Vision Processing Architecture
CONVENTIONAL ARCHITECTURE: GPUs, DSPs, FPGAs, …
• Hardware parallelization takes place only after the image data have been serialized
• The imager already requires a physical 2-D array of elementary cells, each topographically assigned to a pixel value; this array can be exploited for early parallelization and distributed memory
Embedded Vision Processing Architecture
PROPOSED ARCHITECTURE
• Drastic reduction of memory accesses during low-level processing stages, where pixel-wise operations are common
• Pixel circuitry to accelerate vision algorithms. This circuitry can be implemented in the analog domain for the sake of power and area efficiency
Long-Term Research
Major drawbacks
Reduced fill factor
Large pixel pitch
→ Limited sensitivity
→ Small image size
→ Spatial aliasing
Major achievements
Concept demonstration
Programmable embedded functionalities
Image-to-Decision chain at >1,000 fps using 60 nW per pixel (industrial chip)
Spatial Gaussian filtering at 20 nJ per filter
Content-aware HDR acquisition with >145 dB intra-frame DR
Major challenges
Implementation of in-pixel embedded functionalities at minimum area cost
Increase hardware-software integration
Drawbacks and Major Challenges
CONVENTIONAL PIXEL
[Figure: pixel of pitch D, containing the photo-sensitive area and the amplification & readout circuitry]
Drawbacks and Major Challenges
MULTI-FUNCTIONAL PIXEL
[Figure: conventional pixel of pitch D (photo-sensitive area, amplification & readout circuitry) next to two multi-functional pixels that add processing circuitry & memory]
• Keep the pixel pitch D → reduce the sensitive area
• Keep the sensitive area → reduce the image resolution (pixel pitch grows to D + Δ)
Drawbacks and Major Challenges
How to minimize the impact on image quality while maximally exploiting the advantages of focal-plane processing
Fundamental Processing Primitive
[J. Campbell and V. Kazantsev, "Using an Embedded Vision Processor to Build an Efficient Object Recognition System," White Paper, Synopsys, 2015]
CMOS IMPLEMENTATION GAUSSIAN FILTERING
- Basic operation in many vision pipelines
[Figure: original full-resolution image, original half-resolution image and pre-distorted half-resolution image; original kernel and reduced kernel; binomial kernel output. Examples: Sobel operators, binomial kernel]
Fundamental Processing Primitive
CONVENTIONAL VS. FOCAL-PLANE REALIZATION GAUSSIAN FILTERING
Time Analysis
[F. D. V. R. Oliveira et al., "Gaussian Pyramid: Comparative Analysis of Hardware Architectures," IEEE Transactions on Circuits and Systems I, in press, 2017]
Focal-plane processing time
N_[?]: size of the Gaussian kernel (subscript illegible in the source)
N_Lev: number of pyramid levels
T_CR: time required to perform one charge redistribution
T_ADC: time required to perform the analog-to-digital conversion of one pixel
N_ADC: number of ADCs
M × N: image resolution
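The role of the ADC count in the parameters above can be illustrated with a small calculation. This sketch is not the time equation of the cited paper, which does not survive in this transcript. It rests on assumptions: every pixel of every pyramid level is converted once, each level halves both image dimensions, level 0 is the full-resolution image, and conversions are shared evenly among the ADCs. Function and argument names are hypothetical.

```python
import math


def pyramid_pixel_counts(m, n, n_lev):
    """Pixels per level, assuming each level halves both dimensions (rounded up)."""
    counts = []
    for _ in range(n_lev):
        counts.append(m * n)
        m, n = math.ceil(m / 2), math.ceil(n / 2)
    return counts


def adc_readout_time(m, n, n_lev, n_adc, t_adc):
    """Assumed model: each level needs ceil(pixels / n_adc) conversion slots,
    and each slot lasts t_adc."""
    slots = sum(math.ceil(c / n_adc) for c in pyramid_pixel_counts(m, n, n_lev))
    return slots * t_adc
```

Under these assumptions, a 640 × 480 image with three levels needs 403,200 conversion slots with a single ADC and 630 slots with 640 ADCs working in parallel.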
Digital implementation processing time
T_mem: time required to access a single memory position
N_busmem: number of parallel accesses to memory
T_[?]: time required to perform a single MAC operation (subscript illegible in the source)
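To see how the three digital parameters interact, the sketch below uses one plausible first-order accounting. It is an assumption made for illustration and not the equation from the cited paper: each filtered pixel costs kernel_size² memory reads and as many MAC operations, reads are served n_busmem at a time, reads and MACs do not overlap, and one filtering pass at the current resolution produces each new level. All names are hypothetical.

```python
import math


def digital_pyramid_time(m, n, n_lev, kernel_size, t_mem, n_busmem, t_mac):
    """Assumed first-order time model for a digital Gaussian pyramid."""
    total = 0.0
    for _ in range(n_lev - 1):  # one filtering pass per generated level
        taps = m * n * kernel_size ** 2  # reads and MACs for this pass
        total += math.ceil(taps / n_busmem) * t_mem + taps * t_mac
        m, n = math.ceil(m / 2), math.ceil(n / 2)
    return total
```

In this model, widening the memory bus only shrinks the access term; the MAC term is untouched, so the benefit saturates once the MAC operations dominate.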
Parameters of time analysis equations
Energy Analysis
Trickier: highly dependent on the architecture and technology parameters, and with no global parameter playing the role that the clock period plays in the time analysis
Standard circuit blocks: MAC unit, SRAM memory cell
Energy Analysis
Equations associated with every aspect of the hardware
Focal-plane energy analysis
E_PixCapture = C_FD · V_[?]² + C_Rst · V_[?]² + C_[?] · V_[?]²
E_[?] = (N_Lev − 1) · N_[?] · 2M · 2N · (2 · C_[?] · V_[?]²)
…
Digital implementation energy analysis
E_MACdynamic = α · N_[?] · C_[?] · V_[?]² · ([?]) / T_Clk
E_MACstatic = [?] · [?] · I_leak · [?]_Digital
…
([?] marks a symbol or subscript that is illegible in the source.)
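The focal-plane expressions above are built from terms of the form capacitance times squared voltage. The sketch below only shows that arithmetic. The capacitance and voltage values in the example are placeholders chosen for illustration, not figures from the cited paper, and the function names are hypothetical.

```python
def switching_energy(capacitance_f, voltage_v):
    """Energy of one C * V**2 term, in joules (farads times volts squared)."""
    return capacitance_f * voltage_v ** 2


def sum_of_cv2_terms(terms):
    """Sum C * V**2 over an iterable of (capacitance, voltage) pairs."""
    return sum(switching_energy(c, v) for c, v in terms)


# Placeholder values only: three capacitances of a few femtofarads at 3.3 V.
example_terms = [(5e-15, 3.3), (2e-15, 3.3), (10e-15, 3.3)]
example_energy_j = sum_of_cv2_terms(example_terms)
```

With femtofarad capacitances and voltages of a few volts, each term is on the order of 10⁻¹³ J.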
A/D conversion energy analysis
Energy Analysis
Parameters of energy analysis equations
Conclusions
• Hypothesis: early vision stages can be accelerated at the focal plane at low energy cost by adding extra per-pixel circuitry
• Comprehensive analysis of Gaussian pyramid generation with minimum impact on pixel area
• Major conclusion: the potential advantages of focal-plane processing are case-specific
• A/D conversion is the critical stage
• Regarding processing time, the focal-plane approach ideally requires one ADC per column to report significant advantages
• Regarding energy saving, the focal-plane approach renders the best results for SAR, cyclic or ΣΔ ADCs