an embedded architecture for feature detection using modified sift algorithm · 2016-10-10 ·...
TRANSCRIPT
http://www.iaeme.com/IJECET/index.asp 38 [email protected]
International Journal of Electronics and Communication Engineering and Technology (IJECET) Volume 7, Issue 5, Sep-Oct 2016, pp. 38–46, Article ID: IJECET_07_05_005
Available online at
http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=7&IType=5
Journal Impact Factor (2016): 8.2691 (Calculated by GISI) www.jifactor.com
ISSN Print: 0976-6464 and ISSN Online: 0976-6472
© IAEME Publication
AN EMBEDDED ARCHITECTURE FOR FEATURE
DETECTION USING MODIFIED SIFT ALGORITHM
Syama K Nair
Student, Department of ECE, Sree Buddha College of Engineering, Kerala, India.
Ragimol
Professor, Department of ECE, Sree Buddha College of Engineering, Kerala, India.
ABSTRACT
In video analytic and computer vision systems, image feature detection and matching is the
fundamental task in object recognition, image indexing, visual localization etc. These processes
establishes the correspondence between two images taken at different time. To most of the embedded
systems, the challenge is its large complexity in computation. Real-time performance is a critical
demand to most of these applications, which require the detection and matching of the visual features
in real time. The thesis work proposes a new FPGA-based embedded architecture for image feature
detection. In this the Scale Invariant Feature Transform (SIFT) feature detector is used to reduce the
utilization of FPGA resources. The correspondence point pairs obtained can be either used in an
embedded application or accessed via Ethernet for remotely computer vision applications. Here a
modified Gaussian filter is used for keypoint detection, in which the adder-multiplier sections are
fused by using a modified Booth recoder design. This design will increase the performance of the
whole system with considerable reduction in delay, power consumption and hardware complexities.
The detected feature keypoints from the image can be described using feature descriptor module. The
binary stream of bits are obtained from the description section and finally a feature matching process
is done to identify whether the two images taken at two different time were matched or not. The
software used for the work is Xilinx ISE Design Suite with Verilog coding.
Key words: Computer vision systems, Feature descriptor, Gaussian filter, Modified Booth Recoder,
SIFT detector.
Cite this Article: Syama K Nair and Ragimol, An Embedded Architecture for Feature Detection
Using Modified Sift Algorithm, International Journal of Electronics and Communication
Engineering and Technology, 7(5), 2016, pp. 38–46.
http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=7&IType=5
1. INTRODUCTION
Image processing is a field of signal processing, in which an image is taken as input and obtained the output
in the form of an image or as a set of parameters related to the characteristics of the image. The important
field of interest in video analytical and computer vision systems includes object recognition, pattern
identification, image stitching, panoramic vision, robot navigation etc. in which the image feature detection
and matching is a fundamental task. Feature detection is the first operation done in image processing as it is
An Embedded Architecture for Feature Detection Using Modified Sift Algorithm
http://www.iaeme.com/IJECET/index.asp 39 [email protected]
a part of a larger image processing algorithm. According to the applications, each image is processed by its
every pixel values to detect the feature points which are known to be the interest points or keypoints. Several
feature detection algorithms are already implemented to find the interest points from an image data. Some
of the efficient detection algorithms are corner edge detector, Harris detector, Laplacian blob detector, SIFT,
SURF etc.
The commonly used feature extraction algorithms after detectors are SIFT, SURF, BRIEF, BRISK,
FREAK etc. Among these SIFT (Scale Invariant Feature Transform) is the most efficient feature detection
method due to its invariant properties. It has several advantages over the other detection methods based on
the invariance in rotation, scaling, illumination changes etc. The thesis work aims to implement an efficient
and fast SIFT feature detection algorithm which are used for various image processing applications. Here
the filtering section used in the detector is optimized by the fusion of add-multiply section into modified
booth recoder method in order to achieve better reduction in power consumption, delay, area etc. It combines
the stability and repeatability of SIFT detection by optimizing the design to meet the real time requirements.
The proposed design is known to be one of the fastest feature detection module with less resource utilization
on FPGA.
The whole work is organized into different sections. The section 2 presents the related works
corresponding to the proposed system. It starts with the early works and the review reports related to the
previous FPGA implementations on the image feature detection algorithms. In section 3 the proposed SIFT
algorithm for feature detection process is discussed with the modified Gaussian filter used. The section 4
deals with the various technical details of the design such as its implementation on FPGA and also testing is
carried out in this section. In the next section the conclusion of the entire work is given.
2. RELATED WORKS
Lowe proposed the scale-invariant feature transform (SIFT) algorithm [1] which has been considered as one
of the most efficient robust approaches in feature detection. SIFT was an interesting feature detection
operator for better image matching purposes. Speeded up robust features (SURF) detector was discussed by
Herbert Bay [2] as a faster integer approximation detector inspired by SIFT algorithm. It was faster for
computation and better invariant to scale, rotation, and illumination changes of the image.
Zhong et al. [3] presented a design for SIFT feature detection and description based on FPGA+DSP
architecture for 320 × 256 images. SIFT has shown a great success in various computer vision applications,
even though it is more complex in hardware implementation. Bonato et al. [4] presented an FPGA-based
architecture for SIFT feature detection. In this design a parallel hardware architecture for image feature
detection based on the Scale Invariant Feature Transform algorithm was proposed. This work presents an
optimized robotic control system on-a-chip for an embedded platform.
A design of SIFT feature detection for HDTV was proposed in [5] by Mizuno et al. based on ASIC. Here
the images were stored in the external SDRAM and architecture detects the features in each regions of
interests (ROI) divided earlier. The area utilization of ROI is less compared to the image and this will
effectively reduce the chip memory utilization during implementation. The major drawback is that the input
image should be stored in an external memory.
Several variations are available for the SIFT detection in algorithmic level. One important SIFT variation
is the size shortening of the feature detected by using principal component analysis (PCA) as proposed in
[6]. Instead of computing the image gradient orientation, PCA-SIFT computes the gradient magnitude first.
They are faster than original SIFT descriptor, but not stable for matching purposes. In [7] proposed a single
FPGA-based design to detect and match visual feature in real time for real-life applications. Here SIFT is
used for image feature detection by considering the stability and repeatability of the image matching system.
In order to achieve real-time performance, replace the SIFT descriptor by the BRIEF descriptor. The
proposed work is based on this design with modified filter section in SIFT detector.
Kostas et al. [8] proposed an optimized booth recoder for adder-multiplier designs. It is a structured
and efficient fused add-multiply (FAM) design with considerable reduction in delay, power, and hardware
Syama K Nair and Ragimol
http://www.iaeme.com/IJECET/index.asp 40 [email protected]
complexities. The proposed FAM is the most efficient and optimized approach used to accelerate the DSP
arithmetic operations. The technique used here is the Sum to Modified Booth (S-MB) recoding scheme
which provides the reduction in resource utilization during hardware implementations.
A new embedded architecture for image feature detection based on SIFT algorithm is proposed in this
design. Here the optimization of Gaussian filtering section is done by using a modified booth recoder section
in the add-multiply operation used in SIFT detection module. By using this modified Gaussian filter there is
a considerable reduction in delay, power and area utilization. Here the optimized recoder with conventional
signed half adders and full adders are used. Thus it reduces the hardware complexities during real time
implementation.
3. PROPOSED SYSTEM
The proposed work focuses on the pure hardware implementation of Scale Invariant Feature transform
(SIFT) algorithm for image feature detection. The process diagram of proposed system is shown in the Figure
1.The input data to the system is an image with appropriate resolution and the output obtained are several
key-points detected out from the key-point detection module.
Figure 1 Process Diagram of the Proposed System
The design is based on two stages of processing. It includes scale space extrema detection and key-point
localization. Here we are considering the reduction in complexity of software implementation by using
Xilinx system generator platform with Verilog HDL coding. The input image is driven from a digital camera
for real time implementation and the pixel values are recorded for the key point detection purpose. This is
an optimized pure software implementation of feature detector using the Xilinx ISIM simulator with Verilog
coding. Here the modification is done in the Gaussian filter section by reducing the resource utilization and
thus produce an optimized design for feature detection in SIFT algorithm.
In conventional SIFT detectors used in [7], a large amount of resources are utilized for the single
Gaussian filter section itself due to the bulk amount of adders and multipliers in this filtering process. This
will increases the computational and hardware complexities for the whole feature detection system. This will
be reduced by the replacement of conventional Gaussian filter with the modified filter section as proposed
in this work. Here the adder-multiplier sections in the Gaussian column filter and row filters are replaced by
a fused adder-multiplier (FAM) module with modified booth encoder. In order to avoid the complexity of
adder-multiplier units, here the sum of the two numbers are recoded earlier by using the Sum-to-Modified
Booth (S-MB) recoding scheme [8]. In effect this will reduce the partial products obtained during
multiplication process by half and also reduces the critical path delay and area of the entire Gaussian filter
section which in turn reduces the resource utilization of the entire SIFT detection module during FPGA
implementation.
After filtering the SIFT detection module is run by the difference of Gaussian images as source images.
It is based on the concept of stable keypoint detection by the identification of scale space extrema values.
These values are considered as stable keypoints of the source images applied to the detection module. For
feature detection process, SIFT detector is selected due to its several advantages such as stability, accuracy
and scale and rotation invariance. The detected keypoints are the local maxima or minima of the difference
of Gaussian pyramids constructed. SIFT is considered to be the most robust feature detection algorithm due
to the low contrast rejection and edge response rejection procedures carried out in the detection module.
An Embedded Architecture for Feature Detection Using Modified Sift Algorithm
http://www.iaeme.com/IJECET/index.asp 41 [email protected]
4. IMPLEMENTATION AND TESTING
In video analytical applications, detecting the feature points with less computational complexity is a
fundamental task. Among the detection algorithms, Scale Invariant Feature Transform (SIFT) is an efficient
one with good results. In SIFT the feature detection includes difference of Gaussian pyramid construction
and stable keypoint detection processes.
4.1. Difference of Gaussian Pyramid Construction
The main aim of the SIFT feature detector is to determine the variations in the local features of an image
with sufficient visual differences. They are mainly focused as the candidate for keypoints. For finding these
candidates we have to construct a Difference of Gaussian (DoG) pyramid. In this construction first a
Gaussian filtered image is obtained by convolving the input image I(x, y) with the Gaussian kernel K(x, y;
σ) represented in equation 2 where σ refers to the scale of the Gaussian kernel. Here the 2-D convolution is
carried out. The Gaussian filtered image is given as in equation 1.
G(x, y; σ) = conv2 (I(x, y), K(x, y; σ)) (1)
whereconv2 (,) refers to the two dimensional convolution.
K(x, y; σ) = (1/2πσ2) e-(x2+y2)/2σ2 (2)
The Difference of Gaussian D(x, y; σ) between two Gaussian filtered images are given by equation 3.
D(x, y; σ) = G(x, y; kσ) - G(x, y; σ) (3)
where k denotes the constant multiplicative factor.
Figure 2 System Architecture of SIFT feature detector
Syama K Nair and Ragimol
http://www.iaeme.com/IJECET/index.asp 42 [email protected]
The detection module provides a constant throughput defined by the number of octaves and scales in the
DoG section. In this design we are using 12 Gaussian filters for the two image octaves. The system
architecture is shown in Figure 2. Thus in the design of SIFT detector proposed by J. Wang in [7], utilizes
336 adder units and 192 multiplier sections. This computational complexity makes this design difficult for
the fast feature detection operation. So an idea of a new hardware design is proposed for the filter section by
combining the adders and multipliers used in [7]. Here the sequential adder-multiplier units are combined as
a Fused Adder-Multiplier (FAM) module which causes a reduction in the critical path delay of the entire
filter section.
4.2. Fused Adder-Multiplier Module
Here a new architecture with reduced adders and multipliers are proposed for Gaussian filter. The add-
multiply operations in the column filter and row filter sections are replaced by fused adder multiplier section
as in [8]. Here a modified booth recoder is used instead of adder-multiplier units. An efficient sum- to MB
(S-MB) recoding technique is used for recoding the sum of two consecutive bits of the input data (A & B)
that are to be added first. The S-MB recoding technique uses the Modified Booth Encoding scheme for
finding the sum. These sum are fed to the partial product generator section along with the multiplicand used
(X). The architecture of Fused Add-multiply (FAM) section is given in Figure 3.
Figure 3 Architecture of Fused Add-Multiply module
It is the most efficient and fast adder multiplier unit used in the Gaussian filter in the SIFT detector
section. Thus a Gaussian filter with fused adder multiplier unit is used as shown in Figure 4, such that the
adders and multipliers that increase the computational complexities in the previous designs are eliminated
in this design.
The filter section input is given from an image buffer module and it is first horizontally filtered and then
in vertical direction. Thus the row and column of the image gets smoothened and final section used ripple
carry adder units and divider section which divides the output from column filter and row filter by the SUM
= 1024 which is equal to the summation of gain of all the filters used. In each filter there are fourteen gain
values ‘k’ are used. Then the SUM is expressed as; SUM = Σ k = 210 = 1024.
An Embedded Architecture for Feature Detection Using Modified Sift Algorithm
http://www.iaeme.com/IJECET/index.asp 43 [email protected]
On implementing this design in the proposed system the filtered output obtained are similar compared
with the design implementation of Gaussian filter without using FAM. But the advantage is that the
performance is increased in this design with considerable reduction in power, critical path delay and area of
utilization. Thus it makes the advantage of filtering in a fast and more reliable method which is used further
in the SIFT detection module. From each octave five DoG outputs are obtained and they are fed to the
keypoint detection module to find out the stable keypoints.
Figure 4 Modified Gaussian filter
4.3. Stable Keypoint Detection Module
The feature detection module is a fully parallelized section with four separate modules as shown in Figure
5. They are window generator, extremum detection, edge response rejection and the low contrast rejection
modules as explained in [7].
Syama K Nair and Ragimol
http://www.iaeme.com/IJECET/index.asp 44 [email protected]
Figure 5 Overall architecture of feature detection module
Here the window generator module is to generate 3×3 pixel windows from the 5 X 5 DoG outputs
obtained. The extremum detection module is used for detecting the local maxima or the extreme value from
the windowed output by comparing each pixel value with its neighbors. The edge response rejection module
is used to eliminate edge features of the image which makes its key-points unstable. This was done by some
edge threshold value set as r = 10 in this design. The low contrast rejection module is also used to eliminate
or reject the less contrast features to make the key-point detection more accurate. Finally all these modules
are combined to implement the stable key-point detector section using SIFT. Due to its parallelized design
we can detect a large number of features in a defined clock frequency. By all these operations stable
keypoints are obtained as pixel values which are used next for the feature description section. After the
feature description the image descriptors obtained are used for feature matching purposes in image
processing applications.
The overall architecture of the stable key-point detection module is shown in Fig 6. This architecture is
designed and implemented in Xilinx ISim simulator using Verilog coding for obtaining the corresponding
keypoints from each difference of Gaussian output obtained. From each octave five DoG outputs are
obtained. Thus in this design around ten DoG output values are used for the stable feature detection process.
The simulated and synthesized SIFT detection module in Xilinx ISE simulator was implemented on
Spartan-3 FPGA starter kit board with XCS200-4FT256 package. The Spartan-3 Starter Board provides a
powerful, self-contained development platform for designs targeting the Spartan-3 FPGA from Xilinx.
During hardware implementation process these keypoint values are toggled one after another on the LEDs
shown in the FPGA board as shown in Figure 6.
An Embedded Architecture for Feature Detection Using Modified Sift Algorithm
http://www.iaeme.com/IJECET/index.asp 45 [email protected]
Figure 6 Hardware implementation of SIFT keypoint detector
Table 1 Comparison of design utilization of SIFT detectors with and without using FAM
PARAMETERS SIFT DETECTOR
WITHOUT FAM
SIFT DETECTOR
WITH FAM
Memory 475848 KB 309960 KB
Registers 488 140
Multipliers 96 12
Delay 34.065 ns 31.213 ns
As given in Table 1, the memory usage for the SIFT feature detector without FAM is 475848 KB and
for the modified feature detector with FAM the memory utilization is reduced to 309960 KB. The delay for
the previous feature detector section used in [7] is 34.065 ns and for the proposed system the delay is reduced
to 31.213 ns. Thus by using this optimized filter with FAM the proposed keypoint detection structure is fast
and efficient compared to the previous designs.
5. CONCLUSION
A computationally efficient and reliable feature detector based on SIFT algorithm is implemented and
analyzed using Xilinx ISE Design Suit 14.2 in this design. By this design a stable and reliable scale space
extrema detection is achieved based on SIFT algorithm. In this work an optimized Gaussian filter with fused
adder-multiplier section is used. This FAM incorporated filter section will reduce the delay, area and memory
utilization of the entire system during the hardware implementation procedure. In effect this design will
provide a fast and reliable SIFT feature detector with less resource utilization by comparing it with the
previous designs in the references. The future work includes an embedded SoC architecture incorporating
the SIFT detector with FAM structure and the feature descriptor and matching modules. In this an optimized
SIFT feature descriptor invariant to scale, rotation and orientation is also planned to implement on a single
FPGA chip with less resource utilization.
Syama K Nair and Ragimol
http://www.iaeme.com/IJECET/index.asp 46 [email protected]
ACKNOWLEDGEMENT
For the successful completion of this work, I would like to express my deep gratitude towards my guide
Prof. Ragimol for the valuable suggestions. And I would also like to thank all the reviewers for their esteemed
support and encouragement.
REFERENCE
[1] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vision, vol. 60,
no. 2, pp. 91–110, 2004.
[2] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “Speeded-up robust features (SURF),” Comput. Vision
Image Understand, vol. 110, no. 3, pp. 346–359, 2008.
[3] S. Zhong, J. Wang, L. Yan, L. Kang, and Z. Cao, “A real-time embedded architecture for SIFT,” J. Syst.
Arch., vol. 59, no. 1, pp. 16–29, Jan. 2013.
[4] V. Bonato, E. Marques, and G. A. Constantinides, “A parallel hardware architecture for scale and
rotation invariant feature detection,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 12,
pp. 1703–1712,Dec. 2008.
[5] John Wilson and S. Sakthivel, “Big Bang-Big Crunch Algorithm for Minimizing Power
Consumption by Embedded Systems”, International Journal of Electronics and Communication
Engineering and Technology (IJECET), 5(7), 2014, pp. 09–14
[6] K. Mizuno, H. Noguchi, G. He, Y. Terachi, T. Kamino, T. Fujinaga, S. Izumi, Y. Ariki, H.
Kawaguchi, and M. Yoshimoto, “A low-power real-time SIFT descriptor generation engine for
full-HDTV video recognition,” IEICE Trans. Electron., vol. 94, no. 4, pp. 448–457, Apr. 2011.
[7] Y. Ke and R. Sukthankar, “PCA-SIFT: A more distinctive representation for local image
descriptors,” in Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit., vol. 2. Jun.–
Jul. 2004, pp. 506–513.
[8] Jianhui Wang, Sheng Zhong, Luxin Yan, and Zhiguo Cao “An Embedded System-on-Chip Architecture
for Real-time Visual Detection and Matching” IEEE Transactions on Circuits and Systems for video
technology, vol. 24, no. 3, pp. 525-538, March 2014.
[9] Ms. Yogita V. Bhagwat and Prof. S. A. Naveed, “Design of Embedded Based Three Phase Preventor and
Selector System for Industrial Appliances”, International Journal of Electronics and Communication
Engineering and Technology (IJECET), 4(1), 2013, pp. 183–188
[10] K. Tsoumanis, “An optimized modified booth recoder for efficient design of add-multiply operator”, IEEE
transactions on circuits and systems, vol.61, n0.4, pp.1133-1143, April 2014.