an embedded architecture for feature detection using modified sift algorithm · 2016-10-10 ·...

http://www.iaeme.com/IJECET/index.asp 38 [email protected]

International Journal of Electronics and Communication Engineering and Technology (IJECET) Volume 7, Issue 5, Sep-Oct 2016, pp. 38–46, Article ID: IJECET_07_05_005

Available online at

http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=7&IType=5

Journal Impact Factor (2016): 8.2691 (Calculated by GISI) www.jifactor.com

ISSN Print: 0976-6464 and ISSN Online: 0976-6472

© IAEME Publication

AN EMBEDDED ARCHITECTURE FOR FEATURE

DETECTION USING MODIFIED SIFT ALGORITHM

Syama K Nair

Student, Department of ECE, Sree Buddha College of Engineering, Kerala, India.

Ragimol

Professor, Department of ECE, Sree Buddha College of Engineering, Kerala, India.

ABSTRACT

In video analytic and computer vision systems, image feature detection and matching is the

fundamental task in object recognition, image indexing, visual localization etc. These processes

establishes the correspondence between two images taken at different time. To most of the embedded

systems, the challenge is its large complexity in computation. Real-time performance is a critical

demand to most of these applications, which require the detection and matching of the visual features

in real time. The thesis work proposes a new FPGA-based embedded architecture for image feature

detection. In this the Scale Invariant Feature Transform (SIFT) feature detector is used to reduce the

utilization of FPGA resources. The correspondence point pairs obtained can be either used in an

embedded application or accessed via Ethernet for remotely computer vision applications. Here a

modified Gaussian filter is used for keypoint detection, in which the adder-multiplier sections are

fused by using a modified Booth recoder design. This design will increase the performance of the

whole system with considerable reduction in delay, power consumption and hardware complexities.

The detected feature keypoints from the image can be described using feature descriptor module. The

binary stream of bits are obtained from the description section and finally a feature matching process

is done to identify whether the two images taken at two different time were matched or not. The

software used for the work is Xilinx ISE Design Suite with Verilog coding.

Key words: Computer vision systems, Feature descriptor, Gaussian filter, Modified Booth Recoder,

SIFT detector.

Cite this Article: Syama K Nair and Ragimol, An Embedded Architecture for Feature Detection

Using Modified Sift Algorithm, International Journal of Electronics and Communication

Engineering and Technology, 7(5), 2016, pp. 38–46.

http://www.iaeme.com/IJECET/issues.asp?JType=IJECET&VType=7&IType=5

1. INTRODUCTION

Image processing is a field of signal processing, in which an image is taken as input and obtained the output

in the form of an image or as a set of parameters related to the characteristics of the image. The important

field of interest in video analytical and computer vision systems includes object recognition, pattern

identification, image stitching, panoramic vision, robot navigation etc. in which the image feature detection

and matching is a fundamental task. Feature detection is the first operation done in image processing as it is

An Embedded Architecture for Feature Detection Using Modified Sift Algorithm


a part of a larger image processing algorithm. According to the applications, each image is processed by its

every pixel values to detect the feature points which are known to be the interest points or keypoints. Several

feature detection algorithms are already implemented to find the interest points from an image data. Some

of the efficient detection algorithms are corner edge detector, Harris detector, Laplacian blob detector, SIFT,

SURF etc.

The commonly used feature extraction algorithms after detectors are SIFT, SURF, BRIEF, BRISK,

FREAK etc. Among these SIFT (Scale Invariant Feature Transform) is the most efficient feature detection

method due to its invariant properties. It has several advantages over the other detection methods based on

the invariance in rotation, scaling, illumination changes etc. The thesis work aims to implement an efficient

and fast SIFT feature detection algorithm which are used for various image processing applications. Here

the filtering section used in the detector is optimized by the fusion of add-multiply section into modified

booth recoder method in order to achieve better reduction in power consumption, delay, area etc. It combines

the stability and repeatability of SIFT detection by optimizing the design to meet the real time requirements.

The proposed design is known to be one of the fastest feature detection module with less resource utilization

on FPGA.

The whole work is organized into different sections. The section 2 presents the related works

corresponding to the proposed system. It starts with the early works and the review reports related to the

previous FPGA implementations on the image feature detection algorithms. In section 3 the proposed SIFT

algorithm for feature detection process is discussed with the modified Gaussian filter used. The section 4

deals with the various technical details of the design such as its implementation on FPGA and also testing is

carried out in this section. In the next section the conclusion of the entire work is given.

2. RELATED WORKS

Lowe proposed the scale-invariant feature transform (SIFT) algorithm [1] which has been considered as one

of the most efficient robust approaches in feature detection. SIFT was an interesting feature detection

operator for better image matching purposes. Speeded up robust features (SURF) detector was discussed by

Herbert Bay [2] as a faster integer approximation detector inspired by SIFT algorithm. It was faster for

computation and better invariant to scale, rotation, and illumination changes of the image.

Zhong et al. [3] presented a design for SIFT feature detection and description based on FPGA+DSP

architecture for 320 × 256 images. SIFT has shown a great success in various computer vision applications,

even though it is more complex in hardware implementation. Bonato et al. [4] presented an FPGA-based

architecture for SIFT feature detection. In this design a parallel hardware architecture for image feature

detection based on the Scale Invariant Feature Transform algorithm was proposed. This work presents an

optimized robotic control system on-a-chip for an embedded platform.

A design of SIFT feature detection for HDTV was proposed in [5] by Mizuno et al. based on ASIC. Here

the images were stored in the external SDRAM and architecture detects the features in each regions of

interests (ROI) divided earlier. The area utilization of ROI is less compared to the image and this will

effectively reduce the chip memory utilization during implementation. The major drawback is that the input

image should be stored in an external memory.

Several variations are available for the SIFT detection in algorithmic level. One important SIFT variation

is the size shortening of the feature detected by using principal component analysis (PCA) as proposed in

[6]. Instead of computing the image gradient orientation, PCA-SIFT computes the gradient magnitude first.

They are faster than original SIFT descriptor, but not stable for matching purposes. In [7] proposed a single

FPGA-based design to detect and match visual feature in real time for real-life applications. Here SIFT is

used for image feature detection by considering the stability and repeatability of the image matching system.

In order to achieve real-time performance, replace the SIFT descriptor by the BRIEF descriptor. The

proposed work is based on this design with modified filter section in SIFT detector.

Kostas et al. [8] proposed an optimized booth recoder for adder-multiplier designs. It is a structured

and efficient fused add-multiply (FAM) design with considerable reduction in delay, power, and hardware

Syama K Nair and Ragimol


complexities. The proposed FAM is the most efficient and optimized approach used to accelerate the DSP

arithmetic operations. The technique used here is the Sum to Modified Booth (S-MB) recoding scheme

which provides the reduction in resource utilization during hardware implementations.

A new embedded architecture for image feature detection based on SIFT algorithm is proposed in this

design. Here the optimization of Gaussian filtering section is done by using a modified booth recoder section

in the add-multiply operation used in SIFT detection module. By using this modified Gaussian filter there is

a considerable reduction in delay, power and area utilization. Here the optimized recoder with conventional

signed half adders and full adders are used. Thus it reduces the hardware complexities during real time

implementation.

3. PROPOSED SYSTEM

The proposed work focuses on the pure hardware implementation of Scale Invariant Feature transform

(SIFT) algorithm for image feature detection. The process diagram of proposed system is shown in the Figure

1.The input data to the system is an image with appropriate resolution and the output obtained are several

key-points detected out from the key-point detection module.

Figure 1 Process Diagram of the Proposed System

The design is based on two stages of processing. It includes scale space extrema detection and key-point

localization. Here we are considering the reduction in complexity of software implementation by using

Xilinx system generator platform with Verilog HDL coding. The input image is driven from a digital camera

for real time implementation and the pixel values are recorded for the key point detection purpose. This is

an optimized pure software implementation of feature detector using the Xilinx ISIM simulator with Verilog

coding. Here the modification is done in the Gaussian filter section by reducing the resource utilization and

thus produce an optimized design for feature detection in SIFT algorithm.

In conventional SIFT detectors used in [7], a large amount of resources are utilized for the single

Gaussian filter section itself due to the bulk amount of adders and multipliers in this filtering process. This

will increases the computational and hardware complexities for the whole feature detection system. This will

be reduced by the replacement of conventional Gaussian filter with the modified filter section as proposed

in this work. Here the adder-multiplier sections in the Gaussian column filter and row filters are replaced by

a fused adder-multiplier (FAM) module with modified booth encoder. In order to avoid the complexity of

adder-multiplier units, here the sum of the two numbers are recoded earlier by using the Sum-to-Modified

Booth (S-MB) recoding scheme [8]. In effect this will reduce the partial products obtained during

multiplication process by half and also reduces the critical path delay and area of the entire Gaussian filter

section which in turn reduces the resource utilization of the entire SIFT detection module during FPGA

implementation.

After filtering the SIFT detection module is run by the difference of Gaussian images as source images.

It is based on the concept of stable keypoint detection by the identification of scale space extrema values.

These values are considered as stable keypoints of the source images applied to the detection module. For

feature detection process, SIFT detector is selected due to its several advantages such as stability, accuracy

and scale and rotation invariance. The detected keypoints are the local maxima or minima of the difference

of Gaussian pyramids constructed. SIFT is considered to be the most robust feature detection algorithm due

to the low contrast rejection and edge response rejection procedures carried out in the detection module.



4. IMPLEMENTATION AND TESTING

In video analytical applications, detecting the feature points with less computational complexity is a

fundamental task. Among the detection algorithms, Scale Invariant Feature Transform (SIFT) is an efficient

one with good results. In SIFT the feature detection includes difference of Gaussian pyramid construction

and stable keypoint detection processes.

4.1. Difference of Gaussian Pyramid Construction

The main aim of the SIFT feature detector is to determine the variations in the local features of an image

with sufficient visual differences. They are mainly focused as the candidate for keypoints. For finding these

candidates we have to construct a Difference of Gaussian (DoG) pyramid. In this construction first a

Gaussian filtered image is obtained by convolving the input image I(x, y) with the Gaussian kernel K(x, y;

σ) represented in equation 2 where σ refers to the scale of the Gaussian kernel. Here the 2-D convolution is

carried out. The Gaussian filtered image is given as in equation 1.

G(x, y; σ) = conv2 (I(x, y), K(x, y; σ)) (1)

whereconv2 (,) refers to the two dimensional convolution.

K(x, y; σ) = (1/2πσ2) e-(x2+y2)/2σ2 (2)

The Difference of Gaussian D(x, y; σ) between two Gaussian filtered images are given by equation 3.

D(x, y; σ) = G(x, y; kσ) - G(x, y; σ) (3)

where k denotes the constant multiplicative factor.

Figure 2 System Architecture of SIFT feature detector



The detection module provides a constant throughput defined by the number of octaves and scales in the

DoG section. In this design we are using 12 Gaussian filters for the two image octaves. The system

architecture is shown in Figure 2. Thus in the design of SIFT detector proposed by J. Wang in [7], utilizes

336 adder units and 192 multiplier sections. This computational complexity makes this design difficult for

the fast feature detection operation. So an idea of a new hardware design is proposed for the filter section by

combining the adders and multipliers used in [7]. Here the sequential adder-multiplier units are combined as

a Fused Adder-Multiplier (FAM) module which causes a reduction in the critical path delay of the entire

filter section.

4.2. Fused Adder-Multiplier Module

Here a new architecture with reduced adders and multipliers are proposed for Gaussian filter. The add-

multiply operations in the column filter and row filter sections are replaced by fused adder multiplier section

as in [8]. Here a modified booth recoder is used instead of adder-multiplier units. An efficient sum- to MB

(S-MB) recoding technique is used for recoding the sum of two consecutive bits of the input data (A & B)

that are to be added first. The S-MB recoding technique uses the Modified Booth Encoding scheme for

finding the sum. These sum are fed to the partial product generator section along with the multiplicand used

(X). The architecture of Fused Add-multiply (FAM) section is given in Figure 3.

Figure 3 Architecture of Fused Add-Multiply module

It is the most efficient and fast adder multiplier unit used in the Gaussian filter in the SIFT detector

section. Thus a Gaussian filter with fused adder multiplier unit is used as shown in Figure 4, such that the

adders and multipliers that increase the computational complexities in the previous designs are eliminated

in this design.

The filter section input is given from an image buffer module and it is first horizontally filtered and then

in vertical direction. Thus the row and column of the image gets smoothened and final section used ripple

carry adder units and divider section which divides the output from column filter and row filter by the SUM

= 1024 which is equal to the summation of gain of all the filters used. In each filter there are fourteen gain

values ‘k’ are used. Then the SUM is expressed as; SUM = Σ k = 210 = 1024.



On implementing this design in the proposed system the filtered output obtained are similar compared

with the design implementation of Gaussian filter without using FAM. But the advantage is that the

performance is increased in this design with considerable reduction in power, critical path delay and area of

utilization. Thus it makes the advantage of filtering in a fast and more reliable method which is used further

in the SIFT detection module. From each octave five DoG outputs are obtained and they are fed to the

keypoint detection module to find out the stable keypoints.

Figure 4 Modified Gaussian filter

4.3. Stable Keypoint Detection Module

The feature detection module is a fully parallelized section with four separate modules as shown in Figure

5. They are window generator, extremum detection, edge response rejection and the low contrast rejection

modules as explained in [7].



Figure 5 Overall architecture of feature detection module

Here the window generator module is to generate 3×3 pixel windows from the 5 X 5 DoG outputs

obtained. The extremum detection module is used for detecting the local maxima or the extreme value from

the windowed output by comparing each pixel value with its neighbors. The edge response rejection module

is used to eliminate edge features of the image which makes its key-points unstable. This was done by some

edge threshold value set as r = 10 in this design. The low contrast rejection module is also used to eliminate

or reject the less contrast features to make the key-point detection more accurate. Finally all these modules

are combined to implement the stable key-point detector section using SIFT. Due to its parallelized design

we can detect a large number of features in a defined clock frequency. By all these operations stable

keypoints are obtained as pixel values which are used next for the feature description section. After the

feature description the image descriptors obtained are used for feature matching purposes in image

processing applications.

The overall architecture of the stable key-point detection module is shown in Fig 6. This architecture is

designed and implemented in Xilinx ISim simulator using Verilog coding for obtaining the corresponding

keypoints from each difference of Gaussian output obtained. From each octave five DoG outputs are

obtained. Thus in this design around ten DoG output values are used for the stable feature detection process.

The simulated and synthesized SIFT detection module in Xilinx ISE simulator was implemented on

Spartan-3 FPGA starter kit board with XCS200-4FT256 package. The Spartan-3 Starter Board provides a

powerful, self-contained development platform for designs targeting the Spartan-3 FPGA from Xilinx.

During hardware implementation process these keypoint values are toggled one after another on the LEDs

shown in the FPGA board as shown in Figure 6.



Figure 6 Hardware implementation of SIFT keypoint detector

Table 1 Comparison of design utilization of SIFT detectors with and without using FAM

PARAMETERS SIFT DETECTOR

WITHOUT FAM

SIFT DETECTOR

WITH FAM

Memory 475848 KB 309960 KB

Registers 488 140

Multipliers 96 12

Delay 34.065 ns 31.213 ns

As given in Table 1, the memory usage for the SIFT feature detector without FAM is 475848 KB and

for the modified feature detector with FAM the memory utilization is reduced to 309960 KB. The delay for

the previous feature detector section used in [7] is 34.065 ns and for the proposed system the delay is reduced

to 31.213 ns. Thus by using this optimized filter with FAM the proposed keypoint detection structure is fast

and efficient compared to the previous designs.

5. CONCLUSION

A computationally efficient and reliable feature detector based on SIFT algorithm is implemented and

analyzed using Xilinx ISE Design Suit 14.2 in this design. By this design a stable and reliable scale space

extrema detection is achieved based on SIFT algorithm. In this work an optimized Gaussian filter with fused

adder-multiplier section is used. This FAM incorporated filter section will reduce the delay, area and memory

utilization of the entire system during the hardware implementation procedure. In effect this design will

provide a fast and reliable SIFT feature detector with less resource utilization by comparing it with the

previous designs in the references. The future work includes an embedded SoC architecture incorporating

the SIFT detector with FAM structure and the feature descriptor and matching modules. In this an optimized

SIFT feature descriptor invariant to scale, rotation and orientation is also planned to implement on a single

FPGA chip with less resource utilization.



ACKNOWLEDGEMENT

For the successful completion of this work, I would like to express my deep gratitude towards my guide

Prof. Ragimol for the valuable suggestions. And I would also like to thank all the reviewers for their esteemed

support and encouragement.

REFERENCE

[1] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vision, vol. 60,

no. 2, pp. 91–110, 2004.

[2] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “Speeded-up robust features (SURF),” Comput. Vision

Image Understand, vol. 110, no. 3, pp. 346–359, 2008.

[3] S. Zhong, J. Wang, L. Yan, L. Kang, and Z. Cao, “A real-time embedded architecture for SIFT,” J. Syst.

Arch., vol. 59, no. 1, pp. 16–29, Jan. 2013.

[4] V. Bonato, E. Marques, and G. A. Constantinides, “A parallel hardware architecture for scale and

rotation invariant feature detection,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 12,

pp. 1703–1712,Dec. 2008.

[5] John Wilson and S. Sakthivel, “Big Bang-Big Crunch Algorithm for Minimizing Power

Consumption by Embedded Systems”, International Journal of Electronics and Communication

Engineering and Technology (IJECET), 5(7), 2014, pp. 09–14

[6] K. Mizuno, H. Noguchi, G. He, Y. Terachi, T. Kamino, T. Fujinaga, S. Izumi, Y. Ariki, H.

Kawaguchi, and M. Yoshimoto, “A low-power real-time SIFT descriptor generation engine for

full-HDTV video recognition,” IEICE Trans. Electron., vol. 94, no. 4, pp. 448–457, Apr. 2011.

[7] Y. Ke and R. Sukthankar, “PCA-SIFT: A more distinctive representation for local image

descriptors,” in Proc. IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit., vol. 2. Jun.–

Jul. 2004, pp. 506–513.

[8] Jianhui Wang, Sheng Zhong, Luxin Yan, and Zhiguo Cao “An Embedded System-on-Chip Architecture

for Real-time Visual Detection and Matching” IEEE Transactions on Circuits and Systems for video

technology, vol. 24, no. 3, pp. 525-538, March 2014.

[9] Ms. Yogita V. Bhagwat and Prof. S. A. Naveed, “Design of Embedded Based Three Phase Preventor and

Selector System for Industrial Appliances”, International Journal of Electronics and Communication

Engineering and Technology (IJECET), 4(1), 2013, pp. 183–188

[10] K. Tsoumanis, “An optimized modified booth recoder for efficient design of add-multiply operator”, IEEE

transactions on circuits and systems, vol.61, n0.4, pp.1133-1143, April 2014.

an embedded architecture for feature detection using modified sift algorithm · 2016-10-10 ·...

Documents