Swift Template Matching Based on Equivalent Histogram

Wangsheng Yu, Xiaohua Tian, Zhiqiang Hou*

Telecommunications Engineering Institute, Air Force Engineering University

Xi'an, PR China

*Corresponding author: [email protected]

Chongzhao Han

School of Electronics and Information Engineering, Xi'an Jiaotong University

Xi'an, PR China

[email protected]

Abstract—Histogram-based template matching is an important method for exhaustively searching for the global optimum. However, it is usually computationally expensive. In this paper, we propose replacing the traditional histogram with an equivalent histogram, which markedly improves matching efficiency. We first introduce the equivalent histogram, which exploits the fact that the template's color information is concentrated in a relatively narrow range, and prove its equivalence to the traditional histogram. We then discuss how the equivalent histogram can be applied in current algorithms and analyze the resulting algorithmic complexity. Because histograms and their distances are computed only over the range in which the template's color information is concentrated, the memory and computation otherwise spent on redundant information are saved. Experimental results indicate that the equivalent-histogram-based method markedly improves matching efficiency without degrading the matching result.

Keywords- visual tracking, template matching, exhaustive search, equivalent histogram

I. INTRODUCTION

Template matching is an important technique in visual tracking and is widely used in content-based image retrieval, object detection, and tracking. Compared with the Particle Filter [1] and Mean-Shift [2] methods, its principle is very simple, and it can obtain a more robust result through an exhaustive search for the global optimum. Compared with the Optical Flow [3], Frame Difference [4], and Background Subtraction [5] methods, it is not confined to static scenes and is more adaptable. The traditional method implements template matching directly by 2-D correlation, whose computational complexity is $O(N^2 r^2)$ when matching a template of size $r \times r$ to an image of size $N \times N$. An effective way to reduce this complexity is to introduce the integral image [7] into the matching process. In 2004, Viola [7] proposed an efficient method to calculate histograms based on the summed-area table [6]. Building on this, Li [8] proposed an exhaustive search method for tracking visual objects; the experimental results showed that it is faster and more precise than the Mean-Shift method. Porikli analyzed the integral image and proposed a novel algorithm for calculating histograms, named the Integral Histogram (IH) [9]. The IH-based method can track a moving object swiftly and robustly, with a low complexity of $O(N^2 B)$ for a $B$-dimensional histogram. However, calculating histograms with the IH algorithm takes a large amount of memory. To resolve this problem, Sizintsev proposed the Distributive Histogram (DH) [10], which updates the histogram using the pixels that change in the current window and uses the updated histogram to calculate the histogram distance directly rather than writing it to memory. The DH-based method not only improves matching efficiency but also reduces memory consumption remarkably. In 2010, Wei further reduced the computational complexity and proposed an efficient histogram-based sliding window (EHSW) model [11]. He pointed out that only the few histogram bins that change while the window slides affect the final histogram distance, so he updated the histogram distance from these changes to construct the similarity map, which further improved the efficiency of template matching.

This research is supported by the National Natural Science Foundation of China (Grant No. 60805015 and 61175029) and the Natural Science Foundation of Shaanxi Province of China (Grant No. 2011JM8015).

Almost all current template matching algorithms are based on a rectangular template, so they cannot track a rotating object very well. To improve the tracking of rotating objects, Adam proposed a fragment-based method [12] that uses the IH algorithm for its fast calculation of histograms over arbitrary rectangular areas. He divided the rectangular window into a number of small rectangular areas and matched these sub-rectangles to improve the robustness of matching. This method works well when the object is partly occluded by the background. Sizintsev put forward a multiple-histogram-based method when discussing the application of the DH algorithm. A similar approach also appears in Wei's study [11]. In addition, matching is used in the Scale Invariant Feature Transform (SIFT) [13] to match key points. A more robust result, invariant to rotation and scale, can be obtained when template matching is combined with SIFT features.

However, both the IH-based and the DH-based methods spend much time constructing the similarity map. For a $B$-dimensional histogram, the complexity of similarity map construction is $O(N^2 B)$. The EHSW model uses only the changed histogram bins to update the similarity map and reduces the computational complexity to $O(N^2 \cdot \min(B, 2sr))$, where $s$ is a sparseness factor. Even so, the dimension of the histogram remains an important factor in the complexity.


This paper focuses on reducing the dimension of the histogram and proposes an Equivalent Histogram based method, which markedly reduces the complexity without changing the matching result.

II. EQUIVALENT HISTOGRAM

This section introduces the concept of the Equivalent Histogram and proves its equivalence.

A. Concept

In template-matching-based tracking, the object template usually has distinctive attributes such as color and shape. For example, the colors of the object template may be distributed over a limited range rather than over the whole color axes. Fig. 1 shows the color distributions of the R, G, and B channels of the original image and of the object template. Clearly, the distribution ranges of the original image are much larger than those of the object template.

A template matching algorithm usually calculates the histograms of the object template and of the current window on the basis of the color distribution of the original image, and then calculates the histogram distance between the object template and each current window to construct a similarity map, whose maximum gives the most confident estimate of the real object's location. In this setting, many irrelevant pixels are involved throughout histogram calculation and similarity map construction, and this redundant calculation severely reduces matching efficiency.
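The procedure described above can be sketched as a naive exhaustive search. This is only an illustration under our own assumptions: the function names, the uniform binning, and the choice of the $l_1$ distance are hypothetical and are not taken from the methods cited in this paper. The sketch builds a distance map (a low distance corresponds to a high similarity) and returns the position of its minimum.

```python
def gray_histogram(pixels, num_bins, max_value=255):
    """Histogram of integer gray values using simple uniform bins (an assumption)."""
    hist = [0] * num_bins
    for v in pixels:
        hist[v * num_bins // (max_value + 1)] += 1
    return hist


def l1_distance(h1, h2):
    """Sum of absolute bin differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def exhaustive_match(image, template, num_bins=16):
    """Slide the template over every position and keep the smallest distance."""
    th, tw = len(template), len(template[0])
    target = gray_histogram([v for row in template for v in row], num_bins)
    best, best_pos, distance_map = None, None, []
    for top in range(len(image) - th + 1):
        row_distances = []
        for left in range(len(image[0]) - tw + 1):
            window = [image[y][x]
                      for y in range(top, top + th)
                      for x in range(left, left + tw)]
            d = l1_distance(gray_histogram(window, num_bins), target)
            row_distances.append(d)
            if best is None or d < best:
                best, best_pos = d, (top, left)
        distance_map.append(row_distances)
    return best_pos, distance_map
```

Every window recomputes a full histogram over all bins here, which is exactly the redundancy that the faster methods discussed in this paper try to avoid.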

In fact, if we calculate the histogram and the distance according to the color distribution range of the object template, the redundant calculation can be removed. Inspired by Wei's study [11], we tested a large number of video sequences, analyzed the color distributions of the object templates, and found that the colors of an object template usually concentrate in a limited range. We therefore propose calculating the color histogram and the distance according to this concentrated color information. Because the matching results of the template-based method and the image-based method are completely equivalent, we name the proposed method the Equivalent Histogram (Fig. 2).
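How strongly a template's gray values concentrate can be measured directly. The sketch below is a hypothetical helper (the name `range_statistics` and its return values are our own) that reports the template's gray range, the ratio of that range to the full gray scale (written `alpha` here), and the fraction of image pixels that fall inside the range (written `beta` here). Both ratios are the quantities used later in the complexity analysis.

```python
def range_statistics(image, template, gray_levels=255):
    """Return (f_min, f_max, alpha, beta) for a gray image and a gray template.

    alpha: length of the template's gray range divided by the full gray scale.
    beta:  fraction of image pixels whose gray value lies inside that range.
    """
    template_values = [v for row in template for v in row]
    f_min, f_max = min(template_values), max(template_values)
    alpha = (f_max - f_min) / gray_levels
    image_values = [v for row in image for v in row]
    in_range = sum(1 for v in image_values if f_min <= v <= f_max)
    beta = in_range / len(image_values)
    return f_min, f_max, alpha, beta
```

Small values of both ratios suggest that a histogram restricted to the template's range discards a large share of the work.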

B. Proof

The matching result of the Equivalent Histogram based method is exactly the same as that of the traditional histogram-based method, as proved below.

Proof:

Suppose $S$ is a rectangular area of the original image with the same size as the object template $T$. The histograms of $S$ and $T$ can then be calculated as follows:

$$H(S) = \bigcup_{i=1}^{B} h\{\, p \mid f(p) \in X_i,\ p \in S \,\} \tag{1}$$

$$H(T) = \bigcup_{j=1}^{B} h\{\, p \mid f(p) \in X_j,\ p \in T \,\} \tag{2}$$

In these formulas, $B$ is the number of histogram bins (the dimension), and $f(p)$ is the gray value of pixel $p$ (we use the gray value of the image as an example to prove the equivalence). $h\{\, p \mid f(p) \in X_i,\ p \in S \,\}$ counts the number of elements in $\{\, p \mid f(p) \in X_i,\ p \in S \,\}$; we denote this count by $h_{S_i}$. $X_i$ is the gray value range of the $i$th histogram bin, which can be described as follows:

$$X_i = \left\{ x \;\middle|\; \frac{R\,(i-1.5)}{B-1} \le x < \frac{R\,(i-0.5)}{B-1} \right\} \tag{3}$$

Here $R$ is the maximum gray level of the image, whose value is 255.

Figure 1. The difference between original image based histogram and object template based histogram.

Figure 2. Demonstration of the object template based histogram (equivalent histogram).
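The bin ranges of equation (3) can be turned into a direct index computation. The sketch below assumes half-open intervals, $R(i-1.5)/(B-1) \le x < R(i-0.5)/(B-1)$, and integer gray values; under that assumption the 1-based bin index is $i = \lfloor x(B-1)/R + 0.5 \rfloor + 1$. The function names are our own, and integer arithmetic is used to avoid rounding issues at bin boundaries.

```python
def bin_index(gray, num_bins, gray_max=255):
    """1-based index i of the bin X_i that contains the integer gray value.

    Equivalent to floor(gray * (B - 1) / R + 0.5) + 1, computed with integers.
    """
    return (2 * gray * (num_bins - 1) + gray_max) // (2 * gray_max) + 1


def histogram(pixels, num_bins, gray_max=255):
    """Bin counts (h_1, ..., h_B) for a flat list of gray values."""
    counts = [0] * num_bins
    for v in pixels:
        counts[bin_index(v, num_bins, gray_max) - 1] += 1
    return counts
```

With $B = 4$ and $R = 255$ the bin centres sit at 0, 85, 170 and 255, so the first boundary lies at 42.5.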

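Let $\mathbf{H}_S = (h_{S_1}, h_{S_2}, \ldots, h_{S_n})$, $\mathbf{H}_T = (h_{T_1}, h_{T_2}, \ldots, h_{T_n})$, and $\mathbf{X} = (X_1, X_2, \ldots, X_n)^{\mathrm{T}}$. Then $H(S)$ and $H(T)$ can be expressed as $H(S) = \mathbf{H}_S \cdot \mathbf{X}$ and $H(T) = \mathbf{H}_T \cdot \mathbf{X}$. The $l_p$-norm-based distance between $S$ and $T$ is:

$$d(S, T) = \left\| \mathbf{H}_S - \mathbf{H}_T \right\|_p \tag{4}$$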
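The distance of equation (4), the $l_p$ norm of the difference between the two bin-count vectors, can be written in a few lines. The function name and the default $p = 2$ are our own choices for illustration.

```python
def lp_distance(hist_s, hist_t, p=2):
    """l_p norm of the element-wise difference of two equal-length histograms."""
    if len(hist_s) != len(hist_t):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(a - b) ** p for a, b in zip(hist_s, hist_t)) ** (1.0 / p)
```

For example, `lp_distance([3, 0, 1], [0, 0, 5], p=1)` sums the absolute differences 3, 0 and 4.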

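Because the gray values of the template are concentrated in a limited range, many elements of $\mathbf{H}_T$ must be zero. So,

$$\mathbf{H}_T = (0, \ldots, 0,\ h_{T_k}, h_{T_{k+1}}, \ldots, h_{T_{l-1}}, h_{T_l},\ 0, \ldots, 0) = (\mathbf{H}_T^{1}, \mathbf{H}_T^{2}, \mathbf{H}_T^{3}) \tag{5}$$

where the superscripts index the three blocks of the vector: $\mathbf{H}_T^{1} = \mathbf{0}$, $\mathbf{H}_T^{2} = (h_{T_k}, h_{T_{k+1}}, \ldots, h_{T_{l-1}}, h_{T_l})$, and $\mathbf{H}_T^{3} = \mathbf{0}$.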

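If the histogram is calculated with the Equivalent Histogram based method, only the block $\mathbf{H}_S^{2}$ of $\mathbf{H}_S = (\mathbf{H}_S^{1}, \mathbf{H}_S^{2}, \mathbf{H}_S^{3})$ needs to be calculated. By the triangle inequality of the $l_p$ norm, the following holds:

$$\begin{aligned}
\left\| \mathbf{H}_S - \mathbf{H}_T \right\|_p
&= \left\| (\mathbf{H}_S^{1}, \mathbf{H}_S^{2}, \mathbf{H}_S^{3}) - (\mathbf{H}_T^{1}, \mathbf{H}_T^{2}, \mathbf{H}_T^{3}) \right\|_p \\
&\le \left\| \mathbf{H}_S^{2} - \mathbf{H}_T^{2} \right\|_p + \left\| (\mathbf{H}_S^{1}, \mathbf{H}_S^{3}) - (\mathbf{H}_T^{1}, \mathbf{H}_T^{3}) \right\|_p \\
&\le \left\| \mathbf{H}_S^{2} - \mathbf{H}_T^{2} \right\|_p + \left\| \mathbf{H}_S^{1} - \mathbf{H}_T^{1} \right\|_p + \left\| \mathbf{H}_S^{3} - \mathbf{H}_T^{3} \right\|_p
\end{aligned} \tag{6}$$

The right-hand side of (6) vanishes if and only if $\mathbf{H}_S^{1} = \mathbf{H}_T^{1}$, $\mathbf{H}_S^{2} = \mathbf{H}_T^{2}$, and $\mathbf{H}_S^{3} = \mathbf{H}_T^{3}$; in that case template matching attains the global optimum. Moreover, because $S$ and $T$ contain the same number of pixels and every pixel of $T$ falls into bins $k$ to $l$, $\mathbf{H}_S^{2} = \mathbf{H}_T^{2}$ already implies $\mathbf{H}_S^{1} = \mathbf{H}_S^{3} = \mathbf{0}$. The optimal match based on the original image ($\left\| \mathbf{H}_S - \mathbf{H}_T \right\|_p = 0$) and the optimal match based on the Equivalent Histogram ($\left\| \mathbf{H}_S^{2} - \mathbf{H}_T^{2} \right\|_p = 0$) are therefore exactly the same.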
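The bounds used in the proof can be checked numerically. The sketch below is only a sanity check under our own naming: it splits a histogram difference into the block that covers the template's nonzero bins (0-based indices `k` to `l`, inclusive) and the two outer blocks, and returns the middle-block norm, the full norm, and the three-term upper bound of (6).

```python
import random


def lp_norm(vec, p):
    """l_p norm of a vector; the empty vector has norm 0."""
    return sum(abs(v) ** p for v in vec) ** (1.0 / p)


def split_bounds(h_s, h_t, k, l, p=2):
    """Return (middle, full, upper) for the three-block split of h_s - h_t."""
    diff = [a - b for a, b in zip(h_s, h_t)]
    full = lp_norm(diff, p)
    middle = lp_norm(diff[k:l + 1], p)
    upper = middle + lp_norm(diff[:k], p) + lp_norm(diff[l + 1:], p)
    return middle, full, upper


random.seed(0)
for _ in range(1000):
    h_s = [random.randint(0, 20) for _ in range(16)]
    # The template histogram is zero outside bins 5..10, as in equation (5).
    h_t = [0] * 5 + [random.randint(0, 20) for _ in range(6)] + [0] * 5
    middle, full, upper = split_bounds(h_s, h_t, k=5, l=10)
    assert middle <= full + 1e-9 and full <= upper + 1e-9
```

The check confirms that the restricted distance never exceeds the full distance and that the full distance never exceeds the three-term bound; it does not replace the argument about exact matches.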

III. EQUIVALENT HISTOGRAM BASED TEMPLATE MATCHING

The Equivalent Histogram can be used in many histogram-based methods, such as the IH algorithm, the DH algorithm, and the EHSW model. This section introduces the application of the Equivalent Histogram (Fig. 3) and analyzes its computational and memory complexity.

A. Implementation of matching

Suppose the gray range of the template $T$ is $[f_{\min}, f_{\max}]$; then the length of the gray range is $R' = f_{\max} - f_{\min}$. If $B$ is the default number of histogram bins for the original image, chosen to give a comparatively good result, then the number of bins can be reduced as follows when matching is carried out with the Equivalent Histogram:

$$B' = \frac{R'}{R}\, B \tag{7}$$
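As a worked instance of equation (7), read as $B' = (R'/R)\,B$, consider the "car" sequence. We assume here that the first ratio listed for this sequence in Table I, 0.7203, is $R'/R$, and that $B = 64$ as stated in the conclusions; rounding to a whole number of bins is our own assumption.

```latex
B' = \frac{R'}{R}\,B = 0.7203 \times 64 \approx 46.1
\quad\Longrightarrow\quad B' \approx 46 \text{ bins}
```

Under these assumptions, roughly 18 of the 64 bins never need to be stored or compared for this sequence.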

Here the gray value range described by the $i$th histogram bin is:

$$X'_i = \left\{ x \;\middle|\; \frac{R'\,(i - 1.5 + B_0)}{B' - 1} \le x < \frac{R'\,(i - 0.5 + B_0)}{B' - 1} \right\} \tag{8}$$

where $B_0$ corrects the correspondence between the bins and the gray value range and is calculated as follows:

$$B_0 = \frac{f_{\min}}{R'}\, B' \tag{9}$$

The Equivalent Histogram of area $S$ is then:

$$H(S) = \bigcup_{i=1}^{B'} h\{\, p \mid f(p) \in X'_i,\ p \in S \,\} \tag{10}$$

Figure 3. The application of equivalent histogram.

According to formula (10), pixels whose gray value does not belong to $[f_{\min}, f_{\max}]$ do not take part in the histogram and similarity map calculations. The histogram dimension becomes a fraction $\alpha$ ($\alpha = R'/R$) of the original, which not only improves computational efficiency but also reduces memory consumption.
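A minimal sketch of a range-restricted histogram is given below. It is a simplification, not a literal implementation of equations (8) and (9): the bins are spread uniformly over $[f_{\min}, f_{\max}]$, the bin count is $(R'/R)\,B$ rounded up, and the function name and parameters are our own. Pixels outside the template's gray range are skipped, which is the behaviour described above.

```python
import math


def equivalent_histogram(pixels, f_min, f_max, base_bins=64, gray_levels=255):
    """Histogram over [f_min, f_max] only, with a proportionally reduced bin count."""
    span = f_max - f_min                                   # R' in the text
    num_bins = max(1, math.ceil(base_bins * span / gray_levels))
    hist = [0] * num_bins
    for v in pixels:
        if v < f_min or v > f_max:
            continue                                       # outside the template's range
        hist[(v - f_min) * num_bins // (span + 1)] += 1    # uniform bins (assumption)
    return hist
```

For a template range of 100 to 120 and 64 base bins, the call `equivalent_histogram([10, 100, 110, 120, 250], 100, 120)` keeps only three of the five pixels and uses six bins.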

B. Complexity analysis

The proposed method not only improves matching efficiency by reducing the run time but also lowers the space complexity to save memory. The following compares the algorithmic complexity of the IH-based method, the DH-based method, and the EHSW model with that of their counterparts that use the Equivalent Histogram.

For the IH-based method, constructing the integral histogram map requires two $B$-dimensional addition (subtraction) operations and one histogram update for each pixel. Calculating the histogram of the current window requires three $B$-dimensional addition (subtraction) operations. The final step is to calculate the distance between the histograms of the object template and the current window. Suppose the time cost of one addition (subtraction) operation is $t_1$, the time cost of one pixel's histogram update is $t_2$, and the time cost of the distance calculation for each bin is $t_3$. Then the total time cost of the IH-based method is:

$$t_{\mathrm{IH}} = N^2 \left( 5 B t_1 + t_2 + B t_3 \right) \tag{11}$$
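The operation counts above can be seen in a small integral histogram sketch. It is a simplified illustration in plain Python, not Porikli's implementation: the names are our own and uniform binning is assumed. Building the table takes two $B$-dimensional add/subtract passes plus one bin update per pixel, and reading a window histogram takes three $B$-dimensional add/subtract passes.

```python
def integral_histogram(image, num_bins, max_value=255):
    """ih[y][x][b] = number of pixels of bin b in the rectangle image[0:y][0:x]."""
    height, width = len(image), len(image[0])
    ih = [[[0] * num_bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = ih[y + 1][x + 1]
            for b in range(num_bins):
                # One addition and one subtraction over all B bins.
                cell[b] = ih[y][x + 1][b] + ih[y + 1][x][b] - ih[y][x][b]
            # One histogram update for the current pixel.
            cell[image[y][x] * num_bins // (max_value + 1)] += 1
    return ih


def window_histogram(ih, top, left, height, width):
    """Histogram of a rectangle from four table lookups (three add/subtract passes)."""
    bottom, right = top + height, left + width
    return [ih[bottom][right][b] - ih[top][right][b]
            - ih[bottom][left][b] + ih[top][left][b]
            for b in range(len(ih[0][0]))]
```

The table holds one $B$-bin histogram per pixel position, which is why this approach is fast per window but memory hungry.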

If the IH is calculated on the basis of the Equivalent Histogram (IH-EH), the histogram's dimension becomes $\alpha B$ with $\alpha = R'/R$, and only the pixels whose gray values belong to the gray range of the object template are involved in the calculation. Denoting the proportion of these pixels in the whole image by $\beta$, the total time cost of the IH-EH based method is:

$$t_{\mathrm{IH\text{-}EH}} = N^2 \left( 5 \alpha B t_1 + \beta t_2 + \alpha B t_3 \right) \tag{12}$$

For the DH-based method, the initialization step calculates the histogram of $Nr$ pixels, and the distribution step requires two histogram bin updates and two $B$-dimensional addition (subtraction) operations to form the final similarity map. The total time cost of the DH-based method is:

$$t_{\mathrm{DH}} = 2 N^2 B t_1 + \left( N r + 2 N^2 \right) t_2 + N^2 B t_3 \tag{13}$$

If the Equivalent Histogram is introduced when calculating the DH, the total time cost becomes:

$$t_{\mathrm{DH\text{-}EH}} = 2 N^2 \alpha B t_1 + \beta \left( N r + 2 N^2 \right) t_2 + N^2 \alpha B t_3 \tag{14}$$
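The cost models can be compared numerically. The sketch below encodes one reading of equations (11) to (14), in which the Equivalent Histogram scales every $B$-dimensional term by the range ratio (written `alpha`, equal to $R'/R$) and every per-pixel update term by the in-range pixel fraction (written `beta`). The unit costs `t1`, `t2`, `t3` and all function names are hypothetical; setting `alpha = beta = 1` gives the original IH and DH costs.

```python
def ih_time(n, bins, t1, t2, t3, alpha=1.0, beta=1.0):
    """Modelled IH cost: N^2 * (5*alpha*B*t1 + beta*t2 + alpha*B*t3)."""
    return n * n * (5 * alpha * bins * t1 + beta * t2 + alpha * bins * t3)


def dh_time(n, r, bins, t1, t2, t3, alpha=1.0, beta=1.0):
    """Modelled DH cost: 2*N^2*alpha*B*t1 + beta*(N*r + 2*N^2)*t2 + N^2*alpha*B*t3."""
    return (2 * n * n * alpha * bins * t1
            + beta * (n * r + 2 * n * n) * t2
            + n * n * alpha * bins * t3)


def modelled_saving(original, revised):
    """Relative reduction of the modelled cost, as a fraction of the original."""
    return (original - revised) / original
```

Because the $B$-dimensional terms dominate for realistic bin counts, the modelled saving stays close to $1 - \alpha$ whatever the value of `beta`.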

The EHSW model uses the pixels that change in the current window to update the histogram bins and, in turn, the similarity map. The whole process bypasses the histogram calculation for each pixel. The preprocessing step is the same as in the DH algorithm, and while the window slides, only $2sr$ pixels affect the histogram distance ($s$ is a sparseness factor that describes the ratio of the updated pixels to the boundary). So the complexity of the EHSW-based method is $O(N^2 \cdot \min(B, 2sr))$. We do not analyze the total time cost of EHSW further because it depends on the factor $s$ and on the histogram distance measure. If the Equivalent Histogram is introduced into EHSW, the total complexity reduces to $O(N^2 \cdot \min(\alpha B, 2\beta sr))$.
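The idea of updating the distance only through the bins that change can be illustrated for the $l_1$ distance. This is a simplified sketch under our own assumptions, not the EHSW implementation of [11]: the function name is hypothetical, and a bin value of `None` marks a pixel that lies outside the template's gray range and is therefore skipped when the Equivalent Histogram is used.

```python
def slide_update(hist, target, distance, leaving_bins, entering_bins):
    """Update `hist` in place and return the new l1 distance to `target`.

    leaving_bins / entering_bins hold one bin index per pixel that leaves or
    enters the window; None means the pixel is ignored.
    """
    changes = [(b, -1) for b in leaving_bins] + [(b, 1) for b in entering_bins]
    for b, step in changes:
        if b is None:
            continue
        distance -= abs(hist[b] - target[b])   # remove the old contribution of bin b
        hist[b] += step
        distance += abs(hist[b] - target[b])   # add the new contribution of bin b
    return distance
```

Each window move costs time proportional to the number of changed pixels that fall inside the range, rather than to the number of bins.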

The IH-based method is very memory consuming: its memory complexity is as high as $N^2 B$. The figures for the DH-based method and the EHSW model are both $NB$, so they consume less memory than the IH-based method. If the Equivalent Histogram is introduced, every memory demand reduces to a fraction $\alpha$ of the original.
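The memory figures can be tabulated in the same way. The helper below is a hypothetical illustration that counts stored histogram entries, $N^2 B$ for IH and $NB$ for DH and EHSW, and scales the bin count by the range ratio $R'/R$ (written `alpha`); rounding the reduced bin count up is our own assumption.

```python
import math


def histogram_memory_entries(n, bins, alpha=1.0):
    """Number of stored bin counters for an N x N image under each method."""
    reduced_bins = math.ceil(alpha * bins)
    return {
        "IH": n * n * reduced_bins,
        "DH": n * reduced_bins,
        "EHSW": n * reduced_bins,
    }
```

For `n = 100` and 64 bins, halving the range (`alpha = 0.5`) halves every entry count.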

Clearly, both factors $\alpha$ and $\beta$ are less than 1, so the Equivalent Histogram based method effectively reduces the complexity of both computation and memory.

IV. EXPERIMENTAL RESULTS

To validate the efficiency of the Equivalent Histogram, we designed and carried out a series of tracking experiments and selected three tracking examples for further analysis in this section. All experiments were run on a machine with a 2.10 GHz CPU and 2 GB of memory. The simulation software is MATLAB 2010.

Fig. 4 shows the object templates of the test video sequences: (a) is a car, (b) is a pedestrian, and (c) is a face. Note that all the tracking experiments in this paper are based on gray histograms.

A. Tracking results

IH, DH, and EHSW are all implemented on the basis of the histogram of a rectangular area, so the tracking results of these three methods are exactly the same. In the experiments, we first tested the tracking performance of the IH-based, DH-based, and EHSW-based methods, and then tested the same three methods on the basis of the Equivalent Histogram. We recorded all the tracking results for further comparison.

Fig. 5 shows the tracking results for a car. The top of each part shows the similarity maps obtained from the two test methods. We can see from the similarity maps that the locations of the real object give the highest similarities. On each frame, the tracking result of the IH-based method [9] is marked by a blue rectangle, while the result of the revised IH-based method that introduces the Equivalent Histogram (IH+EH) is marked by a green rectangle. The revised method cuts down the redundant computation when calculating the histogram of the current window and the distance between two histograms. The results of the two test methods are exactly the same, which can be seen from the locations of the rectangles and is confirmed by the precise positions recorded during the experiments.

Figure 4. Object templates of the three test video sequences. The color templates are shown here, but only their gray information was actually used.

Fig. 6 shows the tracking results of the method based on the distributive histogram (DH) [10] and of the one revised according to the Equivalent Histogram (DH+EH). The top-right similarity map in each part of Fig. 6 reveals the real object (pedestrian) with a more distinguishable confidence, which makes the tracking result more reliable. Note that the similarity maps calculated from DH and DH+EH are different; the difference arises because the redundant calculation has been cut out. The tracking results indicate that the effect of DH is exactly the same as that of DH+EH.

Fig. 7 shows the results of tracking a face: on the left are the results obtained using the efficient histogram-based sliding window (EHSW) [11], and on the right are those obtained by the revised EHSW that brings in the Equivalent Histogram. We can see that the two similarity maps are quite different except at the real object. The color of the object (face) is concentrated in a limited part of the color axes, which reduces the computation of matching when the Equivalent Histogram is introduced. The next section deals with this issue. The tracking results of the two test methods are exactly the same, as can be seen from the rectangles drawn in different colors.

Figure 5. Tracking results of “car”. The left are the results of the IH-based method (IH) and the right are those of the revised IH-based method (IH+EH). In each part, (a) and (b), of this figure, the top two images show the confidences of the template, which we call “similarity maps”, and the bottom two images show the tracking results of the two test methods.

Figure 6. Tracking results of “pedestrian”. The left are the results of the DH-based method (DH) and the right are those of the revised DH-based method (DH+EH). In each part, (a) and (b), of this figure, the top two images show the similarity maps, and the bottom two images show the tracking results of the two test methods.

The experiments in this section indicate that bringing in the Equivalent Histogram changes the similarity map somewhat but does not change the tracking results of the IH-based, DH-based, and EHSW-based methods. We next discuss how bringing in the Equivalent Histogram changes performance.

B. Performance analysis

Two factors affect the performance of the Equivalent Histogram: one is the ratio of the object template's color range to the original image's color range ($\alpha$), and the other is the ratio of the essential pixels to all pixels ($\beta$). A smaller $\alpha$ means that the color of the object template concentrates in a narrower range, which yields a larger improvement in matching efficiency. Conversely, a larger $\alpha$ means that the color concentration of the template is not pronounced. Compared with $\alpha$, $\beta$ has a weaker effect on improving computational efficiency. For the first sequence, $\alpha = 0.7203$ and $\beta = 0.3946$; the run time per frame of the IH-based method is 2753 ms, while that of the modified IH-based method is 2075 ms. This means that introducing the Equivalent Histogram speeds up the IH-based method by nearly 25 percent. More details about the performance improvement are listed in Table I.

From the figures in Table I, we can see that the methods based on the Equivalent Histogram are faster than the original ones. The average saving in computation time is 31.3%, which confirms that the Equivalent Histogram improves efficiency.
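The per-sequence savings can be recomputed from the run times in Table I. The sketch below uses one possible definition, the relative reduction $(t_{\text{original}} - t_{\text{EH}}) / t_{\text{original}}$; the paper does not state how its average was formed, so figures obtained this way need not coincide with the quoted average. The dictionary layout and function name are our own.

```python
# Run time per frame in ms from Table I: (original method, method with EH).
RUN_TIME_MS = {
    "IH":   {"car": (2753, 2075), "pedestrian": (763, 504), "face": (192, 123)},
    "DH":   {"car": (2038, 1514), "pedestrian": (538, 415), "face": (157, 122)},
    "EHSW": {"car": (1587, 1203), "pedestrian": (399, 340), "face": (118, 96)},
}


def relative_saving(original_ms, revised_ms):
    """Percentage of run time saved relative to the original method."""
    return 100.0 * (original_ms - revised_ms) / original_ms


for method, sequences in RUN_TIME_MS.items():
    for name, (original_ms, revised_ms) in sequences.items():
        saving = relative_saving(original_ms, revised_ms)
        print(f"{method:5s} {name:11s} {saving:5.1f} % saved")
```

For the IH row of the car sequence this definition gives about 24.6 %, in line with the "nearly 25 percent" stated in the text.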

According to the analysis in Section III, the factor $\alpha$ should largely reflect the extent of the improvement; however, the test results do not exactly match that analysis. The measured run times are usually longer than the theoretical analysis predicts, probably because of the initialization step and the degree of program optimization.

All the experiments in this paper are designed as an ideal situation in which the algorithms search for the global optimum over the whole frame. This is known as exhaustive search, which is usually very time-consuming. Real tracking applications usually introduce Kalman filter theory to predict the trajectory and motion of the object, which shrinks the search area and further improves efficiency.

Figure 7. Tracking results of “face”. The left are the results of the EHSW-based method (EHSW) and the right are those of the revised EHSW-based method (EHSW+EH). In each part, (a) and (b), of this figure, the top two images show the similarity maps, and the bottom two images show the tracking results of the two test methods.

TABLE I. COMPARISON OF RUN TIME ON EACH FRAME BETWEEN TRADITIONAL METHODS AND EQUIVALENT HISTOGRAM BASED METHODS

| Video sequence | car | pedestrian | face |
|---|---|---|---|
| $\alpha$ | 0.7203 | 0.5668 | 0.5964 |
| $\beta$ | 0.3946 | 0.7576 | 0.4503 |
| Size of frame | 640 × 480 | 384 × 288 | 128 × 96 |
| Size of template | 64 × 38 | 56 × 21 | 32 × 31 |
| IH | 2753 ms | 763 ms | 192 ms |
| IH+EH | 2075 ms | 504 ms | 123 ms |
| DH | 2038 ms | 538 ms | 157 ms |
| DH+EH | 1514 ms | 415 ms | 122 ms |
| EHSW | 1587 ms | 399 ms | 118 ms |
| EHSW+EH | 1203 ms | 340 ms | 96 ms |

C. Discussion

We have demonstrated the validity of our approach for improving the efficiency of histogram-based template matching. However, the approach brings little benefit when the template has the same color distribution as the frames. We give a failure example in Fig. 8. In this case, if we mark the object with a rectangle, the object and the whole frame share nearly the same color distribution, and our approach does not improve computational efficiency.

If we can segment the object out of the rectangular template using prior information and an effective segmentation algorithm, our approach still works and the improvement is more distinct. Fig. 8 demonstrates this situation.

V. CONCLUSIONS

This paper proposed a method to improve matching efficiency and reduce the complexity of visual tracking based on histogram matching. It calculates the histograms of both the object template and the current window according to the distribution of the template's gray information. Compared with traditional tracking methods based on histogram matching, it not only improves matching efficiency by reducing the run time but also lowers the space complexity to save memory. Tracking experiments based on the Equivalent Histogram with a 64-bin gray histogram indicate that introducing the Equivalent Histogram has no impact on the tracking result but improves tracking efficiency by 30 to 40 percent. The improvement for color-histogram-based methods that introduce the Equivalent Histogram is expected to be much more pronounced.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their many valuable comments and suggestions that helped to improve both the technical content and the presentation quality of this paper.

REFERENCES

[1] P. Pan and D. Schonfeld, “Visual Tracking Using High-Order Particle Filtering,” IEEE Signal Processing Letters, 18(1): 51-54, 2011.

[2] C. H. Shen, J. Kim, and H. Z. Wang, “Generalized Kernel-Based Visual Tracking,” IEEE Transactions on Circuits and Systems for Video Technology, 20(1): 119-130, 2010.

[3] S. S. Beauchemin and J. L. Barron, “The computation of optical flow,” ACM Computing Surveys, 27(3): 433-467, 1995.

[4] S. Wang, H. Z. Ai and K. Z. He, “Difference-image based multiple motion targets detection and tracking,” Chinese Journal of Image and Graphics, 4A(6): 470-475, 1999.

[5] Z. Q. Hou, C. Z. Han, “A background reconstruction algorithm based on pixel intensity classification,” Chinese Journal of Software, 16(9): 1568-1576, 2005.

[6] F. Crow, “Summed-area tables for texture mapping,” ACM Computer Graphics, 18(3): 207-212, 1984.

[7] P. Viola and M. Jones, “Robust real-time face detection,” International Journal of Computer Vision, 57(2): 137-154, 2004.

[8] P. Li, “A clustering-based color model and integral images for fast object tracking,” Signal Processing: Image Communication, 21 (8): 676-687, 2006.

[9] F. Porikli, “Integral histogram: A fast way to extract histograms in Cartesian spaces,” Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition(CVPR'05), San Diego, CA, USA, 20-25 Jun. 2005, vol.1, 829-836, 2005.

[10] M. Sizintsev, K. G. Derpanis and A. Hogue, “Histogram-based search: a comparative study,” Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'08), Anchorage, AK, 23-28 Jun. 2008, 1-8, 2008.

[11] Y. Wei, L. Tao, “Efficient histogram-based sliding window,” Proceedings of the 2010 IEEE Conference on Computer Vision and Pattern Recognition(CVPR'10), San Francisco, CA, 13-18 Jun. 2010, 3003-3010, 2010.

[12] A. Adam, E. Rivlin and I. Shimshoni, “Robust Fragments-based Tracking using the Integral Histogram,” Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06), 17-22 Jun. 2006, 798-805, 2006.

[13] D. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of Computer Vision, 60(2): 91-110, 2004.

Figure 8. A failure example and discussion: (a) template (left) and its segmentation (right); (b) similarity maps of the objects in (a); (c) tracking results of the objects in (a). If we segment the real object from the rectangular template, our approach still works and the improvement is much more distinct.
