[ieee 2011 7th iranian conference on machine vision and image processing (mvip) - tehran, iran...

Mutual information-based image template matching with small template size

Hossein Soleimani Department of Electrical and Computer engineering

Isfahan University of Technology Isfahan, Iran

[email protected]

Mohammadali Khosravifard Department of Electrical and Computer engineering

Isfahan University of Technology Isfahan, Iran

[email protected]

Abstract: Mutual information plays a crucial role as a similarity measure in applications such as image registration and image template matching. Its accuracy depends on how well the joint probability distribution of the image and the template is estimated, which is therefore the main problem. The non-parametric (NP) windows method treats the images as continuous two-dimensional signals and yields an appropriate joint probability distribution. In this paper we employ a triangle distribution with a larger support instead of the uniform distribution used in the original NP method. This gives a more precise estimate of the joint probability distribution, and as a result the computed mutual information is more robust and reliable. Experimental results show the superiority of the proposed method in image template matching with small window sizes.

Keywords: template matching; joint probability distribution; mutual information; non-parametric window method

I. INTRODUCTION

Image template matching is the process of finding the location of a given sub-image in an image. In other words, for a template with a certain window size, we look for the part of the image of the same size that has the maximum similarity to the template. Image template matching can be used in manufacturing as a part of quality control, as a way to navigate a mobile robot, or as a way to detect edges in images [1]. There are many measures, such as mutual information and correlation criteria [2], for determining the similarity between two images or sub-images. Mutual information is a particularly popular measure because of its high accuracy and its robustness to intensity variation and noise. It captures both linear and non-linear dependence between two random variables. It is also applied in multi-modal medical image registration [3], in moving object tracking [4], and wherever the similarity between two images is required.

The most important and critical step in calculating the mutual information of two random variables or two images is to find their joint probability distribution. Since the images are actually samples of two-dimensional continuous signals, calculation of their joint probability distributions is not error-free. Moreover, the number of samples will affect the resulting estimated joint probability distribution. Therefore, selection of a reliable method for estimating the joint probability distribution is of great importance.

The joint histogram, or co-occurrence matrix, was the first solution for calculating the joint probability distribution function (JPDF). To improve the reliability of JPDF estimation, Parzen windowing [5] and PVE interpolation [6] were also proposed. Parzen windowing improves the estimate by filling empty nodes in the joint histogram (smoothing the histogram). A disadvantage of these methods is that some parameters must be adjusted. They show acceptable performance when the number of available samples is sufficient (i.e., the number of bins is large enough), so that the mutual information function is not discontinuous. Discontinuity of the mutual information function makes optimization difficult.

NP windowing is one of the best methods for estimating the joint probability distribution of two random variables (or two images). Since it treats the image as a continuous 2D signal, the number of bins or samples matters less than in other methods. In contrast to the previously mentioned methods, NP windowing performs the interpolation or smoothing in the signal domain (rather than the probability domain).

The NP method was first presented by Kadir and Brady for 1D signals [7] and was later extended to estimating the joint probability distribution of 2D signals (images) [8]. It estimates signal statistics by directly calculating the distribution of each piecewise section of a signal for a given interpolation model. One of the most important advantages of this method is that it has no adjustable parameter. Since it considers both the intensity and the location of the pixels, its output joint histogram is smooth and continuous. The price of this quality is its time complexity [8]. In this paper, we extend the basic idea of NP windows and propose Modified Non-Parametric windows (MNP). Experimental results show that the PDF estimated by MNP is better than that obtained by the NP method for small templates.

The paper is organized as follows: Section II describes the basic concept of mutual information. Section III explains NP windows and the proposed algorithm. Finally, Section IV compares the performance of the two methods.

II. MUTUAL INFORMATION

Mutual information is one of the basic concepts in information theory and measures the dependency between two random variables [10]. It was first used as a similarity measure in medical image registration by Viola and Wells in 1995 [9].

978-1-4577-1535-8/11/$26.00 ©2011 IEEE

Two random variables $A$ and $B$ with marginal distributions $p_A(a)$ and $p_B(b)$ and joint probability distribution $p_{A,B}(a,b)$ are independent iff $p_{A,B}(a,b) = p_A(a)\,p_B(b)$. Mutual information $I(A,B)$ is the Kullback-Leibler distance between $p_{A,B}(a,b)$ and $p_A(a)\,p_B(b)$, i.e.,

$$I(A,B) = \sum_{a,b} p_{A,B}(a,b)\,\log\!\left(\frac{p_{A,B}(a,b)}{p_A(a)\,p_B(b)}\right) \tag{1}$$

Mutual information takes its maximum value when two images or two discrete random variables are absolutely dependent (i.e. one of them is a function of the other one). In this case the joint probability matrix or joint histogram is diagonal. If the images are independent, then the mutual information will be zero.
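As a minimal illustration of (1), the sketch below computes mutual information from a small joint probability table in plain Python. The function name `mutual_information` and the two example tables are illustrative assumptions, not part of the original method; natural logarithms are used, so the values are in nats.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of a joint table, following eq. (1).

    joint[a][b] is proportional to p_{A,B}(a, b); it is normalised here.
    """
    total = sum(sum(row) for row in joint)
    p_ab = [[v / total for v in row] for row in joint]  # joint distribution
    p_a = [sum(row) for row in p_ab]                    # marginal of A
    p_b = [sum(col) for col in zip(*p_ab)]              # marginal of B
    mi = 0.0
    for a, row in enumerate(p_ab):
        for b, p in enumerate(row):
            if p > 0.0:  # terms with p = 0 contribute nothing
                mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi


# Hypothetical 2x2 examples of the two limiting cases.
diagonal = [[0.5, 0.0], [0.0, 0.5]]          # B is a function of A
independent = [[0.25, 0.25], [0.25, 0.25]]   # p(a, b) = p(a) p(b)

print(mutual_information(diagonal))     # approximately 0.693 (log 2)
print(mutual_information(independent))  # 0.0
```

The diagonal table yields log 2, the entropy of either variable, and the independent table yields zero, which matches the two limiting cases described above.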

III. NP WINDOWS AND PROPOSED ALGORITHM

Several methods, such as Parzen windowing and histogram calculation, estimate the JPDF of two random variables from samples of each variable. Most of these methods are sensitive to the number of samples or to the number of bins in the histogram, so the PDFs they estimate are not very reliable or precise. NP windows address this problem.

A. Estimation of 1D signal PDF by proposed method

Assume that the signal $f$ takes the values $f_1, f_2, \ldots, f_n$ at the moments $t_1, t_2, \ldots, t_n$, respectively. For two consecutive moments $t_i$ and $t_{i+1}$, the following linear interpolation is used to estimate the PDF of $f$ between the two samples $f_i$ and $f_{i+1}$:

$$f(x) = ax + b, \qquad 0 < x < 1 \tag{2}$$

where $x$ has a uniform distribution on $[0, 1]$, $a = f_{i+1} - f_i$ and $b = f_i$. The PDF of the signal $f$ then follows from the change-of-variables rule of probability theory:

$$p_F(f) = \left|\frac{\partial f}{\partial x}\right|^{-1} p_x\big(x(f)\big) \tag{3}$$

where $x(f) = \frac{f - b}{a}$ and $\frac{\partial f}{\partial x} = a$. The PDF of $f$ between two samples is therefore given by:

$$p_F(f) = \begin{cases} \dfrac{1}{|a|}, & f_i \le f \le f_{i+1} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$

In the final step, the PDFs of all pairs of adjacent samples are added together and normalized by the number of neighbors. The workflow is shown in Fig. 1. Note that in the NP windows method the PDF is assumed to be zero outside the interval between the two samples [8]. Intuitively, if the PDF is nonzero in the vicinity of these samples, even outside their interval, the estimated PDF should be more reliable and precise. Another notable point is that the NP method uses a uniform distribution for $x$. We propose to use the piecewise-linear (triangle) distribution defined by (5) for $x$ and to extend the interval to

$$\left[\, f_i - \frac{f_{i+1} - f_i}{2},\; f_{i+1} + \frac{f_{i+1} - f_i}{2} \,\right]$$

(the interval is doubled), i.e.,

$$p_X(x) = \begin{cases} 4x, & 0 < x < 0.5 \\ 4(1 - x), & 0.5 < x < 1 \\ 0, & \text{otherwise} \end{cases} \tag{5}$$

Therefore, if the linear interpolation (2) is used, then $a = 2(f_{i+1} - f_i)$ and $b = \frac{3f_i - f_{i+1}}{2}$. Using (3), the probability distribution of $f$ between two samples is obtained (for $a > 0$) as

$$p_F(f) = \frac{4}{a^2} \begin{cases} f - b, & b < f < b + a/2 \\ a + b - f, & b + a/2 < f < a + b \\ 0, & \text{otherwise} \end{cases} \tag{6}$$

Finally, as in the NP windows method, the PDFs derived for all pairs of adjacent samples are added together and normalized by the number of samples to obtain the estimated PDF.
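The 1D procedure can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the names `estimate_pdf`, `uniform_cdf` and `triangle_cdf` are hypothetical, the density of each segment is accumulated into histogram bins through its cumulative distribution, the result is divided by the number of adjacent pairs, and a flat segment ($f_i = f_{i+1}$) is treated as a point mass.

```python
def uniform_cdf(u):
    """CDF of x ~ Uniform(0, 1), the original NP choice."""
    return min(max(u, 0.0), 1.0)


def triangle_cdf(u):
    """CDF of the triangle density (5): 4x on (0, .5), 4(1 - x) on (.5, 1)."""
    u = min(max(u, 0.0), 1.0)
    return 2.0 * u * u if u <= 0.5 else 1.0 - 2.0 * (1.0 - u) ** 2


def estimate_pdf(samples, edges, modified=False):
    """Probability mass per bin from linearly interpolated sample pairs.

    modified=False: NP windows, support [f_i, f_{i+1}], uniform x.
    modified=True:  MNP, support widened by half its length on each side
                    (interval doubled), triangle-distributed x.
    """
    cdf = triangle_cdf if modified else uniform_cdf
    mass = [0.0] * (len(edges) - 1)
    for f0, f1 in zip(samples, samples[1:]):
        lo, hi = min(f0, f1), max(f0, f1)
        if modified:
            half = (hi - lo) / 2.0
            lo, hi = lo - half, hi + half
        for k in range(len(mass)):
            if hi == lo:
                # Flat segment: point mass in the half-open bin holding f0.
                if edges[k] <= lo < edges[k + 1]:
                    mass[k] += 1.0
            else:
                u0 = (edges[k] - lo) / (hi - lo)
                u1 = (edges[k + 1] - lo) / (hi - lo)
                mass[k] += cdf(u1) - cdf(u0)
    n_pairs = len(samples) - 1
    return [m / n_pairs for m in mass]


edges = [0, 1, 2, 3, 4, 5, 6]
print(estimate_pdf([2.0, 4.0], edges))                 # NP: mass only in [2, 4)
print(estimate_pdf([2.0, 4.0], edges, modified=True))  # MNP: mass spread over [1, 5)
```

For the single pair (2, 4), the NP estimate places mass 0.5 in each of the bins [2, 3) and [3, 4), whereas the MNP estimate spreads masses 0.125, 0.375, 0.375 and 0.125 over the four bins between 1 and 5, which is the doubled interval.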

Figure 1. Steps for estimating the probability distribution function. First row: estimation of $p_F(f)$ by NP windows. Second row: estimation of $p_F(f)$ by MNP.

B. Estimation of marginal and joint probability distribution for 2D signals

Extending our idea to 2D signals, we use bilinear interpolation to estimate $f(x_1, x_2)$. In bilinear interpolation the following relation is used to estimate $f$ between four samples $f_1$, $f_2$, $f_3$ and $f_4$:

$$f = a x_1 x_2 + b x_1 + c x_2 + d \tag{7}$$

$$\begin{cases} a = f_1 - f_2 - f_3 + f_4 \\ b = f_2 - f_1 \\ c = f_3 - f_1 \\ d = f_1 \end{cases} \tag{8}$$

In the original NP method, $x_1$ and $x_2$ are assumed to have a uniform joint probability distribution over $[0, 1] \times [0, 1]$. Since $f$ is a function of two variables, a dummy variable $g = x_1$ is introduced to determine the JPDF. The JPDF of $f$ and $g$ is then obtained by

$$p_{F,G}(f,g) = \left|\det J\right| \, p_{x_1,x_2}\big(x_1(f,g),\, x_2(f,g)\big) \tag{9}$$

where $|\det J|$ denotes the absolute value of the determinant of the Jacobian $J = \partial(x_1, x_2)/\partial(f, g)$ of $x_1$ and $x_2$ with respect to $f$ and $g$. The marginal PDF $p(f)$ can be obtained by integrating out $g$. With bilinear interpolation this requires computing complicated mathematical expressions, which is not practical to implement [8]. Half-bilinear interpolation is therefore suggested, which uses three samples instead of four to interpolate $f(x_1, x_2)$ over a triangle. In other words, the rectangle formed by four samples is divided into two triangles. The following relations are then used to interpolate $f(x_1, x_2)$:

$$f = a x_1 + b x_2 + c \tag{10}$$

$$\begin{cases} a = f_2 - f_1 \\ b = f_3 - f_1 \\ c = f_1 \end{cases} \tag{11}$$

where $f_1$, $f_2$ and $f_3$ are the intensities of adjacent pixels. If the intensity of the pixel at $(i, j)$ is $f_1$, then the intensities at $(i, j+1)$ and $(i+1, j)$ are called $f_2$ and $f_3$, respectively. Fig. 2 illustrates the image interpolated by half-bilinear interpolation from four known samples, $f_1 = 20$, $f_2 = 40$, $f_3 = 60$ and $f_4 = 80$.

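Figure 2. (a) Dividing the region between four pixels in an image into two triangles. (b) Image interpolated by half-bilinear interpolation.

For a pair of 2D images $f$ and $g$, half-bilinear interpolation uses

$$f = a_1 x_1 + b_1 x_2 + c_1 \tag{12}$$

$$g = a_2 x_1 + b_2 x_2 + c_2 \tag{13}$$

Solving (12) and (13) for $x_1$ and $x_2$, we have

$$x_1 = \frac{b_2 f - b_1 g + b_1 c_2 - b_2 c_1}{a_1 b_2 - a_2 b_1} \tag{14}$$

$$x_2 = \frac{a_1 g - a_2 f + a_2 c_1 - a_1 c_2}{a_1 b_2 - a_2 b_1} \tag{15}$$

and

$$|\det J| = \left| \det \begin{pmatrix} \dfrac{\partial x_1}{\partial f} & \dfrac{\partial x_1}{\partial g} \\[2mm] \dfrac{\partial x_2}{\partial f} & \dfrac{\partial x_2}{\partial g} \end{pmatrix} \right| = \frac{1}{|a_1 b_2 - a_2 b_1|} \tag{16}$$

Using (9) and noting that $p_X(x_1, x_2) = 1$ for $0 < x_1 \le 1$ and $0 < x_2 \le 1$, we have

$$p_{F,G}(f,g) = \frac{1}{|a_1 b_2 - a_2 b_1|} \tag{17}$$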

The three points $(f_1, g_1)$, $(f_2, g_2)$ and $(f_3, g_3)$ are the vertices of a triangle in the 2D space of the joint histogram, and the area of this triangle is $0.5\,|a_1 b_2 - a_2 b_1|$. The underlying rectangle consists of two triangles. The intensities at the vertices of these triangles in images $f$ and $g$ form two triangles in the joint histogram space that share two vertices. In the joint histogram space, every point inside a triangle receives the same value $1/|a_1 b_2 - a_2 b_1|$, which is inversely proportional to the area of the triangle. Thus each triple of adjacent pixels in the two images adds a total probability of 0.5 to the joint histogram. Points inside large triangles (where adjacent pixels differ strongly in intensity, as at edges) therefore receive a low probability density, and points inside small triangles (where adjacent pixels have almost the same intensities) receive a higher one. Finally, the JPDF of $f$ and $g$ is obtained by normalization. Extending the proposed idea to 2D signals and noting that the length of the intervals should be doubled, the coefficients $a$, $b$ and $c$ are given by

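$$\begin{pmatrix} 0.25 & 0.25 & 1 \\ 0.75 & 0.25 & 1 \\ 0.25 & 0.75 & 1 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix} \tag{18}$$

and

$$\begin{pmatrix} a \\ b \\ c \end{pmatrix} = \begin{pmatrix} -2 & 2 & 0 \\ -2 & 0 & 2 \\ 2 & -0.5 & -0.5 \end{pmatrix} \begin{pmatrix} f_1 \\ f_2 \\ f_3 \end{pmatrix} \tag{19}$$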
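A quick numerical check of (18) and (19) may help here. The sketch below reads (18) as placing the three samples at $(x_1, x_2) = (0.25, 0.25)$, $(0.75, 0.25)$ and $(0.25, 0.75)$, which is the 2D counterpart of the doubled 1D interval; the function names `mnp_coefficients` and `plane` are hypothetical, and the sample values reuse $f_1 = 20$, $f_2 = 40$, $f_3 = 60$ from the Fig. 2 example.

```python
def mnp_coefficients(f1, f2, f3):
    """Coefficients of f = a*x1 + b*x2 + c obtained from (19)."""
    a = 2.0 * (f2 - f1)
    b = 2.0 * (f3 - f1)
    c = 2.0 * f1 - 0.5 * f2 - 0.5 * f3
    return a, b, c


def plane(coeffs, x1, x2):
    """Evaluate the half-bilinear plane (10) at (x1, x2)."""
    a, b, c = coeffs
    return a * x1 + b * x2 + c


coeffs = mnp_coefficients(20.0, 40.0, 60.0)
print(coeffs)                     # (40.0, 80.0, -10.0)
print(plane(coeffs, 0.25, 0.25))  # 20.0, recovers f1 as required by (18)
print(plane(coeffs, 0.75, 0.25))  # 40.0, recovers f2
print(plane(coeffs, 0.25, 0.75))  # 60.0, recovers f3
```

The plane passes through the three samples at the interior positions, so the unit domain of $(x_1, x_2)$ extends beyond the samples on every side, as intended by the doubled support.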

Extending the PDF in (5) to a joint probability distribution yields the JPDF of $x_1$ and $x_2$:

⎪⎪⎪

⎩

⎪⎪⎪

⎨

⎧

−<<<<−<<<<

<<<<<<<<

×=

otherwisexxxx

xxxxxxxx

xxxx

xxpX

015.,5.0

10,15.5.,5.0

0,5.0

3),(

2122

2121

1222

2121

21 (20)

Using (9), (16) and (20), the joint probability distribution of $f$ and $g$ is given by

$$p_{F,G}(f,g) = \frac{p_X(x_1, x_2)}{|a_1 b_2 - a_2 b_1|} \tag{21}$$

To avoid complicated computations, (14) and (15) are not substituted into $p_X(x_1, x_2)$ symbolically. Instead, we propose to calculate $x_1$ and $x_2$ for any $(f_0, g_0)$ and substitute them in (21) to determine the joint probability of $(f_0, g_0)$.
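The point-wise evaluation can be sketched as follows. This is a hedged illustration, not the authors' code: `joint_density` and `uniform_square` are hypothetical names, $x_1$ and $x_2$ are obtained by solving the linear system (12) and (13) with Cramer's rule, and the density of $(x_1, x_2)$ is passed in as a function. The uniform density shown is the original NP choice; the MNP density (20) would be supplied in its place.

```python
def joint_density(f0, g0, cf, cg, p_x):
    """Evaluate (21) at the point (f0, g0) of the joint histogram space.

    cf = (a1, b1, c1) and cg = (a2, b2, c2) are the plane coefficients of
    (12) and (13); p_x(x1, x2) is the assumed density of (x1, x2).
    """
    a1, b1, c1 = cf
    a2, b2, c2 = cg
    det = a1 * b2 - a2 * b1
    if det == 0.0:
        raise ValueError("degenerate triangle: (f, g) does not determine (x1, x2)")
    # Solve f0 = a1*x1 + b1*x2 + c1 and g0 = a2*x1 + b2*x2 + c2.
    x1 = (b2 * (f0 - c1) - b1 * (g0 - c2)) / det
    x2 = (a1 * (g0 - c2) - a2 * (f0 - c1)) / det
    return p_x(x1, x2) / abs(det)


def uniform_square(x1, x2):
    """Original NP choice: density 1 on the unit square."""
    return 1.0 if 0.0 < x1 <= 1.0 and 0.0 < x2 <= 1.0 else 0.0


# Hypothetical planes: f = 20*x1 + 10 and g = 40*x2 + 5.
cf, cg = (20.0, 0.0, 10.0), (0.0, 40.0, 5.0)
print(joint_density(20.0, 25.0, cf, cg, uniform_square))  # 0.00125 = 1/800
print(joint_density(40.0, 25.0, cf, cg, uniform_square))  # 0.0, x1 = 1.5 is outside
```

Passing the density as a function keeps the inversion step independent of the choice between the uniform and the triangle-shaped distribution.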

Therefore, as in the NP method, any three adjacent pixels in image $f$ and the pixels at the corresponding coordinates in image $g$ form a triangle in the joint histogram space. Points that lie outside this triangle and too far from it take zero probability, while the other points take the value $p_{F,G}(f,g)$ defined by (21).

An example of the triangle constructed in the joint histogram from three pixels in image 1 and the corresponding pixels in image 2 is shown in Fig. 3. In the NP method, as shown in Fig. 3(a), points inside the triangle take a constant value (inversely proportional to the triangle area). With the MNP method, the support of the JPDF is larger and contains more points, and these points do not take a constant value, as shown in Fig. 3(b).

IV. EXPERIMENTAL RESULTS

In [8] the non-parametric windows method was compared with other template matching methods, such as Parzen windowing, PVE interpolation and several other methods that use mutual information as the similarity measure. It was shown that the JPDF of two images derived by NP windows is more exact and smooth.

Figure 3. Joint histogram estimated from three pixels in image 1 and the corresponding pixels in image 2. (a) Joint histogram by the NP method. (b) Joint histogram by the MNP method.

Moreover, the performance of this method is more robust to the number of samples and bins than that of other methods. In this paper, the performance of NP is compared with that of the proposed algorithm (MNP). To investigate the performance of MNP, we used six images with a resolution of 150×200, shown in Fig. 4. Three templates with different window sizes were selected from each image, and the original images were blurred with a 3×3 window. Mutual information was maximized with respect to translation in two directions. The optimization was performed by Powell's method [11]. To evaluate the performance of the two methods, the error is calculated as:

$$\text{Error} = \sqrt{(x_T - x_r)^2 + (y_T - y_r)^2} \tag{22}$$

Figure 4. Used data set for simulation.

where $(x_T, y_T)$ is the coordinate of the center of the selected template and $(x_r, y_r)$ is the coordinate of the center of the sub-image found by each method. In other words, the error is the Euclidean distance between the center of the template and that of the extracted sub-image. We selected three templates with window sizes of 9×9, 7×7 and 5×5. The results of this simulation are listed in Table I. MNP gives a lower error than NP for all six images at window size 5×5 and for four of the six images at 7×7, while both methods give similarly small errors at 9×9. The variation of the error in each case is due to local optima on the surface of the objective function (mutual information). Fewer optima near the ground truth therefore result in a smaller error, and convergence to a wrong position is more likely when the number of maxima on the surface of the objective function is large. For further investigation, Fig. 5 shows the mutual information function in terms of translation in the x direction. For both methods the mutual information function varies smoothly, with no local optima near the ground truth, when the window size is 9×9 (see Fig. 5(a)). When the window size is smaller (5×5), the surface produced by the MNP method is smoother and has only a few local optima (see Fig. 5(b)). This confirms the result of the previous simulation.
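The evaluation loop can be illustrated with a small self-contained sketch. It deliberately swaps in simpler components than those used in the experiments: a plain joint histogram replaces the NP or MNP estimate, an exhaustive search over integer translations replaces Powell's method, and positions are top-left corners rather than centers (the distance in (22) is the same because both positions shift by the same offset). All function names, the bin width and the toy image are assumptions made for illustration.

```python
import math
from collections import Counter


def patch_mi(patch_a, patch_b, bin_width=32):
    """Mutual information (nats) of two equal-size integer patches."""
    pairs = [(a // bin_width, b // bin_width)
             for row_a, row_b in zip(patch_a, patch_b)
             for a, b in zip(row_a, row_b)]
    n = len(pairs)
    joint = Counter(pairs)                 # joint histogram counts
    marg_a = Counter(a for a, _ in pairs)  # marginal counts of patch_a
    marg_b = Counter(b for _, b in pairs)  # marginal counts of patch_b
    return sum((k / n) * math.log(k * n / (marg_a[a] * marg_b[b]))
               for (a, b), k in joint.items())


def match_template(image, template, bin_width=32):
    """Exhaustive search for the (row, col) translation maximising MI."""
    h, w = len(template), len(template[0])
    best_mi, best_pos = -1.0, None
    for r in range(len(image) - h + 1):
        for c in range(len(image[0]) - w + 1):
            window = [row[c:c + w] for row in image[r:r + h]]
            mi = patch_mi(template, window, bin_width)
            if mi > best_mi:
                best_mi, best_pos = mi, (r, c)
    return best_pos


def location_error(found, truth):
    """Euclidean distance between found and true positions, as in (22)."""
    return math.hypot(found[0] - truth[0], found[1] - truth[1])


# Toy 5x5 image: zero background with a 2x2 block at row 1, column 2.
image = [[0] * 5 for _ in range(5)]
image[1][2:4] = [40, 80]
image[2][2:4] = [120, 160]
template = [[40, 80], [120, 160]]

found = match_template(image, template)
print(found, location_error(found, (1, 2)))  # (1, 2) 0.0
```

On real images with small templates the histogram-based objective has many local maxima, which is the situation the smoother NP and MNP estimates are intended to improve.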

V. CONCLUSION

In this paper we used a triangle distribution (instead of the uniform distribution) to estimate the joint probability distribution of two images with the non-parametric windows method. Experimental results show that the performance of template matching is improved for small template sizes (e.g., 5×5).

TABLE I. RESULTING ERRORS FOR NP AND MNP METHODS WITH THREE TEMPLATES OF DIFFERENT SIZES

| Method | Window size | Err in image 1 | Err in image 2 | Err in image 3 | Err in image 4 | Err in image 5 | Err in image 6 |
|---|---|---|---|---|---|---|---|
| NP | 9×9 | .001 | .002 | .001 | .001 | .013 | .003 |
| NP | 7×7 | .14 | .19 | .2 | .09 | .24 | .08 |
| NP | 5×5 | .24 | .27 | .43 | .36 | .357 | .29 |
| MNP | 9×9 | .012 | .003 | .001 | .000 | .003 | .002 |
| MNP | 7×7 | .11 | .09 | .21 | .07 | .12 | .23 |
| MNP | 5×5 | .13 | .1 | .17 | .21 | .14 | .236 |

Figure 5. Mutual information in terms of translation in the x direction. (a) Window size 9×9. (b) Window size 5×5.

REFERENCES

[1] R. Brunelli, Template Matching Techniques in Computer Vision: Theory and Practice. Wiley, 2009.

[2] Ramtin S., Parastoo S., Rodney A. Kennedy, and Richard I. H., "A survey of medical image registration on multicore and the GPU," IEEE Signal Processing

[3] F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximization of mutual information," IEEE Trans. on Medical Imaging, vol. 16, pp. 187–198, April 1997.

[4] N. Dowson and R. Bowden, "Simultaneous modelling and tracking (SMAT) of feature sets," pp. 99–105, San Diego, CA, USA, June 2005.

[5] P. Thevenaz and M. Unser, "Optimization of mutual information for multi-resolution image registration," IEEE Trans. on Image Processing, vol. 9, no. 12, pp. 2083–2099, December 2000.

[6] H. Chen and P. Varshney, "Mutual information-based CT-MR brain image registration using generalised partial volume joint histogram estimation," IEEE Trans. on Medical Imaging, vol. 22, no. 9, pp. 1111–1119, September 2003.

[7] T. Kadir and M. Brady, "Estimating statistics in arbitrary regions of interest," in Proc. British Machine Vision Conf., vol. 2, pp. 589–598, Oxford, September 2005.

[8] N. Dowson, T. Kadir, and R. Bowden, "Estimating the joint statistics of images using nonparametric windows with application to registration using mutual information," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 30, no. 10, October 2008.

[9] P. Viola and W. M. Wells III, "Alignment by maximization of mutual information," in International Conference on Computer Vision (E. Grimson, S. Shafer, A. Blake, and K. Sugihara, Eds.), pp. 16–23, IEEE Computer Society Press, Los Alamitos, CA, 1995.

[10] A. Papoulis and S. U. Pillai, Probability, Random Variables, and Stochastic Processes, third ed., pp. 124–148. McGraw-Hill, 1991.

[11] W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in C, second ed. Cambridge Univ. Press, 1992.
