Biologically Inspired Particle Filter for Robust Visual Tracking1

Mingqing Zhu, Chenbin Zhang, Zonghai Chen
Department of Automation, University of Science and Technology of China, Hefei, 230027, P.R. China
Correspondent: Zonghai Chen, E-mail: [email protected]

Abstract

Although the particle filter and its variants such as KPF and UPF have achieved great success in many visual tracking applications, they depend on local proposal distributions and hence often fail in cases of global object shift and large-scale object movement. To address this issue, we introduce the concept of a global proposal distribution for the particle filter, drawing inspiration from examples in biological vision, and term the combination a biologically inspired particle filter (BI-PF). Three types of global proposal distribution are proposed based on biologically inspired saliency maps: the motion, color, and contrast saliency maps. So that the BI-PF mimics the way the human eye tracks an object, the local and global proposal distributions are dynamically integrated through a threshold that determines when to switch between them. Experimental results demonstrate that our BI-PFs outperform the particle filter, KPF, and UPF.

Keywords: Visual Tracking, Particle Filter, Proposal Distribution, Biological Vision, Saliency Map

1. Introduction

Visual object tracking has drawn significant attention in the field of computer vision due to its enormous value in applications including visual surveillance [1], human-computer interaction [2], intelligent transportation [3], and robot navigation [4], among others. Solving the visual object tracking problem comes down to answering two questions: what does the object look like (feature level) and where is the object (tracking level). On the one hand, much work has been carried out on building reliable object feature models, such as multi-cue integration [5,6], multi-model approaches [7,8], adaptive feature models [9,10], 3-D models [11,12], and feature imagination [13]. On the other hand, a wealth of research has been conducted on tracking algorithms for robustly following objects, such as CamShift [14,15], MeanShift [16,17,30], the Kalman filter [18,19], the particle filter [20,21], and their variants. Among these algorithms, the particle filter, also known as a sequential Monte Carlo approach, exhibits outstanding capability in handling the non-linear and/or non-Gaussian nature of visual tracking.

The basic idea of the particle filter (PF) is to approximate the posterior density through a set of weighted samples (called particles). Since the particles are drawn from a proposal distribution, the quality of the proposal distribution has a significant impact on the performance of the PF. The conventional particle filter uses the state transition prior as its proposal distribution, which leads to poor performance whenever the state transition distribution lies in the tail of the posterior density, because the weights of most particles are then low. The Kalman particle filter (KPF) [22,23] and the unscented particle filter (UPF) [23-25,31] have been developed to exploit suboptimal proposal distributions that improve the particle filter's performance. They apply a Kalman filter (in KPF) or an unscented Kalman filter (in UPF), taking the latest observations into account, to construct a suboptimal proposal distribution approximating the posterior density. However, both the state transition prior and these suboptimal proposal distributions are local proposal distributions clustered in a limited area, which makes them suitable only when they have some overlap with the posterior density. Consequently, these algorithms fail when the object shifts its position globally, for example upon reappearance after full occlusion or after large-scale movement.

Beyond computer vision systems, biological vision offers many tracking examples. For instance, a frog responds sensitively to a globally varying area of intensity in its field of view (FOV) and hunts insects toward that area [26]. For another, when a person is staring at an

1 This work was partially supported by the National Natural Science Foundation of China under Grant 61075073.


International Journal of Information Processing and Management(IJIPM) Volume3, Number1. January 2012 doi:10.4156/ijipm.vol3.issue1.6

object, if the object is lost from his eye's focus area, he rapidly searches for it in the globally salient areas of the FOV that have a color similar to the object's. Moreover, without prior knowledge of the object, one's visual attention is directed to different locations in a saliency map of the whole FOV in order of decreasing saliency [27,28]. A bottom-up, saliency-driven visual attention model was developed by Itti et al. [27,29] by exploiting the contrasts in color, intensity, and orientation of images, which are combined into a single topographical visual saliency map.

From the above biological vision examples, we draw the inspiration to design a tracking method in the particle filter (PF) framework that imitates the way the human eye tracks an object, handling not only local tracking tasks but also global ones. The insight is that when local tracking cannot satisfy the tracking requirement, a prompt global search should be performed. To this end, complementing the existing local proposal distribution of the PF, we propose the concept of a global proposal distribution. In all, three types of biologically inspired saliency maps are designed as global proposal distributions, each of which is integrated into a PF to form a biologically inspired particle filter (BI-PF). The global proposal distribution is dynamically coupled with the local one via a threshold, called the minimum acceptance degree, which controls the switch between them. Experimental results on global object shift and large-scale object movement, comparing the BI-PF with PF, KPF, and UPF, are discussed. In our opinion, the idea of the BI-PF opens up a wide range of biological vision techniques that can be adapted for computational visual tracking.

The rest of the paper is organized as follows. Section 2 briefly reviews the particle filter. In Section 3, we propose three kinds of global proposal distribution based on biologically inspired saliency maps. The biologically inspired particle filter algorithm is described in Section 4. Examples are presented in Section 5 to illustrate how combining the global proposal distribution with the local one facilitates adaptation to global object tracking and large-scale object movement. Section 6 draws the conclusion.

2. Particle Filter

As a sequential Monte Carlo approach, the particle filter deals with the recursive Bayesian filtering problem by incorporating Monte Carlo sampling techniques into Bayesian inference. It provides an effective solution to the non-linear and non-Gaussian aspects of visual tracking. Its state transition and observation equations are given by

\[ \mathbf{x}_k = f(\mathbf{x}_{k-1}, \mathbf{u}_{k-1}) \tag{1} \]

\[ \mathbf{z}_k = h(\mathbf{x}_k, \mathbf{v}_k) \tag{2} \]

where $\mathbf{x}_k$ is the state vector, $\mathbf{z}_k$ is the observation vector, $\mathbf{u}_{k-1}$ is an i.i.d. process noise vector, and $\mathbf{v}_k$ is an i.i.d. observation noise vector. For visual tracking, Eqs. (1) and (2) are non-linear. The purpose is to estimate the posterior density $p(\mathbf{x}_k \mid \mathbf{z}_{1:k})$, obtained recursively through the prediction step

\[ p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1}) = \int p(\mathbf{x}_k \mid \mathbf{x}_{k-1})\, p(\mathbf{x}_{k-1} \mid \mathbf{z}_{1:k-1})\, \mathrm{d}\mathbf{x}_{k-1} \tag{3} \]

and the update step

\[ p(\mathbf{x}_k \mid \mathbf{z}_{1:k}) = \frac{p(\mathbf{z}_k \mid \mathbf{x}_k)\, p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1})}{\int p(\mathbf{z}_k \mid \mathbf{x}_k)\, p(\mathbf{x}_k \mid \mathbf{z}_{1:k-1})\, \mathrm{d}\mathbf{x}_k} \tag{4} \]

The key idea of the particle filter is to approximate the posterior density through a set of randomly sampled particles with associated weights $\{\mathbf{x}_k^{(i)}, w_k^{(i)}\}_{i=1,\dots,N}$. In practice, the particles are drawn from a proposal distribution $\mathbf{x}_k^{(i)} \sim \pi(\mathbf{x}_k \mid \mathbf{x}_{k-1}^{(i)}, \mathbf{z}_k)$ that is easy to sample from. The weight of each particle is then computed as

\[ \hat{w}_k^{(i)} \propto w_{k-1}^{(i)}\, \frac{p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)})\, p(\mathbf{x}_k^{(i)} \mid \mathbf{x}_{k-1}^{(i)})}{\pi(\mathbf{x}_k^{(i)} \mid \mathbf{x}_{k-1}^{(i)}, \mathbf{z}_k)} \tag{5} \]

Eq. (5) is the particle weight update equation, whereby the normalized particle weight is $w_k^{(i)} = \hat{w}_k^{(i)} / \sum_{j=1}^{N} \hat{w}_k^{(j)}$. The probabilities $p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)})$ and $p(\mathbf{x}_k^{(i)} \mid \mathbf{x}_{k-1}^{(i)})$ denote the observation likelihood and the state transition prior, respectively. Subsequently, the estimate of the object state is determined by $\hat{\mathbf{x}}_k = \sum_{i=1}^{N} w_k^{(i)}\, \mathbf{x}_k^{(i)}$. At the end of each iteration, to overcome the common problem of sample degeneracy, a resampling step draws $N$ new samples with replacement, selecting a particular sample with probability proportional to its normalized weight $w_k^{(i)}$.
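As a concrete illustration, the predict-weight-estimate-resample cycle above can be sketched in a few lines of NumPy. This is a generic bootstrap variant, in which the state transition prior serves as the proposal so that Eq. (5) reduces to multiplying the previous weight by the likelihood; the function and argument names are ours, not from the paper.

```python
import numpy as np

def particle_filter_step(particles, weights, transition, likelihood, z, rng):
    """One iteration of a bootstrap particle filter (sketch).

    particles : (N, d) array of state samples x_k^(i)
    weights   : (N,) normalized weights w_{k-1}^(i)
    transition: callable drawing x_k ~ p(x_k | x_{k-1}) for all particles
    likelihood: callable returning p(z_k | x_k) for all particles
    """
    # Propagate through the state-transition prior (the local proposal).
    particles = transition(particles, rng)
    # Weight update: with the prior as proposal, Eq. (5) reduces to
    # w_k ∝ w_{k-1} * p(z_k | x_k).
    w = weights * likelihood(particles, z)
    w = w / w.sum()
    # State estimate: weighted mean of the particles.
    estimate = w @ particles
    # Resample with replacement to fight sample degeneracy; weights reset to 1/N.
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], np.full(len(particles), 1.0 / len(particles)), estimate
```

Running this repeatedly on a stream of observations keeps the particle cloud concentrated around the posterior mode, provided the proposal overlaps the posterior, which is exactly the limitation discussed above.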

3. Global Proposal Distribution

The proposal distribution plays a crucial role in the particle filter (PF). Since the particles used to estimate the object state are drawn from the proposal distribution, its quality has a significant impact on the performance of the PF. When an object moves smoothly and does not shift its position globally within the image, the state transition prior adopted by the PF and the suboptimal proposal distributions adopted by its variants such as KPF and UPF can meet the tracking requirement. However, both the state transition prior and the suboptimal proposal distributions, regarded here as local proposal distributions, cluster only in limited local areas, usually resulting in tracking failures in cases of global object shift and large-scale object movement. To cope with this problem, we turned to biological vision tracking examples for inspiration and introduce the concept of a global proposal distribution for the PF in visual tracking.

We define a global proposal distribution as a distribution spread over the whole FOV that concentrates probability on the areas where the object is more likely to be found. Thus, when local tracking is not competent, global tracking is implemented by searching for the object within the global proposal distribution. In the remainder of this section, three kinds of global proposal distribution are proposed using saliency maps, each inspired by a biological vision mechanism.

3.1. Motion Saliency Map

Motion in the field of view (FOV) induces prominent attention in biological visual systems. For example, a frog pays no attention to the details of the stationary parts of the world, and may even starve to death while surrounded by motionless food. However, it is sensitive to a globally varying area of intensity in its FOV, which stimulates its visual nerve and directs its hunting toward that area [26]. Motivated by this attribute of the frog's eye, we treat the motion saliency map as a global proposal distribution. For simplicity, the motion saliency map relies only on the frame difference, which reflects the variation of the FOV and costs little processing time. Let $\mathbf{F}_k$ be the input color image at frame $k$ and $\mathbf{I}_k$ its intensity image of size $m \times n$ pixels. We compute the motion saliency map $\mathbf{MSM}_k$ as

\[ \mathbf{MSM}_k = |\mathbf{I}_k - \mathbf{I}_{k-1}| \tag{6} \]

In practice, we further binarize $\mathbf{MSM}_k$ with an adaptive threshold $T$ to obtain a binary image serving as the exact global sampling region, where $T$ is computed adaptively from a roughly specified ratio $r$ of the object area to that of the whole image. Let $P$ be the integer value of $100r$. At every time step, we store $\mathbf{MSM}_k$ in a vector $V$ and sort its elements in descending order to get the vector $V'$. The value of the $P$-th element of $V'$ is then taken as the threshold $T$. A binary image $\mathbf{biMSM}_k$ is created by replacing all values in $\mathbf{MSM}_k$ above $T$ with 1 and the others with 0. In the global tracking process, particles are randomly sampled in the area of $\mathbf{biMSM}_k$ whose values equal 1.
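A minimal NumPy sketch of this step (names are ours; we realize the adaptive threshold by keeping roughly the most strongly changing fraction $r$ of pixels, which matches the percentile rule described above):

```python
import numpy as np

def binary_motion_saliency(intensity_k, intensity_prev, r=0.02):
    """Eq. (6) plus adaptive binarization: the returned map marks the
    most strongly changing fraction r of pixels as the global sampling
    region biMSM_k."""
    msm = np.abs(intensity_k.astype(float) - intensity_prev.astype(float))  # Eq. (6)
    T = np.quantile(msm, 1.0 - r)       # adaptive threshold keeping ~r of pixels
    return (msm > T).astype(np.uint8)   # 1 = allowed global sampling area
```

Particles for the global search are then drawn uniformly from the coordinates where the returned map equals 1.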

3.2. Color Saliency Map

Prior knowledge of the object, such as its color feature, can steer human attention to seek the object in the FOV in a top-down, volition-controlled, task-driven manner. If an object disappears from a person's fovea, he can look for it globally in the salient areas of the FOV that have a color similar to the object's. Building on this property, we construct a global proposal distribution based on the object's color histogram feature. Specifically, we propose a discriminative color feature extraction method that enhances robustness against background disturbance by considering both object and background features.

Supposing the object color histogram $\mathbf{H}_{obj} = \{p(j)\}_{j=1,\dots,u}$ is computed over an inner rectangle of size $a \times b$ pixels, the background color histogram $\mathbf{H}_{bg} = \{q(j)\}_{j=1,\dots,u}$ is acquired over an outer margin of width $0.5 \times \max(a, b)$ pixels surrounding the object rectangle, where $u$ is the number of histogram bins. The log-likelihood histogram $\mathbf{L}$ of the object is expressed as

\[ \mathbf{L}(j) = \log \frac{\max\{p(j), \delta\}}{\max\{q(j), \delta\}} \tag{7} \]

where $\delta$ is a small constant that avoids dividing by zero or taking the log of zero. Each bin value $\mathbf{L}(j)$ represents the weight of the $j$-th bin in discriminating the object from the background. Therefore, we take $\mathbf{L}$ as a weight vector to calculate the discriminative color histogram $\mathbf{H}_{dis}^{obj}$ of the object:

\[ \mathbf{H}_{dis}^{obj}(j) = p(j) \times \mathbf{L}(j) \tag{8} \]

Subsequently, $\mathbf{H}_{dis}^{obj}$ at time $k$ is mapped onto the FOV image to yield a color saliency map $\mathbf{CSM}_k$ used as the global proposal distribution, followed by the same binarization step as for the motion saliency map (Section 3.1) to obtain a binary image in which to draw particles.
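Eqs. (7) and (8) amount to a few array operations; a sketch follows (histogram inputs are assumed already normalized, and the names are ours):

```python
import numpy as np

def discriminative_histogram(p_obj, q_bg, delta=1e-3):
    """Weight each object-histogram bin by its log-likelihood ratio
    against the background histogram (Eqs. (7)-(8))."""
    L = np.log(np.maximum(p_obj, delta) / np.maximum(q_bg, delta))  # Eq. (7)
    return p_obj * L                                                # Eq. (8)
```

Bins dominated by the object receive positive weight, bins shared equally with the background are zeroed out, and the resulting histogram is projected back over the frame to form $\mathbf{CSM}_k$.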

3.3. Contrast Saliency Map

One of the earliest works on human visual attention detection is based on the neurophysiology of early primate visual processing, as described by Itti et al. [27-29]. They proposed a bottom-up, saliency-driven visual attention model that integrates the contrasts in color, intensity, and orientation of images into a single topographical saliency map. Human visual attention focuses on the salient locations in the saliency map of the FOV. Inspired by this attention mechanism, we also construct a global proposal distribution using the low-level, contrast-based saliency map of Itti et al. [27], whose implementation is freely available.

The first step is to decompose the input image $\mathbf{F}_k$ into a set of distinct feature maps, including red/green and blue/yellow color contrasts, intensity contrast, and orientation contrast [27]. Center-surround operations are executed as differences between a fine and a coarse scale of a given feature within Gaussian pyramids. The center of the receptive field is given by level $c \in \{2, 3, 4\}$ in the pyramid and the surround by level $s = c + \delta$, where $\delta \in \{2, 3\}$. Let $F(c)$ denote a feature map at scale $c$ and $F(s)$ its surround feature map at scale $s$. The across-scale subtraction map is defined as

\[ F(c, s) = |F(c) \ominus F(s)| \tag{9} \]

where "$\ominus$" represents across-scale subtraction, obtained by interpolation to the finer scale followed by point-wise subtraction. After that, every across-scale subtraction map $F(c, s)$ passes through a map normalization step, $N(\cdot)$, which promotes maps containing strong peaks and suppresses maps containing homogeneous areas. Then all the normalized feature maps of the same type (intensity, color, or orientation) are summed into three conspicuity maps, $\bar{I}$ for intensity, $\bar{C}$ for color, and $\bar{O}$ for orientation, at the common scale $c = 4$:

\[ \bar{F} = \bigoplus_{c=2}^{4} \bigoplus_{s=c+2}^{c+3} N(F(c, s)) \tag{10} \]

where $\bar{F}$ denotes $\bar{I}$, $\bar{C}$, or $\bar{O}$, and $\bigoplus$ denotes across-scale addition. Finally, the contrast saliency map $\mathbf{ConSM}_k$ is obtained by normalizing the three conspicuity maps and summing them:

\[ \mathbf{ConSM}_k = \frac{1}{3}\left( N(\bar{I}) + N(\bar{C}) + N(\bar{O}) \right) \tag{11} \]
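A single-channel toy version of the pipeline in Eqs. (9)-(11) is sketched below, using plain subsampling pyramids and a crude global stand-in for the normalization operator $N(\cdot)$. This only illustrates the structure of the computation; the released implementation of Itti et al. uses Gaussian pyramids, local-maxima-based normalization, and Gabor orientation channels.

```python
import numpy as np

def normalize_map(m):
    # Crude stand-in for N(.): rescale to [0, 1], then weight by
    # (global max - mean)^2 so maps with one strong peak are promoted
    # and homogeneous maps are suppressed.
    m = (m - m.min()) / (np.ptp(m) + 1e-12)
    return m * (m.max() - m.mean()) ** 2

def center_surround_maps(feature, c=2, deltas=(2, 3)):
    """Across-scale subtraction maps F(c, s) of Eq. (9), with s = c + delta."""
    pyr = [np.asarray(feature, float)]
    for _ in range(c + max(deltas)):
        pyr.append(pyr[-1][::2, ::2])          # subsampling pyramid
    fine = pyr[c]
    maps = []
    for d in deltas:
        coarse = pyr[c + d]
        for _ in range(d):                     # interpolate back to scale c
            coarse = np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1)
        coarse = coarse[:fine.shape[0], :fine.shape[1]]
        maps.append(np.abs(fine - coarse))     # Eq. (9)
    return maps

def conspicuity(feature, c=2):
    """Eq. (10) for one channel: sum of normalized across-scale maps."""
    return sum(normalize_map(m) for m in center_surround_maps(feature, c))
```

With intensity, color, and orientation conspicuity maps computed this way, Eq. (11) is simply the mean of their normalized versions.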

Biologically Inspired Particle Filter for Robust Visual Tracking Mingqing Zhu, Chenbin Zhang, Zonghai Chen

Here, the contrast saliency map $\mathbf{ConSM}_k$ is employed as a global proposal distribution. Likewise, a binary image is extracted from this saliency map as the global sampling area, in the same way as described in Section 3.1.

4. Biologically Inspired Particle Filter

In the previous section, three types of global proposal distribution based on biologically inspired saliency maps were proposed. In this section, we combine each of them with the existing local proposal distribution of the PF to form a new tracking algorithm, called the biologically inspired particle filter (BI-PF). In our work, the state transition prior of the PF acts as the local proposal distribution.

The key intuition of the BI-PF rests on the observation that when you are looking at an object and it disappears from your eye's focus area, you search for it in the globally salient areas of the FOV until you recover it. This amounts to a tracking process that dynamically switches between local and global search over time. To simulate the feeling that the object has disappeared from the eye's focus area, a parameter termed the minimum acceptance degree (MAD) serves as a threshold for judging whether the object has been lost from the focus area. If the similarity between the eye's focus area and the object is larger than the MAD, the object is believed still to exist in the current local focus area; otherwise, the object is believed to have left, and the search turns to the global proposal distribution. That is, the MAD is the threshold governing the dynamic switch between local and global tracking.

In our work, the object state is given by

\[ \mathbf{x}_k = [x_k, y_k, \dot{x}_k, \dot{y}_k, H_k^x, H_k^y]^{\mathrm{T}} \tag{12} \]

where $(x_k, y_k)$ is the object position, $(\dot{x}_k, \dot{y}_k)$ the object velocity, and $H_k^x$ and $H_k^y$ the object sizes along the X-axis and Y-axis directions, respectively. We use the first-order autoregressive model $\mathbf{x}_k = \mathbf{A}\mathbf{x}_{k-1} + \mathbf{u}_{k-1}$ as the state transition equation, where $\mathbf{A}$ is the state transition matrix. The state of each particle specifies a unique rectangular region. A color histogram in HSV color space is applied as the feature model. The object feature model is denoted $\mathbf{H}_{obj} = \{p(j)\}_{j=1,\dots,u}$, and the feature model of each particle $\mathbf{x}_k^{(i)}$ is $\mathbf{H}_k^{(i)} = \{v(j)\}_{j=1,\dots,u}$. The Bhattacharyya coefficient, used to measure the similarity between the object feature model $\mathbf{H}_{obj}$ and each particle's feature model $\mathbf{H}_k^{(i)}$, is expressed as

\[ \rho[\mathbf{H}_k^{(i)}, \mathbf{H}_{obj}] = \sum_{j=1}^{u} \sqrt{v(j)\, p(j)} \tag{13} \]

and the Gaussian observation probability $p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)})$ is calculated as

\[ p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{1 - \rho[\mathbf{H}_k^{(i)}, \mathbf{H}_{obj}]}{2\sigma^2} \right\} \tag{14} \]

where $\sigma$ is the standard deviation of the Gaussian density.
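Given two normalized $u$-bin histograms, Eqs. (13) and (14) translate directly into code (a sketch; names are ours):

```python
import numpy as np

def observation_likelihood(h_particle, h_obj, sigma=0.2):
    """Bhattacharyya similarity (Eq. 13) fed into the Gaussian
    observation model (Eq. 14)."""
    rho = np.sum(np.sqrt(h_particle * h_obj))                        # Eq. (13)
    return np.exp(-(1.0 - rho) / (2.0 * sigma ** 2)) / (np.sqrt(2.0 * np.pi) * sigma)
```

A particle whose histogram matches the object model exactly gives $\rho = 1$ and hence the maximal likelihood $1/(\sqrt{2\pi}\,\sigma)$; any mismatch lowers $\rho$ and the likelihood decays exponentially.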

Algorithm 1. Biologically Inspired Particle Filter at time $k$.

1: if $objLost = 1$ then
2:   Construct a global proposal distribution: $\mathbf{MSM}_k$, $\mathbf{CSM}_k$, or $\mathbf{ConSM}_k$
3:   Draw $N$ particles $\mathbf{x}_k^{(i)}$ from the global proposal distribution
4:   Calculate the observation degrees of the particles: $\hat{w}_k^{(i)} = p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)})\, w_{k-1}^{(i)}$
5:   Make $N$ copies of the particle $\mathbf{x}_k^{(q)}$ with the maximum observation degree $\hat{w}_k^{(q)} = \max_{1 \le i \le N} \hat{w}_k^{(i)}$, use them as the sample set, and assign weight $1/N$ to each particle
6: end if
7: Draw $N$ new particles $\mathbf{x}_k^{(i)}$ according to the state transition equation
8: Calculate the observation degrees of the particles: $\hat{w}_k^{(i)} = p(\mathbf{z}_k \mid \mathbf{x}_k^{(i)})\, w_{k-1}^{(i)}$
9: if the maximum observation degree $\hat{w}_k^{(q)} = \max_{1 \le i \le N} \hat{w}_k^{(i)} > \mathrm{MAD}$ then
10:   Believe the object has been tracked and set $objLost$ to 0
11:   Calculate the particle weights $w_k^{(i)} = \hat{w}_k^{(i)} / \sum_{j=1}^{N} \hat{w}_k^{(j)}$
12:   Estimate the object state as the weighted average of the particles: $\hat{\mathbf{x}}_k = \sum_{i=1}^{N} w_k^{(i)}\, \mathbf{x}_k^{(i)}$
13:   Resample the particles $\mathbf{x}_k^{(i)}$ with probability $w_k^{(i)}$ and set $w_k^{(i)} = 1/N$
14: else
15:   Believe the object has been lost and set $objLost$ to 1
16: end if and go back to Step 1
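To make the control flow concrete, here is a self-contained toy realization of one Algorithm 1 iteration on an abstract state space. The callables `propagate`, `likelihood`, and `global_sampler` are placeholders standing in for the state transition equation, Eq. (14), and a saliency-map sampler; none of this is the authors' code.

```python
import numpy as np

def bipf_step(particles, weights, obj_lost, z, propagate, likelihood,
              global_sampler, mad, rng):
    """One iteration of Algorithm 1 (sketch)."""
    n = len(particles)
    if obj_lost:
        # Lines 2-5: sample in the global proposal distribution and keep
        # N copies of the best candidate with uniform weights.
        cand = global_sampler(n, rng)
        degree = likelihood(cand, z) * weights
        best = cand[np.argmax(degree)]
        particles = np.repeat(best[None, :], n, axis=0)
        weights = np.full(n, 1.0 / n)
    # Lines 7-8: local sampling via the state-transition equation.
    particles = propagate(particles, rng)
    degree = likelihood(particles, z) * weights
    if degree.max() > mad:
        # Lines 10-13: object tracked; normalize, estimate, resample.
        w = degree / degree.sum()
        estimate = w @ particles
        idx = rng.choice(n, size=n, p=w)
        return particles[idx], np.full(n, 1.0 / n), False, estimate
    # Line 15: object lost; switch to global search next frame.
    return particles, weights, True, None
```

On a lost frame the function returns no estimate and raises the `obj_lost` flag, so the next call performs the coarse global search (Lines 2-5) followed immediately by the fine local refinement (Lines 7-12), mirroring the coarse-to-fine behavior described below.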

The principle of the BI-PF is depicted in Algorithm 1. The unnormalized weight $\hat{w}_k^{(i)}$ of Eq. (5) is treated as an observation degree, meaning the similarity between the particle feature model $\mathbf{H}_k^{(i)}$ and the object feature model $\mathbf{H}_{obj}$. In the initialization stage, supposing the object has been tracked ($objLost$ is initialized to 0), we manually initialize its rectangular region to compute its feature model $\mathbf{H}_{obj}$ and sample $N$ particles around the initial object state, each with the same weight $1/N$. Let time $k = 1$; Algorithm 1 then starts at Line 7. In Line 9, a decision is made as to whether the object has been lost from the local area. If the current maximum observation degree $\hat{w}_k^{(q)} = \max_{1 \le i \le N} \hat{w}_k^{(i)}$ exceeds the MAD, attention stays around the current area (the local proposal distribution) to track the object (Lines 10 to 12); otherwise, the object is considered to have moved far from the focus (Line 15) and is searched for in the global proposal distribution of the FOV (Lines 2 to 5). The coarse object state derived from global tracking is located by the particle $\mathbf{x}_k^{(q)}$ with the maximum observation degree (Line 5), followed by local tracking around it to detect the fine state (Lines 7 to 12). Note that the global proposal distribution in Line 2 can be any of the three biologically inspired saliency maps proposed in Section 3.

5. Experimental Results

In order to illustrate the performance of the proposed biologically inspired particle filters (BI-PFs), we present two challenging tracking examples. According to the type of biological vision information used, the three BI-PFs are termed the motion saliency map based particle filter (MSM-PF), the color saliency map based particle filter (CSM-PF), and the contrast saliency map based particle filter (ConSM-PF), respectively. Experimental results of these BI-PFs on global object tracking and large-scale object movement, compared with the conventional particle filter (PF), the Kalman particle filter (KPF), and the unscented particle filter (UPF), are discussed below. The number of particles, N, used in all six algorithms is fixed at 100, and the minimum acceptance degree (MAD) is set to 0.02. In what follows, the maximum observation degree is referred to as MOD for brevity.

To test global object tracking ability, we recorded a video clip of 170 frames in which the tracked object is a man who walks around, goes behind a baffle from one end, stays for some time, and finally emerges from the other end. The distance between the two ends of the baffle is nearly half the screen's width. Several representative tracking frames are shown in Figure 1. Before the object was fully occluded, all six algorithms locked onto it successfully. When the object reappeared from the other end, PF, KPF, and UPF lost track of it and drifted out of range, while MSM-PF, CSM-PF, and ConSM-PF all succeeded in tracking it. Figure 2 displays the maximum observation degrees (MODs) of the six algorithms. Before the object disappeared, all the MODs were above the minimum acceptance degree of 0.02, indicating that the local proposal distribution satisfied the tracking requirement.


From Frame 131 to 149, during which the object was wholly occluded, all the MODs were smaller than 0.02. In this case, PF, KPF, and UPF still searched for the object via local proposal distributions, while MSM-PF, CSM-PF, and ConSM-PF drew particles from global proposal distributions. Some of these particles fell into the object area when it walked out at Frame 150, pushing the MODs of MSM-PF, CSM-PF, and ConSM-PF above 0.02; the object was then re-tracked, and global searching switched back to local tracking at Frame 151. By contrast, lacking a global search mechanism and depending entirely on local proposal distributions, PF, KPF, and UPF ultimately failed to track the object.

Figure 1. Results of global object tracking. From left to right column: Frames 130, 135, and 150. The object is fully occluded behind the baffle during Frames 131 to 149. The second rows of MSM-PF, CSM-PF, and ConSM-PF show the corresponding global proposal distributions. The red box denotes the coarse state, and the green box the fine state.


Figure 2. Maximum observation degree

To analyze robustness against large-scale object movement, we experimented on a video clip in which a person quickly swings his body back and forth with large displacement between consecutive frames. Here, the object is his face. Three successive tracking results are presented in Figure 3 to demonstrate the BI-PFs' performance and illustrate their tracking principle. Between Frames 14 and 15, the object moved a long distance. PF, KPF, and UPF were unable to perceive this and continued to sample particles from local proposal distributions, so their positioning errors gradually grew (Frame 15). By contrast, at Frame 15, MSM-PF, CSM-PF, and ConSM-PF discarded the current tracking results because their MODs were below the minimum acceptance degree of 0.02; they therefore judged the object lost and prepared to seek it in the global proposal distributions in the next frame. At Frame 16, PF, KPF, and UPF nearly lost the object because of local tracking with a biased proposal distribution, while MSM-PF, CSM-PF, and ConSM-PF coarsely located an area near the object (the red box) through global search and finely detected the object (the green box) around the particle with the MOD. The results of these BI-PFs are clearly superior to those of PF, KPF, and UPF. We manually labeled the ground-truth object center locations of the sequence and report the average positioning errors (APE) of the six algorithms in Table 1. As can be seen, the APEs of the BI-PFs are also better than those of PF and its variants.

The successful tracking rate (STR) is also computed as a performance metric, defined as the ratio of the number of frames in which the object is accurately estimated to the total number of frames in the sequence. We consider tracking successful if the tracking result lies within an 8 × 8 neighborhood of the ground-truth object center location. Table 2 shows the STRs of the six algorithms on the two tracking sequences above. For the global object tracking sequence (Figure 1), note that in approximately 19 of the 170 frames the object was entirely occluded, so the achievable STR was bounded by 151/170. Because PF, KPF, and UPF could only follow the object before it was fully occluded in the first 130 frames, their STRs are all 130/170. In comparison, MSM-PF, CSM-PF, and ConSM-PF accurately detected the object in every frame before the object disappeared and after it reappeared. For the large-scale object movement sequence (Figure 3), despite some frames in which the BI-PFs produced inaccurate tracking results, their STRs are almost twice those of PF and its variants.

Note that the results of the three BI-PFs in the above experiments are similar to one another. The likely reason is that they differ only in the type of global proposal distribution and are alike in all other respects.


Figure 3. Results of large-scale object movement. From left to right: Frames 14, 15, and 16. The second rows of MSM-PF, CSM-PF, and ConSM-PF show the corresponding global proposal distributions. The red box denotes the coarse state; the green box, the fine state.

Table 1. Average Positioning Error (APE).

              PF      KPF     UPF     MSM-PF   CSM-PF   ConSM-PF
APE (pixel)   10.64   11.24   9.96    7.66     8.91     6.97
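For reference, APE here is taken to be the mean Euclidean pixel distance between the estimated and ground-truth object centers over all frames; this definition and the function name are assumptions consistent with the text, not the authors' code.

```python
import math

def average_positioning_error(estimates, ground_truth):
    """Mean Euclidean distance (pixels) between estimated and
    ground-truth object centers across a sequence of frames."""
    dists = [math.hypot(ex - gx, ey - gy)
             for (ex, ey), (gx, gy) in zip(estimates, ground_truth)]
    return sum(dists) / len(dists)
```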

Table 2. Successful Tracking Rate (STR).
A: sequence of global object tracking; B: sequence of large-scale object movement.

            STR of A   STR of B
PF          130/170    7/29
KPF         130/170    10/29
UPF         130/170    8/29
MSM-PF      151/170    19/29
CSM-PF      151/170    20/29
ConSM-PF    151/170    20/29


6. Conclusion

We presented three kinds of biologically inspired particle filter (BI-PF) for visual tracking that mimic the way the human eye tracks objects through a dynamic hybrid of local and global proposal distributions. Three types of biological-vision-inspired saliency maps were proposed to serve as global proposal distributions, each of which was incorporated into a particle filter to form a BI-PF. By dynamically combining the global proposal distribution with the local one via a threshold, BI-PF not only inherits the strengths of the particle filter but also performs well in the challenging scenarios of global object shift and large-scale object movement, as demonstrated in two groups of comparative experiments against PF, KPF, and UPF. In our opinion, exploiting biological vision mechanisms will open up an arsenal of biological vision techniques that can facilitate computational visual tracking. It is worth mentioning that, although we applied three kinds of saliency maps as global proposal distributions, more sophisticated saliency maps could be explored to further improve tracking performance.

7. References

[1] H. M. Dee, S. A. Velastin, “How close are we to solving the problem of automated visual surveillance?”, Machine Vision and Applications, Springer, vol.19, no.5-6, pp.329-343, 2007.

[2] K. Nickel, R. Stiefelhagen, “Visual recognition of pointing gestures for human–robot interaction”, Image and Vision Computing, Elsevier, vol.25, no.12, pp.1875-1884, 2007.

[3] J. Klein, C. Lecomte, and P. Miche, “Preceding car tracking using belief functions and a particle filter”, In Proceedings of the 19th International Conference on Pattern Recognition, pp.1-4, 2008.

[4] L. Mejias, P. Campoy, et al., “A visual servoing approach for tracking features in urban areas using an autonomous helicopter”, In Proceedings of the 2006 IEEE International Conference on Robotics and Automation, pp.2503-2508, 2006.

[5] J. Q. Wang, Y. Yagi, “Integrating color and shape-texture features for adaptive real-time object tracking”, IEEE Transactions on Image Processing, IEEE, vol.17, no.2, pp.235-240, 2008.

[6] E. Maggio, F. Smerladi, and A. Cavallaro, “Adaptive multifeature tracking in a particle filtering framework”, IEEE Transactions on Circuits and Systems for Video Technology, IEEE, vol.17, no.10, pp.1348-1359, 2007.

[7] Y. Zhai, M. Yeary. “An object tracking algorithm based on multi-model and multi-measurement cues”, In Proceedings of 2010 IEEE International Instrumentation and Measurement Technology Conference, pp.805-808, 2010.

[8] B. Georgescu, D. Comaniciu, et al., “Multi-model component-based tracking using robust information fusion”, In Proceedings of ECCV Workshop SMVP, pp.61-70, 2004.

[9] R. T. Collins, Y. X. Liu, and M. Leordeanu, “Online selection of discriminative tracking features”, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, vol.27, no.10, pp.1631-1643, 2005.

[10] K. Hariharakrishnan, D. Schonfeld, “Fast object tracking using adaptive block matching”, IEEE Transactions on Multimedia, IEEE, vol.7, no.5, pp.853-859, 2005.

[11] R. Urtasun, D. J. Fleet, and P. Fua, “Temporal motion models for monocular and multiview 3D human body tracking”, Computer Vision and Image Understanding, Elsevier, vol.104, no.2-3, pp.157-177, 2006.

[12] D. M. Gavrila, L. S. Davis, “3-D model-based tracking of humans in action: A multi-view approach”, In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp.73-80, 1996.

[13] X. X. Xu, Z. L. Wang, and Z. H. Chen, “Visual tracking model based on feature-imagination and its application”, In Proceedings of 2010 International Conference on Multimedia Information Networking and Security, pp.370-374, 2010.

[14] H. X. Chu, S. J. Ye, et al., “Object tracking algorithm based on camshift algorithm combinating with difference in frame”, In Proceedings of 2007 IEEE International Conference on Automation and Logistics, pp.51-55, 2007.


[15] R. Stolkin, I. Florescu, and G. Kamberov, “An adaptive background model for CAMSHIFT tracking with a moving camera”, In Proceedings of the Sixth International Conference on Advances in Pattern Recognition, pp.147-151, 2007.

[16] D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of non-rigid objects using mean shift”, In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp.142-149, 2000.

[17] R. T. Collins, “Mean-shift blob tracking through scale space”, In Proceedings of 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol.2, pp. 234-240, 2003.

[18] Y. Yoon, A. Kosaka, and A. C. Kak, “A new Kalman-filter-based framework for fast and accurate visual tracking of rigid objects”, IEEE Transactions on Robotics, IEEE, vol.24, no.5, pp.1238-1251, 2008.

[19] I. J. Ndiour, P. A. Vela, “Towards a local Kalman filter for visual tracking”, In Proceedings of the 48th IEEE Conference on Decision and Control, held jointly with the 2009 28th Chinese Control Conference, pp.2420-2426, 2009.

[20] M. Isard, A. Blake, “CONDENSATION - Conditional density propagation for visual tracking”, International Journal of Computer Vision, Kluwer Academic Publishers, vol.29, no.1, pp.5-28, 1998.

[21] K. Nummiaro, E. Koller-Meier, and L. Van Gool, “An adaptive color-based particle filter”, Image and Vision Computing, Elsevier, vol.21, no.1, pp.99-110, 2003.

[22] Y. Satoh, T. Okatani, and K. Deguchi, “A color-based tracking by Kalman particle filter”, In Proceedings of the 17th International Conference on Pattern Recognition, vol.3, pp.502-505, 2004.

[23] P. H. Li, T. W. Zhang, and Arthur E. C. Pece, “Visual contour tracking based on particle filters”, Image and Vision Computing, Elsevier, vol.21, no.1, pp.111-123, 2003.

[24] Y. Rui, Y. Q. Chen, “Better proposal distributions: object tracking using unscented particle filter”, In Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol.2, pp.786-793, 2001.

[25] X. Q. Zhang, W. M. Hu, et al., “SVD based Kalman particle filter for robust visual tracking”, In Proceedings of the 19th International Conference on Pattern Recognition, pp.1-4, 2008.

[26] J. Lettvin, H. Maturana, W. McCulloch and W. Pitts, “What the frog's eye tells the frog's brain”, In Proceedings of the Institute of Radio Engineers, vol.49, pp.1940-1951, 1959.

[27] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis”, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE, vol.20, no.11, pp.1254-1259, 1998.

[28] L. Itti, N. Dhavale, and F. Pighin, “Realistic Avatar eye and head animation using a neurobiological model of visual attention”, In Proceeding of SPIE 48th Annual International Symposium on Optical Science and Technology, pp.64-78, 2003.

[29] L. Itti, C. Koch, “Computational modeling of visual attention”, Nature Reviews Neuroscience, Nature, vol.2, no.3, pp.194-203, 2001.

[30] Z. Q. Cheng, Y. Z. Ma, J. Bu, “Mean shifts identification model in bivariate process based on LS-SVM pattern recognizer”, JDCTA:International Journal of Digital Content Technology and its Applications, AICIT, vol.4, no. 3, pp.154-170, 2010.

[31] Q. T. Wang, X. L. Hu, “New Kalman filtering algorithm for Passive-BD/SINS integrated navigation system based on UKF”, JCIT: Journal of Convergence Information Technology, AICIT, vol.4, no.2, pp.102-107, 2009.
