

Background Modeling in the Maritime Domain

Domenico D. Bloisi · Andrea Pennisi · Luca Iocchi


Abstract The maritime environment represents a challenging scenario for automatic video surveillance, due to the complexity of the observed scene: waves on the water surface, boat wakes, and weather issues contribute to generating a highly dynamic background. Moreover, an appropriate background model has to deal with gradual and sudden illumination changes, camera jitter, shadows, and reflections that can provoke false detections. Using a predefined distribution (e.g., Gaussian) for generating the background model can be ineffective, because non-regular patterns have to be modeled. In this paper, a method for creating a “discretization” of an unknown distribution that can model highly dynamic backgrounds such as water is described. A quantitative evaluation carried out on two publicly available data sets of videos and images, containing data recorded in different maritime scenarios with varying light and weather conditions, demonstrates the effectiveness of the approach.

Keywords Background subtraction · Dynamic background · Maritime surveillance · Maritime data set

Domenico D. Bloisi
Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Italy
E-mail: [email protected]

Andrea Pennisi
Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Italy
E-mail: [email protected]

Luca Iocchi
Department of Computer, Control, and Management Engineering, Sapienza University of Rome, Italy
E-mail: [email protected]

1 Introduction

Background subtraction (BS) is a common approach for detecting moving objects in video sequences captured by fixed cameras. It operates on the raw video data, identifying the moving regions (the foreground, FG) by comparing the current frame with a model of the scene background (BG). The possibility of achieving real-time performance has made BS appealing as the initial step for a number of higher-level tasks, such as visual tracking, object recognition, and detection of abnormal behaviors [7].

Generally, BS includes two stages:

1. Background initialization, where the first model of the background is generated;
2. Model updating, where the information in the BG model is updated to take into account the changes in the observed scene.

Since the effectiveness of BS depends heavily on the BG model, building an accurate model is the key issue. A series of variable aspects, which also depend on the kind of environment considered, make this a hard problem to solve: illumination changes (gradual and sudden) [21], shadows [8], camera jitter (due to wind) [12], movement of background elements (e.g., trees swaying in the breeze) [33], and modifications in the background geometry (e.g., parked cars) [30].

The maritime domain represents one of the most challenging scenarios for BS due to the complexity of the monitored scene: waves on the water surface, boat wakes, and weather issues (such as bright sun, fog, heavy rain) contribute to generating a highly dynamic background (see Fig. 1).

In this paper, we describe a BS method called Independent Multimodal Background Subtraction (IMBS) that has been specifically designed for dealing with water background, but can also be applied to general scenarios [3]. The algorithm is currently in use within a real 24/7 video surveillance system for the control of naval traffic in Venice [4].

Fig. 1 Challenges in the maritime scenario. Sun reflections, waves on the water surface, and boat wakes contribute to generating a highly dynamic background. In this figure, the same scene is observed from three different points of view [19].

The main contributions of the proposed method are:

– An on-line clustering algorithm that captures the multimodal nature of the background without maintaining a buffer of previous frames;
– A model update mechanism that can detect changes in the background geometry;
– A noise removal module that helps in filtering out false detections due to reflections and boat wakes.

IMBS has been quantitatively evaluated on a large data set of videos [19] coming from a real system [4], which contains data recorded in different scenarios with varying light and weather conditions. In addition, a second publicly available data set [5] containing sequences with water background has been used to further validate the method. The results of the comparison with several state-of-the-art algorithms implemented in the widely used OpenCV library [23] demonstrate the effectiveness of the approach and its real-time performance.

IMBS is released as open source code¹, and all the data used for the evaluation, as well as additional comparisons, can be downloaded from the publicly available MAR - Maritime Activity Recognition data set [19].

The remainder of the paper is organized as follows. Related work is analyzed in Section 2, while our method is presented in Section 3. Section 4 describes the noise suppression module, designed to filter out false positives generated by shadows, reflections, and boat wakes. Both qualitative and quantitative evaluation results are reported in Section 5. Section 6 provides the conclusions.

1 http://www.dis.uniroma1.it/~bloisi/software/imbs.html (also to be made available in OpenCV shortly).

2 Related Work

Different classifications of BS methods have been proposed in the literature. Cristani et al. [7] divide BS algorithms into four classes:

(i) Per-pixel. Per-pixel approaches (e.g., [8,27]) consider each pixel signal as an independent process. This class of methods requires the minimum computational effort and can achieve real-time performance.
(ii) Per-region. Region-based algorithms (e.g., [16,20]) usually divide the frames into blocks and calculate block-specific features to obtain the foreground. In this way, a more robust description of the visual appearance of the observed scene can be achieved. Indeed, information coming from a set of pixels can be used to model parts of the background scene that are locally oscillating or moving slightly, like leaves or flags. Moreover, considering a region of the image instead of a single pixel makes it possible to compute histograms and to extract edges that can be useful for filtering out false positives.
(iii) Per-frame. Global changes in the scene can be detected only by reasoning at frame level (e.g., [22,28]). Particular situations that involve the entire frame, like switching the lights on and off in an indoor scene, cannot be managed without considering all the pixels in the image.
(iv) Multi-stage. Multi-stage methods (e.g., [34,31]) combine the above approaches in a serial process. Multiple steps are performed at different levels, in order to refine the final result.

Cheung and Kamath [6] identify two classes of BS methods, namely recursive and non-recursive.

(i) Recursive algorithms (e.g., [27]) make use of a single BG model that is recursively updated on each new input frame. As a result, input frames from the distant past can have an effect on the current background model [12].
(ii) Non-recursive approaches (e.g., [8,22]) maintain a buffer of previous frames and estimate the BG model based on a statistical analysis of the frames in the buffer.

A third classification has been proposed by Mittal and Paragios [20]. BS methods are divided into predictive and non-predictive ones.

(i) Predictive algorithms (e.g., [10]) model the scene as a time series and develop a dynamical model to recover the current input based on past observations.
(ii) Non-predictive techniques (e.g., [27,11]) neglect the order of the input observations and build a probabilistic representation of the observations at a particular pixel.

Two approaches specifically related to the detection of reflective surfaces have been proposed by Rankin et al. [25] and by He and Chu [15]. Both solutions are based on the analysis of the HSV color space of the images. Indeed, the single components H (hue), S (saturation), and V (value or brightness) have specific behaviors that make it possible to detect in which part of the image the reflective surface is located.

While there are many general approaches to background modeling in dynamic environments, only a few of them have been tested on water background; thus, a real-time, complete, and effective solution for maritime scenes does not yet exist. In particular, water background is more difficult than other kinds of dynamic background, since waves on the water surface do not belong to the foreground even though they involve motion. In addition, sun reflections do not behave in the same way as a reflective surface.

As suggested by Ablavsky [1], for dealing with water background it is fundamental to integrate a pixel-wise statistical model with a global model of the movement of the scene. Indeed, the BG model has to be sensitive enough to detect the moving objects and adaptive to long-term lighting and structural changes (e.g., objects entering the scene and becoming stationary).

The spatial dependencies between a pixel and its neighboring pixels in dynamic backgrounds have been studied by Zhang et al. [36–39]. A local dependency descriptor called LDH (Local Dependency Histogram) can capture spatial dependencies between pixels from consecutive frames. A pixel is labeled as foreground or background by comparing the new LDH computed in the current frame against its model LDHs. Therefore, for each pixel in an image, an adaptive sample consisting of pixels observed in previous frames is maintained. The pixel is considered a background point if one of the estimated probabilities is larger than a fixed threshold.

Per-pixel approaches (e.g., [27]) typically fail because the inherently rich and dynamic textures of water background cause large changes at the individual pixel level [9]. A non-parametric approach (e.g., [11]) is not able to learn all the changes, since the changes on the water surface do not present any regular patterns [29]. More complex approaches (e.g., [26,43,42]) can obtain better results at the cost of increasing the computational load of the process.

In the next section our method, called IMBS, is presented. It is a multi-stage (per-pixel, per-frame), non-recursive, non-predictive, and real-time BS algorithm.

Fig. 2 RGB and HSV values of a pixel (white dot) from frame 7120 to frame 7170 of the Jug sequence [17]. The x-axis represents the color component values, while the y-axis reports the number of occurrences. HSV values are scaled to fit [0, 255].

Fig. 3 RGB and HSV values of a pixel (white dot) from frame 50 to frame 150 of sequence 1 from the MAR data set [19]. The x-axis represents the color component values, while the y-axis reports the number of occurrences. HSV values are scaled to fit [0, 255].

3 Independent Multimodal Background Subtraction

The analysis of pixel color values from water background highlights the presence of non-regular patterns. As an example, Fig. 2 shows the RGB and HSV color histograms of a pixel over 50 frames from the Jug sequence [17]. The RGB values (see the three histograms at the top of Fig. 2) are distributed over a large part of the range [0, 255]; e.g., the red values range from 105 to 155. The HSV values (shown in the three histograms at the bottom of Fig. 2) also cover a large part of the range; in particular, the hue values are split across several distinct bins.

An even wider spread over the whole range of values can be observed when analyzing a pixel from a background region affected by reflections (see Fig. 3). In such a case, the red values range from 85 to 245.

The two examples support the hypothesis that a predefined fixed distribution is not a suitable solution for modeling the strongly varying behavior of water background. To further validate this hypothesis, the Anderson-Darling test [40] has been applied to the color values reported in Fig. 2 and Fig. 3, finding that the hypothesis of normality is rejected in all cases.
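To illustrate how such a normality check can be carried out, the following is a minimal sketch of the Anderson-Darling test for normality with estimated mean and variance. The sample values are hypothetical (they are not the data plotted in Fig. 2 or Fig. 3), the function names are introduced only for illustration, and the small-sample adjustment and the 5% critical value 0.752 are the commonly tabulated ones, assumed here rather than taken from [40].

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def anderson_darling_normal(values):
    """Return the A^2 statistic, adjusted for estimated mean and variance.

    Note: extreme outliers can drive the CDF to exactly 0 or 1 and make
    the logarithm undefined; this sketch does not guard against that.
    """
    x = sorted(values)
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((xi - mean) ** 2 for xi in x) / (n - 1))
    cdf = [normal_cdf((xi - mean) / std) for xi in x]
    s = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    # Small-sample adjustment (assumed, commonly used when both parameters are estimated).
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

CRITICAL_5_PERCENT = 0.752  # assumed tabulated value for the adjusted statistic

# Hypothetical bimodal red-channel samples of a single water pixel.
red_values = [105] * 25 + [155] * 25
statistic = anderson_darling_normal(red_values)
normality_rejected = statistic > CRITICAL_5_PERCENT
print(normality_rejected)
```

For a strongly bimodal sample such as the one above, the statistic is far larger than the critical value, so the hypothesis of normality is rejected.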


Algorithm 1: IMBS

Input: I, P, N, D, A, α, β, τS, τH, ω, θ
Data Structure: B
Output: F
Initialize: n = 0, ts = 0, ∀i, j B(i, j)(ts) = ∅

foreach I(t) do
    if (t − ts) > P then
        B(t) ← BGInitialization(I(t), D, A, n, B(t − 1));
        ts = t;
        n = n + 1;
        if n = N then
            n = 0;
    fgmask(t) ← FGComputation(I(t), B(t), A);
    Ffiltered(t) ← NoiseSuppression(fgmask(t), α, β, τS, τH, ω, θ);
    {F(t), B(t + 1)} ← ModelUpdate(Ffiltered(t), B(t));

In order to cope with water background, a possible solution is to create a discretization of an unknown distribution by using an on-line clustering mechanism. Each pixel of the background is modeled as a set of bins of variable size, without any a priori assumption about the observed distribution.

The details of the proposed method are reported in Alg. 1. I(t) is the current image at time t, B is the BG model, and ts is the time-stamp of the last processed scene sample. The input parameters are the sampling period P, the number of scene samples to be analyzed N, the association threshold A, and the dimension threshold D that regulates the dimensions of the bins. Moreover, the parameters α, β, τS, and τH can be used for filtering out shadow pixels, and ω and θ for managing reflections. A diagram illustrating the IMBS algorithm is reported in Fig. 4.
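The sampling logic of Alg. 1 (the test on t − ts and the cyclic counter n) can be sketched as follows. This is only an illustration of the scheduling: the function name and the frame timestamps are hypothetical, and the actual model building steps are omitted.

```python
def scene_sample_schedule(frame_times_ms, P, N):
    """Return (timestamp, n) for every frame that would be used as a scene sample."""
    schedule = []
    n = 0   # index of the next scene sample within the current model
    ts = 0  # time-stamp of the last processed scene sample
    for t in frame_times_ms:
        if (t - ts) > P:
            schedule.append((t, n))
            ts = t
            n += 1
            if n == N:
                n = 0  # a new, independent BG model starts here
    return schedule

# Hypothetical camera delivering one frame every 200 ms.
frames = list(range(0, 3001, 200))
schedule = scene_sample_schedule(frames, P=500, N=3)
print(schedule)  # [(600, 0), (1200, 1), (1800, 2), (2400, 0), (3000, 1)]
```

Because the comparison is strict and frames arrive at discrete times, the effective spacing between samples in this example is 600 ms rather than exactly P = 500 ms.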

3.1 Background Creation

The procedure BGInitialization (see Alg. 2) is responsible for the creation of the BG model, while FGComputation (see Alg. 3) generates the binary FG image fgmask.

The BG model is computed through a per-pixel, on-line statistical analysis of N frames, in order to achieve a high computational speed. Every P milliseconds, the current frame I is considered to be a scene sample Sk, where 1 ≤ k ≤ N.

Let I(t) be the W × H input frame at time t, and fgmask(t) the corresponding foreground binary image. The BG model B is a matrix of H rows and W columns. Each element B(i, j) of the matrix is a set of couples ⟨c, f(c)⟩, where c is a value in a chosen color space (e.g., RGB or HSV) and f(c) → [1, N] is a function returning the number of pixels, among the scene samples Sk(i, j), associated with the color component values denoted by c.

Fig. 4 The modular architecture of the proposed approach.

Algorithm 2: BG Initialization

Input: I, D, A, n

foreach color value v of pixel p ∈ I do
    if n = 0 then
        add couple ⟨v, 1⟩ to B(i, j);
    else if n = N − 1 then
        foreach couple T := ⟨c, f(c)⟩ ∈ B(i, j) do
            if f(c) < D then
                delete T;
    else
        foreach couple T := ⟨c, f(c)⟩ ∈ B(i, j) do
            if |v − c| ≤ A then
                c′ ← ⌊(c · f(c) + v) / (f(c) + 1)⌋;
                T ← ⟨c′, f(c) + 1⟩;
                break;
        if no couple in B(i, j) has been associated with v then
            add couple ⟨v, 1⟩ to B(i, j);

It is worth noting that modeling each pixel by binding all the color channels in a single element has the advantage of capturing the statistical dependence between the color channels, instead of considering each channel independently.

Every pixel from a scene sample Sk is associated with an element of B according to the threshold A (see Alg. 2). Once the last sample SN has been processed, if a couple T has a number f(c) of associated samples greater than or equal to the threshold D, namely T := ⟨c, f(c) ≥ D⟩, then its color value c becomes a significant background value.
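The on-line clustering for a single pixel can be sketched as follows. This is a simplified illustration, not the released implementation: the function names are hypothetical, the association test |v − c| ≤ A is assumed to be applied channel by channel, and the pruning with threshold D is applied after all samples have been clustered.

```python
def is_associated(v, c, A):
    """Assumed association rule: every channel differs by at most A."""
    return all(abs(vk - ck) <= A for vk, ck in zip(v, c))

def build_pixel_model(samples, A=5, D=2):
    """Cluster the scene samples of one pixel into couples (c, f(c))."""
    couples = []  # each entry is [color, count]
    for v in samples:
        for couple in couples:
            c, f = couple
            if is_associated(v, c, A):
                # Running mean of the associated values, floored as in Alg. 2.
                couple[0] = tuple((ck * f + vk) // (f + 1) for ck, vk in zip(c, v))
                couple[1] = f + 1
                break
        else:
            # No existing couple matched: start a new bin for this value.
            couples.append([tuple(v), 1])
    # Keep only the couples observed at least D times.
    return [(c, f) for c, f in couples if f >= D]

# Hypothetical RGB samples of one pixel: three similar water values and two outliers.
samples = [(100, 100, 100), (102, 101, 99), (200, 200, 200),
           (101, 100, 100), (150, 10, 10)]
model = build_pixel_model(samples)
print(model)  # [((101, 100, 99), 3)]
```

The two isolated values are each observed only once, so they are discarded by the threshold D, which is how a passing object is prevented from entering the model.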

Unlike other methods, IMBS does not require the scene to be free of foreground elements at the beginning of the processed sequence. Considering multiple samples over time allows for obtaining an accurate BG model, even if moving objects are present in the scene. Indeed, moving objects will appear in different positions in the samples and will affect different elements of the BG model. As a consequence, the BG values related to those objects will be discarded thanks to the threshold D.

Up to ⌊N/D⌋ couples for each element of B are considered at the same time, which makes it possible to approximate a multimodal probability distribution that can address the problems of waves, gradual illumination changes, noise in sensor data, and movement of small background elements. Furthermore, the adaptive number of couples for each pixel can model non-regular patterns that cannot be associated with any predefined distribution (e.g., Gaussian). This is a different approximation with respect to the popular Mixture of Gaussians approach [27], since the “discretization” of the unknown background distribution is obtained by considering each BG point as composed of multiple couples ⟨c, f(c)⟩, without attempting to force the BG values to fit a predefined distribution.

Algorithm 3: FG Computation

Input: I, B, A
Initialize: ∀i, j fgmask(i, j) = 1

foreach color value v of pixel p ∈ I do
    if |B(i, j)| ≠ 0 then
        foreach couple T := ⟨c, f(c)⟩ ∈ B(i, j) do
            if |v − c| < A then
                fgmask(i, j) ← 0;
                break;

3.2 Foreground Computation

The binary FG image fgmask is computed according to the thresholding mechanism shown in Alg. 3, where |B(i, j)| denotes the number of couples in B(i, j), with |B(i, j)| = 0 when B(i, j) = ∅. A pixel p in position (i, j) is considered a foreground point if dist(c, B(i, j)) > A, where c is the color value of p and B(i, j) is the background value for the position (i, j).

The use of a set of couples instead of a single BG value makes IMBS robust with respect to the choice of the parameter A, since a pixel that presents a variation in the color values larger than A will be modeled by a set of contiguous couples.
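The per-pixel foreground decision can be sketched as below. This is an illustrative sketch under stated assumptions: the function name is hypothetical, the distance is assumed to be the largest per-channel absolute difference, and a pixel is treated as background when that distance to some couple is at most A (the complement of the condition dist > A given in the text).

```python
def is_foreground(v, couples, A=5):
    """Return True if v is not associated with any background couple."""
    for c, _count in couples:
        if all(abs(vk - ck) <= A for vk, ck in zip(v, c)):
            return False  # v falls inside one of the background bins
    return True  # also the result for an empty model, as fgmask starts at 1

# Hypothetical model of one pixel with two contiguous background bins.
model = [((101, 100, 99), 3), ((140, 138, 135), 2)]
print(is_foreground((103, 98, 101), model))   # False
print(is_foreground((120, 120, 120), model))  # True
```

Keeping several couples per pixel is what allows two distinct water appearances, such as the two bins above, to be accepted as background at the same time.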

IMBS requires a time R = NP for creating the first background model; then a new BG model, independent from the previous one, is built continuously according to the same refresh time R. The independence of each BG model is a key aspect of the algorithm, since it 1) allows adaptation to fast changing environments while avoiding error propagation and 2) does not affect the accuracy for slowly changing ones. Thanks to the on-line model creation mechanism, there is no need to store the N scene samples in memory, which is the main drawback of non-recursive BS techniques [24].

Algorithm 4: Model Update

Input: Sk, fgmask
Output: B, F

foreach color value v of pixel p ∈ Sk do
    foreach couple T := ⟨c, f(c)⟩ ∈ B(i, j) do
        if fgmask(i, j) = 1 ∧ |v − c| ≤ A then
            T is a foreground couple

An Illumination Controller manages sudden illumination changes: the N and P values are reduced if the percentage of foreground pixels in fgmask is above a certain threshold (e.g., 50%), in order to speed up the creation of a new BG model. To further increase the speed of the algorithm, the foreground extraction can be parallelized by dividing the images into multiple horizontal slices.
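A possible shape for the Illumination Controller is sketched below. The text does not specify by how much N and P are reduced, so the halving factor used here is purely an assumption, as are the function and parameter names.

```python
def illumination_controller(fgmask, N, P, max_fg_ratio=0.5, factor=2):
    """Return possibly reduced (N, P) when too many pixels are foreground."""
    total = sum(len(row) for row in fgmask)
    foreground = sum(sum(row) for row in fgmask)
    if total and foreground / total > max_fg_ratio:
        # Assumed policy: divide both values by `factor`, never going below 1.
        return max(1, N // factor), max(1, P // factor)
    return N, P

# Hypothetical 2 x 4 masks (1 = foreground, 0 = background).
sudden_change = [[1, 1, 1, 0], [1, 1, 1, 0]]
normal_scene = [[0, 0, 1, 0], [0, 1, 0, 0]]
print(illumination_controller(sudden_change, N=30, P=500))  # (15, 250)
print(illumination_controller(normal_scene, N=30, P=500))   # (30, 500)
```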

From the analysis of several sequences, we found experimentally that good values for the IMBS parameters are: A = 5, N = 30, D = 2, and P = 500 ms. Using these parameters, a new BG model is available every 15 seconds, in which only color values observed at least twice are considered as BG values.
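As a quick check of these figures, the refresh time and the maximum number of couples per pixel follow directly from the parameter values above (the symbol K_max is introduced here only for illustration):

```latex
R = N \cdot P = 30 \times 500\,\mathrm{ms} = 15\,\mathrm{s},
\qquad
K_{\max} = \left\lfloor \frac{N}{D} \right\rfloor
         = \left\lfloor \frac{30}{2} \right\rfloor = 15 .
```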

3.3 Model Update

Elgammal et al. [11] proposed two alternative strategies to update the background. In selective update, only pixels classified as belonging to the background are updated, while in blind update every pixel in the background model is updated. The selective (or conditional) update improves the detection of the targets, since foreground information is not added to the BG model, thus solving the problem of ghost observations [8]. However, when using selective updating, any incorrect pixel classification produces a persistent error, since the BG model will never adapt to it.

Blind update does not suffer from this problem since no update decisions are taken, but it has the disadvantage that values not belonging to the background are added to the model.

We propose a different solution, aiming at solving the problems of both selective and blind update. As shown in Alg. 4, given a scene sample Sk and the current foreground mask fgmask, if fgmask(i, j) = 1 and Sk(i, j) is associated with a couple T in the BG model under development, then T is labeled as a “foreground couple”. When computing the foreground, if I(i, j) is associated with a foreground couple, then it is classified as a potential foreground point.

Fig. 5 Model update. (a) A vaporetto enters the monitored scene and remains in the same position over several frames. (b) A blind update obtained by using the algorithm from [13] implemented in OpenCV. The model incorrectly includes the vaporetto as part of the scene background. (c) IMBS model update. The vaporetto is identified as a potential foreground region (grey pixels).

Such a solution allows for identifying regions of the scene representing non-moving foreground objects, as in Fig. 5, where a target that remains in the same position over several frames is detected as potential foreground. The decision about whether to include the potential foreground points in the background is taken on the basis of a persistence map. If a pixel is classified as potential foreground consecutively for a period of time longer than a predefined value (e.g., R/2), then it becomes part of the BG model.

Furthermore, the labeling process can provide additional information to higher-level modules (e.g., a visual tracking module), helping to reduce ghost observations.
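The persistence map can be sketched as a per-pixel timer that is reset whenever a pixel stops being potential foreground. The data layout, the function name, and the choice of updating once per scene sample are assumptions made for illustration only.

```python
def update_persistence(persistence, potential_fg, dt_ms, max_persistence_ms):
    """Advance the per-pixel timers and return the pixels to absorb into the BG."""
    absorbed = []
    for i, row in enumerate(potential_fg):
        for j, flag in enumerate(row):
            if flag:
                persistence[i][j] += dt_ms
                if persistence[i][j] > max_persistence_ms:
                    absorbed.append((i, j))
                    persistence[i][j] = 0
            else:
                # The classification must be consecutive, so the timer restarts.
                persistence[i][j] = 0
    return absorbed

# Hypothetical 1 x 2 image, R = 15000 ms, threshold R / 2, one update every 500 ms.
persistence = [[0, 0]]
events = []
for step in range(16):
    potential = [[1, 1 if step < 10 else 0]]
    events.append(update_persistence(persistence, potential, 500, 7500))
print(events[15])  # [(0, 0)]
```

In this example the first pixel stays potential foreground for more than R/2 and is absorbed at the sixteenth update, while the second pixel is released after ten updates and never enters the model.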

4 Noise Suppression

When BS is adopted to compute the foreground mask, the moving pixels representing shadows can be detected as foreground. In order to deal with the erroneously classified foreground pixels, which can deform the detected object shape, a noise suppression module is required. In the maritime domain, along with the noise generated by boats' shadows (Fig. 6), reflections on the water surface can also affect the foreground image, producing false positive detections (Fig. 7). In the following, two possible solutions for dealing with shadows and reflections are described.

Fig. 6 Shadow suppression example. Original frame (left), foreground extraction without shadow removal (center), and with shadow suppression (right). The shadows in the red box are correctly removed.

Fig. 7 Reflection suppression example. Original frame (left), foreground extraction without reflection removal (center), and with reflection suppression (right).

4.1 Shadows

To detect and suppress shadows, IMBS adopts a strategy that is a slight modification of the method proposed by Cucchiara et al. [8].

Let $I^c(i, j)$, $c \in \{H, S, V\}$, be the HSV color values for the pixel (i, j) of the input frame and $B_T^c(i, j)$ the HSV values for the couple T ∈ B(i, j). The shadow mask value M(i, j) for each foreground point is:

\[
M(i,j) =
\begin{cases}
1 & \text{if } \exists\, T : \alpha \le \dfrac{I^V(i,j)}{B_T^V(i,j)} \le \beta
    \;\wedge\; |I^S(i,j) - B_T^S(i,j)| \le \tau_S
    \;\wedge\; |I^H(i,j) - B_T^H(i,j)| \le \tau_H \\[2ex]
0 & \text{otherwise}
\end{cases}
\]

The parameters α, β, τS, τH are user defined and can be found experimentally. We analyzed the sequences from the ATON database [2], in order to find the combination of parameters minimizing the error caused by shadows (see Fig. 6). In our implementation the HSV values are normalized to [0, 255] and we set α = 0.65, β = 1.15, τH = 10, and τS = 20. A value for β slightly greater than 1 makes it possible to filter out light reflections.
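The shadow test for a single foreground pixel can be sketched as follows, using the parameter values reported above as defaults. The function name and the example values are hypothetical; hue is compared with a plain absolute difference, as in the formula, and couples with zero brightness are skipped to avoid a division by zero (an assumption of this sketch).

```python
def is_shadow(pixel_hsv, couples_hsv, alpha=0.65, beta=1.15, tau_s=20, tau_h=10):
    """Return True if the pixel looks like a shadowed version of some BG couple."""
    h, s, v = pixel_hsv
    for bh, bs, bv in couples_hsv:
        if bv == 0:
            continue  # assumption: ignore couples with zero brightness
        darker_but_similar = alpha <= v / bv <= beta
        if darker_but_similar and abs(s - bs) <= tau_s and abs(h - bh) <= tau_h:
            return True
    return False

# Hypothetical background couple for one pixel, HSV scaled to [0, 255].
background = [(100, 80, 200)]
print(is_shadow((102, 85, 150), background))  # True: same color, 75% brightness
print(is_shadow((102, 85, 100), background))  # False: too dark (ratio 0.5)
print(is_shadow((150, 85, 150), background))  # False: hue differs too much
```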

Shadow suppression is essential for increasing the accuracy of the algorithm. The HSV analysis can effectively remove the errors introduced by a dynamic background, since HSV is a more stable color space than RGB [41].

After the removal of shadow pixels, a binary image Ffiltered is obtained (see the scheme in Fig. 4). It can be further refined by applying the opening and closing morphological operators. The former is particularly useful for filtering out the noise left by the shadow suppression process, while the latter is used to fill internal holes and small gaps.


4.2 Reflections

In order to characterize reflections, a second analysis based on the HSV color space can be carried out, this time at region level (see Fig. 7).

As reported in [15] and [25], the H, S, and V components of a reflective surface assume specific values through which it is possible to model, detect, and describe these particular surfaces. Hue specifies the base color, while the other two values, namely saturation and brightness, specify the intensity and lightness of the color.

For a reflective surface, the hue component assumes values similar to those of subtractive colors (e.g., cyan, yellow, and magenta), which are more likely to reflect sunlight than dark primary colors (e.g., red, green, and blue). Saturation and brightness, instead, assume low and high values, respectively, in the range [0, 1].

However, the above information is not sufficient to characterize the behavior of a single reflection. Indeed, the saturation does not assume low values, but behaves in the same way as the brightness. Therefore, the analysis can be limited to the brightness component. Starting from the RGB values of the input video sequence, an HSV conversion is carried out to overcome the information loss due to image compression. The conversion is computed according to [13]:

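\[
h =
\begin{cases}
\left(60 \times \dfrac{g - b}{iMax - iMin}\right) \bmod 360 & \text{if } iMax = r \\[2ex]
\left(60 \times \left(2 + \dfrac{b - r}{iMax - iMin}\right)\right) \bmod 360 & \text{if } iMax = g \\[2ex]
\left(60 \times \left(4 + \dfrac{r - g}{iMax - iMin}\right)\right) \bmod 360 & \text{if } iMax = b
\end{cases}
\]

\[
s =
\begin{cases}
\dfrac{iMax - iMin}{iMax + iMin} & \text{if } l < 0.5 \\[2ex]
\dfrac{iMax - iMin}{2 - iMax - iMin} & \text{if } l \ge 0.5
\end{cases}
\]

\[
v = \frac{iMax + iMin}{2}
\]

where:

\[
iMax = \max(r, g, b), \quad iMin = \min(r, g, b), \quad 0 \le r, g, b \le 1,
\]

and l denotes the lightness, which coincides with v as defined above.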
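A minimal sketch of this conversion is given below. It assumes the standard hue sectors (with r − g in the case iMax = b), a brightness defined as (iMax + iMin) / 2 so that it spans [0, 1], and the same quantity used as l in the saturation conditions. The function name is hypothetical, and the handling of gray pixels (iMax = iMin), where the hue is undefined, is an assumption of this sketch.

```python
def rgb_to_hsv_like(r, g, b):
    """Convert r, g, b in [0, 1] to (h in degrees, s, v) with the formulas above."""
    i_max, i_min = max(r, g, b), min(r, g, b)
    delta = i_max - i_min
    v = (i_max + i_min) / 2
    if delta == 0:
        return 0.0, 0.0, v  # assumption: gray pixels get hue 0 and saturation 0
    if i_max == r:
        h = (60 * ((g - b) / delta)) % 360
    elif i_max == g:
        h = (60 * (2 + (b - r) / delta)) % 360
    else:
        h = (60 * (4 + (r - g) / delta)) % 360
    if v < 0.5:
        s = delta / (i_max + i_min)
    else:
        s = delta / (2 - i_max - i_min)
    return h, s, v

# Hypothetical bluish water pixel and a bright, almost white reflection.
water = rgb_to_hsv_like(0.2, 0.4, 0.6)
glare = rgb_to_hsv_like(0.95, 0.95, 0.9)
print(water, glare)
```

With these definitions the bright reflection has a brightness above 0.9, while the water pixel stays around 0.4, which is the separation exploited by the region-level analysis.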

IMBS performs a region-level analysis of the image parts detected as foreground, considering each FG blob as a region. Every blob is analyzed in order to establish whether it has been generated by a false detection due to reflections. First, each point i of the blob is marked as bright if

brightness(i) > θ.

Then, the percentage of bright points in the blob is computed. If this percentage is greater than ω, then the region is classified as an area with reflections:

region brightness percentage > ω.

The parameters θ and ω are defined by the user. In our experiments we set θ = 0.7 and ω = 60%.

Fig. 8 IMBS output for frames 1516 and 1545 of the water surface sequence [35].

Fig. 9 IMBS results on the jug sequence [17]. a) Frame number 36. b) Foreground mask obtained by using the approach presented in [43]. c) Results obtained in [9]. d) IMBS results.
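The blob-level decision can be sketched as follows, with θ and ω set to the values used in the experiments. The function name and the brightness lists are hypothetical; each list is assumed to hold the brightness, in [0, 1], of the points of one FG blob.

```python
def is_reflection_blob(brightness_values, theta=0.7, omega=0.6):
    """Return True if more than a fraction omega of the blob points exceed theta."""
    if not brightness_values:
        return False
    bright = sum(1 for value in brightness_values if value > theta)
    return bright / len(brightness_values) > omega

# Hypothetical blobs of ten points each.
sun_glare = [0.9] * 7 + [0.2] * 3   # 70% of the points are bright
small_boat = [0.9] * 3 + [0.2] * 7  # 30% of the points are bright
print(is_reflection_blob(sun_glare))   # True
print(is_reflection_blob(small_boat))  # False
```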

5 Experimental Results

In this paper, we focused on the maritime domain, thus the experimental results reported in this section are entirely related to water background. IMBS has also been tested on general background scenarios: a detailed analysis of IMBS results on publicly available indoor and outdoor scenes not involving water can be found in [3].

Qualitative results obtained by IMBS on two well-known publicly available sequences involving water background, namely

(i) Water surface [35]: A person walks in front of a water surface with moving waves (Fig. 8).
(ii) Jug [17]: A foreground jug floats through the background rippling water (Fig. 9).

demonstrate the capacity of IMBS to correctly model the background in both situations, extracting the foreground masks with great accuracy.

To quantitatively test IMBS on water scenarios, we used both benchmark data from changedetection.net [5] and real data coming from a video surveillance system for the control of naval traffic in Venice [4]. The real data are available from the MAR - Maritime Activity Recognition data set [19].


It is worth noting that IMBS parameters have been set to the same values reported in Section 3 and Section 4 for all the experiments presented in this paper.

5.1 Evaluation on MAR data set

Some of the video sequences available in the MAR data set have been used to quantitatively compare IMBS with other BS algorithms implemented in the widely used OpenCV library [23]: MOG [18], MOG2 [44], and GMG [13]. The scenes used for the comparison present varying light conditions and different camera heights and positions; ten frames containing challenging situations have been selected as ground truth.

Qualitative results for the compared methods are reported in Fig. 10, while Table 1 reports quantitative results in terms of false negative (FN, the number of foreground points detected as background) and false positive (FP, the number of background points detected as foreground) detections. IMBS has been analyzed both with (column IMBS*) and without (column IMBS) the reflection suppression mechanism activated. The results show the effectiveness of the IMBS approach, which performs better than the other methods in terms of total error.

To further validate our approach, three additional metrics have been used, namely F-measure, Detection Rate, and False Alarm Rate.

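F-measure is computed as shown in [32]:

\[
\text{F-measure} = \frac{1}{n} \sum_{i=1}^{n} 2\, \frac{\mathrm{Prec}_i \times \mathrm{Rec}_i}{\mathrm{Prec}_i + \mathrm{Rec}_i}
\]

where:

\[
\begin{aligned}
\mathrm{Rec}_i(P) &= TP_i / (TP_i + FN_i) \\
\mathrm{Prec}_i(P) &= TP_i / (TP_i + FP_i) \\
\mathrm{Rec}_i(N) &= TN_i / (TN_i + FP_i) \\
\mathrm{Prec}_i(N) &= TN_i / (TN_i + FN_i) \\
\mathrm{Rec}_i &= (1/2)\,(\mathrm{Rec}_i(P) + \mathrm{Rec}_i(N)) \\
\mathrm{Prec}_i &= (1/2)\,(\mathrm{Prec}_i(P) + \mathrm{Prec}_i(N))
\end{aligned}
\]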

Detection Rate (DR) and False Alarm Rate (FAR) have been computed as follows:

\[
DR = \frac{TP}{TP + FN}, \qquad FAR = \frac{FP}{TP + FP}
\]

Tab. 2 shows the results obtained by calculating F-measure, DR, and FAR for the ten ground truth images shown in Fig. 10.
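The three metrics can be computed from per-frame pixel counts as in the following sketch. The function names and the counts are hypothetical (they are not taken from Table 1 or Tab. 2), and no guard is included for frames in which a denominator would be zero.

```python
def f_measure(frames):
    """Average F-measure over frames given as (TP, FP, FN, TN) tuples."""
    total = 0.0
    for tp, fp, fn, tn in frames:
        rec = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
        prec = 0.5 * (tp / (tp + fp) + tn / (tn + fn))
        total += 2 * prec * rec / (prec + rec)
    return total / len(frames)

def detection_rate(tp, fn):
    """Fraction of foreground points that are detected."""
    return tp / (tp + fn)

def false_alarm_rate(tp, fp):
    """Fraction of detected points that are actually background."""
    return fp / (tp + fp)

# Hypothetical single ground truth frame of 1000 pixels.
frames = [(80, 20, 20, 880)]
print(f_measure(frames))         # about 0.889
print(detection_rate(80, 20))    # 0.8
print(false_alarm_rate(80, 20))  # 0.2
```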

Table 5 IMBS computational speed in terms of frames per second (FPS) for different frame sizes.

Data set     Frame size    FPS
Wallflower   160 × 120     115
ATON         320 × 240     70
MAR          320 × 240     38.6
MAR          640 × 480     22.4
MAR          1024 × 768    12
MAR          1280 × 960    9.4

5.2 Evaluation on changedetection.net data set

Changedetection.net [5,14] is a benchmark data set containing several video sequences that can be used to quantitatively evaluate BS methods. To test our algorithm and to compare it with the methods implemented in OpenCV, we used five sequences containing water background in the category “Dynamic Background”. The results obtained by using two selected ground truth images for each sequence are reported in Fig. 11. As shown in Tab. 3 and Tab. 4, IMBS and IMBS* have the same performance on the five sequences, since no reflections are present in the data. This is one of the reasons that motivated the creation of the MAR data set: providing very challenging scenarios from real systems.

5.3 Computational Speed

The computational speed of the algorithm has been tested by using different frame sizes for the input images. The results (see Table 5) show the real-time performance of the proposed approach even for 640 × 480 images (22.4 frames per second on the MAR data set). The tests have been carried out on an 8x Intel Core i7-3770 3.40 GHz CPU with 16 GB RAM by using a single-threaded version of IMBS.

6 Conclusions

In this paper, a fast and robust background subtraction algorithm has been described. The approach, called IMBS, is specifically conceived to be effective in maritime scenarios.

Thanks to an on-line clustering algorithm for creating the model and a novel conditional update mechanism, IMBS can achieve good accuracy while maintaining real-time performance.

A noise suppression module that can deal with reflections on the water surface makes it possible to increase the accuracy of the foreground detection.

The method has been quantitatively compared with other BS methods implemented in the OpenCV library by using publicly available video sequences from the MAR - Maritime Activity Recognition data set [19] and from the changedetection.net data set [5]. The results of the comparison demonstrate the effectiveness of the proposed approach.

Fig. 10 IMBS qualitative comparison with OpenCV BS algorithms. The first two columns illustrate the analyzed frame and the ground truth, respectively. The remaining columns show the foreground obtained by using the OpenCV algorithms and the IMBS method (last two columns).

References

1. Ablavsky, V.: Background models for tracking objects in water. In: ICIP (3), pp. 125–128 (2003)
2. ATON: Autonomous Agents for On-Scene Networked Incident Management. http://cvrr.ucsd.edu/aton/testbed
3. Bloisi, D., Iocchi, L.: Independent multimodal background subtraction. In: Proc. of the Third Int. Conf. on Computational Modeling of Objects Presented in Images: Fundamentals, Methods and Applications, pp. 39–44 (2012)
4. Bloisi, D.D., Iocchi, L.: ARGOS - a video surveillance system for boat traffic monitoring in Venice. International Journal of Pattern Recognition and Artificial Intelligence 23(7), 1477–1502 (2009)
5. changedetection.net: Benchmark dataset. http://www.changedetection.net/
6. Cheung, S., Kamath, C.: Robust techniques for background subtraction in urban traffic video. In: Visual Comm. and Image Proc., vol. 5308, pp. 881–892 (2004)
7. Cristani, M., Farenzena, M., Bloisi, D., Murino, V.: Background subtraction for automated multisensor surveillance: A comprehensive review. EURASIP J. Adv. Sig. Proc. pp. 1–24 (2010)


Table 1 A comparison of IMBS with OpenCV methods on the MAR data set.

Seq.  Error    MOG      MOG2     GMG      IMBS     IMBS*
1     f. neg.  14418    6106     1152     3625     5061
      f. pos.  15595    514082   108899   46272    30058
      tot. e.  30013    520188   110051   49897    35119
2     f. neg.  80959    11910    22252    9077     9604
      f. pos.  7432     141628   43820    25259    23306
      tot. e.  88391    153538   66072    34336    32910
3     f. neg.  29659    4761     13416    8526     11161
      f. pos.  5153     207511   35204    34219    44587
      tot. e.  34812    212272   48620    42745    55748
4     f. neg.  45634    9759     17148    9848     10153
      f. pos.  9339     151342   40220    34301    29516
      tot. e.  54973    161101   57368    44149    39669
5     f. neg.  72158    9946     45873    25162    21567
      f. pos.  5730     151103   43661    44640    21131
      tot. e.  77888    161049   89534    69802    42698
6     f. neg.  46741    7197     26856    7265     10311
      f. pos.  1825     153352   13102    13005    10020
      tot. e.  48566    160549   39958    20270    20331
7     f. neg.  35130    9225     10574    7682     7279
      f. pos.  25394    361048   76184    53041    24729
      tot. e.  60524    370273   86758    60723    32367
8     f. neg.  80959    11910    22252    9077     7638
      f. pos.  7432     141628   43820    25259    24729
      tot. e.  88391    153538   66072    34336    32367
9     f. neg.  16384    1011     3568     1135     1060
      f. pos.  5677     182025   37908    24989    20676
      tot. e.  22061    183036   41476    26124    21736
10    f. neg.  16707    10337    15499    1807     1737
      f. pos.  666      97782    9298     11831    11549
      tot. e.  17373    108119   24797    13638    13286
Tot. Err.      522992   2183663  630706   396020   348985

Table 2 Computation of F-measure, Detection Rate and False Alarm Rate on MAR data set.

           MOG    MOG2   GMG    IMBS   IMBS*
F-measure  0.67   0.65   0.70   0.86   0.86
DR         0.30   0.70   0.60   0.54   0.53
FAR        0.10   0.45   0.23   0.14   0.14
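For readers who want to compute measures of this kind from raw pixel counts, the sketch below uses widely adopted definitions: DR as recall, TP / (TP + FN); FAR as FP / (TP + FP), i.e., one minus precision; and the F-measure as the harmonic mean of precision and recall. These formulas, the function names, and the example counts are assumptions made for illustration. The exact definitions and averaging procedure behind the values in Table 2 are those given elsewhere in the paper and may differ, so this code is not expected to reproduce the table.

```python
def detection_rate(tp, fn):
    """DR (recall): fraction of true foreground pixels that were detected."""
    total = tp + fn
    return tp / total if total else 0.0


def false_alarm_rate(tp, fp):
    """FAR under the assumed convention FP / (TP + FP), i.e. 1 - precision."""
    total = tp + fp
    return fp / total if total else 0.0


def f_measure(tp, fp, fn):
    """Standard F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical pixel counts, not taken from the paper.
    tp, fp, fn = 80, 20, 20
    print(f"DR  = {detection_rate(tp, fn):.2f}")
    print(f"FAR = {false_alarm_rate(tp, fp):.2f}")
    print(f"F   = {f_measure(tp, fp, fn):.2f}")
```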

8. Cucchiara, R., Grana, C., Piccardi, M., Prati, A.: Detecting moving objects, ghosts, and shadows in video streams. PAMI 25(10), 1337–1342 (2003)

9. Dalley, G., Migdal, J., Grimson, W.: Background subtraction for temporally irregular dynamic textures. In: IEEE Workshop on Applications of Computer Vision, pp. 1–7 (2008)

10. Doretto, G., Chiuso, A., Wu, Y.N., Soatto, S.: Dynamic textures. IJCV 51(2), 91–109 (2003)

11. Elgammal, A.M., Harwood, D., Davis, L.S.: Non-parametric model for background subtraction. In: ECCV, pp. 751–767 (2000)

12. Elhabian, S.Y., El-Sayed, K.M., Ahmed, S.H.: Moving object detection in spatial domain using background removal techniques - state-of-art. Recent Patents on Computer Science 1, 32–54 (2008)

13. Godbehere, A.B., Matsukawa, A., Goldberg, K.: Visual tracking of human visitors under variable-lighting conditions for a responsive audio art installation. American Control Conference (ACC) pp. 4305–4312 (2012)

14. Goyette, N., Jodoin, P.M., Porikli, F., Konrad, J., Ishwar, P.: changedetection.net: A new change detection benchmark dataset. In: Proc. IEEE Workshop on Change Detection at CVPR12 (2012)

15. He, Q., Chu, C.H.H.: Detection of reflecting surfaces by a statistical model. In: SPIE (2009)

16. Heikkila, M., Pietikainen, M.: A texture-based method for modeling the background and detecting moving objects. PAMI 28(4), 657–662 (2006)

17. Jug Sequence: Dynamic background sequences. http://www.cs.bu.edu/groups/ivc/data.php

18. Kaewtrakulpong, P., Bowden, R.: An improved adaptive background mixture model for realtime tracking with shadow detection. In: Proc. 2nd European Workshop on Advanced Video Based Surveillance Systems, pp. 135–144 (2001)

19. MAR: Maritime Activity Recognition Dataset. http://labrococo.dis.uniroma1.it/MAR

20. Mittal, A., Paragios, N.: Motion-based background subtraction using adaptive kernel density estimation. In: CVPR, pp. 302–309 (2004)


Fig. 11 Results on the changedetection.net data set [5]. The first two columns illustrate the analyzed frame and the ground truth, respectively. The remaining columns show the foreground obtained by using the OpenCV algorithms and the IMBS method (last two columns).

21. Noriega, P., Bernier, O.: Real time illumination invariant background subtraction using local kernel histograms. In: Proc. BMVC, pp. 100.1–100.10 (2006)

22. Oliver, N.M., Rosario, B., Pentland, A.P.: A bayesian computer vision system for modeling human interactions. PAMI 22(8), 831–843 (2000)

23. OpenCV: Open Source Computer Vision. http://opencv.org

24. Piccardi, M.: Background subtraction techniques: a review. In: Proc. of the IEEE Int. Conf. on Systems, Man & Cybernetics, pp. 3099–3104 (2004)

25. Rankin, A.L., Matthies, L.H., Huertas, A.: Daytime water detection by fusing multiple cues for autonomous off-road navigation. In: 24th Army Science Conference, 9 (2004)

26. Sheikh, Y., Shah, M.: Bayesian object detection in dynamic scenes. In: CVPR, pp. 74–79 (2005)

27. Stauffer, C., Grimson, W.: Adaptive background mixture models for real-time tracking. CVPR 2, 246–252 (1999)

28. Stenger, B., Ramesh, V., Paragios, N., Coetzee, F., Buhmann, J.: Topology free hidden markov models: application to background modeling. In: ICCV, vol. 1, pp. 294–301 (2001)

29. Tavakkoli, A., Nicolescu, M., Bebis, G.: Robust recursive learning for foreground region detection in videos with quasi-stationary backgrounds. In: ICPR, pp. 315–318 (2006)

30. Tian, Y., Feris, R.S., Liu, H., Hampapur, A., Sun, M.T.: Robust detection of abandoned and removed objects in complex surveillance videos. IEEE Transactions on Systems, Man, and Cybernetics, Part C 41(5), 565–576 (2011)

31. Toyama, K., Krumm, J., Brumitt, B., Meyers, B.: Wallflower: principles and practice of background maintenance. In: ICCV, vol. 1, pp. 255–261 (1999)


Table 3 A comparison of IMBS with OpenCV methods on the changedetection.net data set [5].

Seq.  Error    MOG      MOG2     GMG      IMBS     IMBS*
1     f. neg.  3531     554      1364     2251     2251
      f. pos.  978      18514    12672    38       38
      tot. e.  4509     19068    14036    2289     2289
2     f. neg.  6286     716      2005     1420     1420
      f. pos.  523      33961    17567    325      325
      tot. e.  6809     34677    19572    1745     1745
3     f. neg.  935      230      715      898      898
      f. pos.  1026     13474    770      40       40
      tot. e.  1961     13704    1485     938      938
4     f. neg.  1527     509      829      818      818
      f. pos.  685      13853    713      0        0
      tot. e.  2212     14362    1542     818      818
5     f. neg.  493      148      204      527      527
      f. pos.  742      22945    4127     60       60
      tot. e.  1235     23093    4331     587      587
6     f. neg.  721      185      242      743      743
      f. pos.  1480     29824    6107     1274     1274
      tot. e.  2201     30009    6349     2017     2017
7     f. neg.  1271     219      225      800      800
      f. pos.  146      19630    1457     131      131
      tot. e.  1417     19849    1682     931      931
8     f. neg.  922      164      60       502      502
      f. pos.  144      22699    854      95       95
      tot. e.  1066     22863    914      597      597
9     f. neg.  23       5        12       37       37
      f. pos.  374      475      11674    116      116
      tot. e.  397      480      11686    153      153
10    f. neg.  3188     2712     2922     2730     2730
      f. pos.  270      11747    283      1676     1676
      tot. e.  3458     14459    3205     4406     4406
Tot. Err.      25265    192564   64802    14481    14481
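The error figures in Table 3 follow a simple arithmetic: in each row, the total error (tot. e.) is the sum of the false negatives and the false positives, and the last line (Tot. Err.) is the sum of the ten row totals. The short Python sketch below checks this for the MOG and IMBS columns, with the counts copied from the table; the variable and function names are chosen here for illustration.

```python
# False negatives and false positives for the ten rows of Table 3,
# copied from the MOG and IMBS columns.
F_NEG = {
    "MOG":  [3531, 6286, 935, 1527, 493, 721, 1271, 922, 23, 3188],
    "IMBS": [2251, 1420, 898, 818, 527, 743, 800, 502, 37, 2730],
}
F_POS = {
    "MOG":  [978, 523, 1026, 685, 742, 1480, 146, 144, 374, 270],
    "IMBS": [38, 325, 40, 0, 60, 1274, 131, 95, 116, 1676],
}


def per_row_errors(method):
    """Total error of each row: false negatives plus false positives."""
    return [fn + fp for fn, fp in zip(F_NEG[method], F_POS[method])]


def total_error(method):
    """Overall error: sum of the per-row totals."""
    return sum(per_row_errors(method))


if __name__ == "__main__":
    for method in ("MOG", "IMBS"):
        print(method, per_row_errors(method), total_error(method))
    # MOG sums to 25265 and IMBS to 14481, matching the Tot. Err. line.
```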

Table 4 Computation of F-measure, Detection Rate and False Alarm Rate on changedetection.net data set.

           MOG    MOG2   GMG    IMBS   IMBS*
F-measure  0.70   0.63   0.73   0.77   0.77
DR         0.34   0.78   0.67   0.46   0.46
FAR        0.06   0.20   0.06   0.05   0.05

32. Vacavant, A., Chateau, T., Wilhelm, A., Lequivre, L.: A benchmark dataset for outdoor foreground/background extraction. In: ACCV 2012, Workshop: Background Models Challenge (2012)

33. Wallflower Sequence: Test Images for Wallflower Paper. http://research.microsoft.com/en-us/um/people/jckrumm/wallflower/testimages.htm

34. Wang, H., Suter, D.: Background subtraction based on a robust consensus method. In: ICPR, pp. 223–226 (2006)

35. Water surface sequence: Statistical Modeling of Complex Background for Foreground Object Detection. http://perception.i2r.a-star.edu.sg/bk%5Fmodel/bk%5Findex.html

36. Zhang, S., Yao, H., Liu, S.: Dynamic background modeling and subtraction using spatio-temporal local binary patterns. In: ICIP, pp. 1556–1559 (2008)

37. Zhang, S., Yao, H., Liu, S.: Dynamic background subtraction based on local dependency histogram. IJPRAI 23(7), 1397–1419 (2009)

38. Zhang, S., Yao, H., Liu, S.: Spatial-temporal nonparametric background subtraction in dynamic scenes. In: ICME, pp. 518–521 (2009)

39. Zhang, S., Yao, H., Liu, S., Chen, X., Gao, W.: A covariance-based method for dynamic background subtraction. In: ICPR, pp. 1–4 (2008)

40. Zhao, J., Xu, X., Ding, X.: New goodness of fit tests based on stochastic edf. Commun. Stat., Theory Methods 39(6), 1075–1094 (2010)

41. Zhao, M., Bu, J., Chen, C.: Robust background subtraction in hsv color space. In: SPIE: Multimedia Systems and Applications, pp. 325–332 (2002)

42. Zhong, B., Yao, H., Shan, S., Chen, X., Gao, W.: Hierarchical background subtraction using local pixel clustering. In: ICPR, pp. 1–4 (2008)

43. Zhong, J., Sclaroff, S.: Segmenting foreground objects from a dynamic textured background via a robust kalman filter. In: ICCV, pp. 44–50 (2003)

44. Zivkovic, Z.: Improved adaptive gaussian mixture model for background subtraction. In: International Conference on Pattern Recognition, vol. 2, pp. 28–31 (2004)


Domenico Daniele Bloisi is an Assistant Professor at Sapienza University of Rome, Italy. He received his PhD, M.Sc., and B.Sc. degrees in Computer Engineering from Sapienza University of Rome in 2010, 2006 and 2004, respectively. His main research interests are related to intelligent surveillance (including object detection, visual tracking, and multiple sensor data fusion) and robotics. He is a faculty member of the RoCoCo (Cognitive Cooperating Robots) laboratory.

Andrea Pennisi is a Ph.D. student at Sapienza University of Rome, Italy. He received his M.Sc. in Computer Engineering from Sapienza University of Rome in 2010 and his B.Sc. in Computer Engineering from the University of Catania in 2006. His main research interests are related to multi-sensor surveillance, crowd analysis, and image processing. He is a member of the RoCoCo (Cognitive Cooperating Robots) laboratory.

Luca Iocchi is an Associate Professor at Sapienza University of Rome, Italy. His main research interests are in the areas of cognitive robotics, action planning, multi-robot coordination, robotic perception, robot learning, and sensor data fusion. He is involved in several projects aiming at developing intelligent robotic systems and intelligent surveillance systems. He is active in many conferences and journals related to artificial intelligence and robotics, as well as in the organization of scientific competitions, such as RoboCup@Home.
