background subtraction
DESCRIPTION
Background Subtraction. Purpose of background subtraction: reduce the problem set for further processing; only process the part of the picture that contains the relevant information; segment the image into foreground and background; add a virtual background. Encountered problems: lighting, shadows.
TRANSCRIPT
1
Background Subtraction
2
Purpose of Background Subtraction
• Reduce the problem set for further processing
• Only process the part of the picture that contains the relevant information
• Segment the image into foreground and background
• Add a virtual background
3
Encountered Problems
• Lighting: shadows, gradual/sudden illumination changes
• Camouflage
• Moving objects: foreground aperture, foreground object becomes motionless
4
Lighting and Shadows
• Weight the luminance with other characteristics: depth of object, region/frame information
• Adjust the background model with time: store a history of previous backgrounds
5
Widely Used
6
Simple Approach
7
Frame Differencing
8
The accuracy of this approach is dependent on speed of movement in the scene. Faster movements may require higher thresholds
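The differencing formula itself did not survive in this transcript. The sketch below assumes the usual form, in which a pixel is foreground when the absolute difference between consecutive frames exceeds a threshold. The function name `frame_difference_mask`, the threshold value, and the toy frames are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Mark a pixel as foreground when it changed by more than `threshold`."""
    # Cast to a signed type so subtracting uint8 frames cannot wrap around.
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Toy 4x4 frames: a bright 2x2 object moves one pixel to the right.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
curr_frame = np.zeros((4, 4), dtype=np.uint8)
prev_frame[1:3, 0:2] = 200
curr_frame[1:3, 1:3] = 200
mask = frame_difference_mask(prev_frame, curr_frame)
```

In this toy case only the leading and trailing edges of the object are flagged. The column where the object overlaps itself in both frames stays unmarked, which is the foreground aperture problem listed earlier.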
Frame Differencing
9
This approach will only work for cases where all foreground pixels are moving and all background pixels are static.
Mean Filter
10
where N is the number of preceding images taken for averaging. This averaging refers to averaging corresponding pixels in the given images. N depends on the video speed (number of images per second in the video) and the amount of movement in the video.
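The averaging equation is missing from this transcript. A minimal sketch, assuming the background is the per-pixel mean of the N preceding frames and that a pixel is foreground when it differs from that mean by more than a threshold. The helper names, the threshold, and the toy data are assumptions.

```python
import numpy as np

def mean_background(frames):
    """Average corresponding pixels over the N preceding frames."""
    return np.mean(np.stack(frames).astype(np.float64), axis=0)

def foreground_mask(frame, background, threshold=30.0):
    """Foreground where the frame departs from the averaged background."""
    return np.abs(frame.astype(np.float64) - background) > threshold

# Toy data: N = 3 preceding frames of a constant scene with slight flicker.
history = [np.full((3, 3), v, dtype=np.uint8) for v in (99, 100, 101)]
background = mean_background(history)
frame = np.full((3, 3), 100, dtype=np.uint8)
frame[1, 1] = 180          # one pixel covered by a foreground object
mask = foreground_mask(frame, background)
```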
Mean Filter
11
Median Filter
Nathan Johnson 12
Median Filter
13
Advantages & Shortcomings
14
Advantages & Shortcomings
15
Adaptive Background Mixture Models
16
Normal Gaussian Distribution
17
As a result, a pixel, once it has become foreground, can only become background again when the intensity value gets close to what it was before turning foreground. This method, however, has several issues: It only works if all pixels are initially background pixels (or foreground pixels are annotated as such).
Running Gaussian Average
The pdf of every pixel is characterized by mean and variance.
In order to initialize variance, we can, for example, use the variance in x and y from a small window around each pixel.
Note that the background may change over time (e.g. due to illumination changes or non-static background objects). To accommodate that change, every pixel's mean and variance must be updated at every frame.
18
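The update equations are not present in this transcript. The sketch below assumes the commonly used running-average form with a learning rate `rho`, a foreground test at `k` standard deviations, and a selective update that freezes the statistics of foreground pixels. All names and parameter values are assumptions.

```python
import numpy as np

def update_running_gaussian(frame, mean, var, rho=0.05, k=2.5):
    """One per-pixel update step; returns (foreground_mask, new_mean, new_var)."""
    frame = frame.astype(np.float64)
    # Foreground when the pixel lies more than k standard deviations from the mean.
    foreground = (frame - mean) ** 2 > (k ** 2) * var
    new_mean = rho * frame + (1.0 - rho) * mean
    new_var = rho * (frame - new_mean) ** 2 + (1.0 - rho) * var
    # Selective update: keep the old statistics where the pixel is foreground.
    new_mean = np.where(foreground, mean, new_mean)
    new_var = np.where(foreground, var, new_var)
    return foreground, new_mean, new_var

mean = np.full((2, 2), 100.0)
var = np.full((2, 2), 16.0)    # standard deviation of 4 gray levels
frame = np.array([[102, 100], [100, 160]], dtype=np.uint8)
fg, mean, var = update_running_gaussian(frame, mean, var)
```

With the selective update, a foreground pixel can only return to background when its intensity comes back close to the stored mean, which is the behaviour described on this slide.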
Algorithm Overview
19
If a pixel is categorized as foreground for too long, the background intensity at that location might have changed in the meantime (for example, because the illumination has changed). As a result, once the foreground object is gone, the new background intensity might no longer be recognized as background.
Mixture Model
20
Mixture Model
21
Model Adaptation
22
Background Model Estimation
23
Background Model Estimation
24
Advantages Vs Shortcomings
25
Relevant papers
26
Summary
27
Selectivity
28
Selectivity
29
Limitations (Selectivity)
30
Kernel Density Estimators
31
Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made based on a finite data sample. Kernel density estimates are closely related to histograms, but can be endowed with properties such as smoothness or continuity by using a suitable kernel.
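A minimal sketch of a kernel density estimate, assuming a Gaussian kernel and a one-dimensional sample such as the recent intensities of a single pixel. The function name, the bandwidth, and the sample values are hypothetical.

```python
import numpy as np

def gaussian_kde_pdf(samples, x, bandwidth):
    """Kernel density estimate at points x from a finite 1-D sample."""
    samples = np.asarray(samples, dtype=np.float64)
    x = np.atleast_1d(np.asarray(x, dtype=np.float64))
    # One Gaussian kernel centred on every sample, evaluated at every x.
    z = (x[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Recent intensities observed at one pixel (hypothetical values).
history = [100, 102, 98, 101, 99, 100]
p_background, p_outlier = gaussian_kde_pdf(history, [100, 160], bandwidth=3.0)
```

A new intensity with a low estimated density, such as 160 here, would be a candidate foreground value under this kind of model.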
Mean-Shift Based Estimation
Mean Shift Based Estimation
33
Mean shift is a procedure for locating the maxima of a density function given discrete data sampled from that function.
The simplest such algorithm would create a confidence map in the new image based on the color histogram of the object in the previous image, and use mean shift to find the peak of a confidence map near the object's old position.
The confidence map is a probability density function on the new image, assigning each pixel of the new image a probability, which is the probability of the pixel color occurring in the object in the previous image.
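The peak search on a confidence map can be sketched as follows, assuming a Gaussian kernel over pixel positions weighted by the confidence values. The bandwidth, the stopping tolerance, and the synthetic map are assumptions made for illustration.

```python
import numpy as np

def mean_shift_peak(confidence, start, bandwidth=2.0, tol=1e-3, max_iter=100):
    """Climb to a nearby mode of `confidence` with Gaussian-kernel mean shift."""
    rows, cols = np.indices(confidence.shape)
    points = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(np.float64)
    weights = confidence.ravel().astype(np.float64)
    pos = np.asarray(start, dtype=np.float64)
    for _ in range(max_iter):
        sq_dist = ((points - pos) ** 2).sum(axis=1)
        k = weights * np.exp(-0.5 * sq_dist / bandwidth ** 2)
        if k.sum() == 0:
            return pos          # no evidence near the current position
        # Shift to the kernel-weighted mean of the pixel positions.
        new_pos = (k[:, None] * points).sum(axis=0) / k.sum()
        if np.linalg.norm(new_pos - pos) < tol:
            return new_pos
        pos = new_pos
    return pos

# Synthetic confidence map: a Gaussian bump centred at row 6, column 7.
yy, xx = np.mgrid[0:12, 0:12]
confidence = np.exp(-((yy - 6) ** 2 + (xx - 7) ** 2) / (2 * 2.0 ** 2))
peak = mean_shift_peak(confidence, start=(4, 5))   # start at the old position
```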
Problems & Solutions
34
Eigen Backgrounds
The method uses the difference between input image and the reconstructed background image to detect foreground objects based on eigenvalue decomposition.
Foreground regions are poorly represented in the image reconstructed from the eigenbackgrounds, so they stand out in the difference image.
As the principal components of a scene are in general its background, the eigenbackgrounds that are used to reconstruct the background depict the characteristics of the background.
35
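A minimal sketch of the eigenbackground idea, assuming the eigenbackgrounds are obtained from an SVD of mean-centred training frames and that foreground is whatever the reconstruction fails to explain. The function names, the threshold, and the synthetic scene are assumptions.

```python
import numpy as np

def fit_eigenbackground(frames, n_components):
    """Learn the mean image and the leading eigenbackgrounds from training frames."""
    data = np.stack([f.ravel() for f in frames]).astype(np.float64)
    mean = data.mean(axis=0)
    # Rows of vt are the principal directions of the centred training frames.
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def eigen_foreground(frame, mean, basis, threshold=30.0):
    """Project onto the eigenbackgrounds, reconstruct, and threshold the residual."""
    x = frame.ravel().astype(np.float64) - mean
    reconstructed = mean + basis.T @ (basis @ x)
    residual = np.abs(frame.ravel() - reconstructed)
    return (residual > threshold).reshape(frame.shape)

# Toy background whose brightness varies globally between training frames.
rng = np.random.default_rng(0)
pattern = rng.uniform(50, 150, size=(8, 8))
train = [pattern * s for s in (0.8, 0.9, 1.0, 1.1, 1.2)]
mean, basis = fit_eigenbackground(train, n_components=1)
test_frame = pattern * 1.05
test_frame[2:4, 2:4] += 100.0      # small foreground object
mask = eigen_foreground(test_frame, mean, basis)
```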
Eigen-backgrounds
36
Spatial Correlation
37
Binary Morphology
Binary morphological techniques probe an image with a small shape or template called a structuring element.
The structuring element is positioned at all possible locations in the image and it is compared with the corresponding neighbourhood of pixels.
Some operations test whether the element "fits" within the neighbourhood, while others test whether it "hits" or intersects the neighbourhood.
38
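The "fits" and "hits" tests correspond to erosion and dilation. A minimal sketch, assuming an odd-sized, symmetric structuring element and zero padding at the image border; the helper names are illustrative.

```python
import numpy as np

def _neighbourhoods(image, selem):
    """Yield (row, col, patch) for every position, padding the border with 0."""
    sh, sw = selem.shape
    padded = np.pad(image.astype(bool), ((sh // 2, sh // 2), (sw // 2, sw // 2)))
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            yield r, c, padded[r:r + sh, c:c + sw]

def erode(image, selem):
    """Keep a pixel only where the structuring element fits entirely."""
    out = np.zeros(image.shape, dtype=bool)
    for r, c, patch in _neighbourhoods(image, selem):
        out[r, c] = np.all(patch[selem])
    return out

def dilate(image, selem):
    """Set a pixel wherever the structuring element hits any foreground."""
    out = np.zeros(image.shape, dtype=bool)
    for r, c, patch in _neighbourhoods(image, selem):
        out[r, c] = np.any(patch[selem])
    return out

selem = np.ones((3, 3), dtype=bool)
mask = np.zeros((7, 7), dtype=bool)
mask[2:5, 2:5] = True      # a 3x3 blob
mask[0, 6] = True          # an isolated noise pixel
opened = dilate(erode(mask, selem), selem)   # opening: erosion, then dilation
```

Erosion followed by dilation (an opening) removes the isolated pixel while restoring the blob, which is one way to clean a foreground mask.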
Summary
39
Summary
40
Definition of Motion Detection
Action of sensing physical movement in a given area
Motion can be detected by measuring change in speed or vector of an object
41
42
Motion Detection
Goals of motion detection
• Identify moving objects
• Detection of unusual activity patterns
• Computing trajectories of moving objects
Applications of motion detection
• Indoor/outdoor security
• Real time crime detection
• Traffic monitoring
Many intelligent video analysis systems are based on motion detection.
43
Two Approaches to Motion Detection
• Optical Flow
  • Compute motion within region or the frame as a whole
• Change detection
  • Detect objects within a scene
  • Track object across a number of frames
44
Background Subtraction
• Uses a reference background image for comparison purposes.
• Current image (containing target object) is compared to reference image pixel by pixel.
• Places where there are differences are detected and classified as moving objects.
Motivation: simple difference of two images shows moving objects
45
[Figure: (a) original scene; (b) same scene later; subtraction of scene a from scene b; subtracted image with threshold of 100]
46
Static Scene Object Detection and Tracking
• Model the background and subtract to obtain object mask
• Filter to remove noise
• Group adjacent pixels to obtain objects
• Track objects between frames to develop trajectories
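The grouping step can be sketched as connected-component labelling. This is a minimal illustration assuming 4-connectivity and a breadth-first fill; the slides do not say which grouping method was used, and the function name is hypothetical.

```python
from collections import deque
import numpy as np

def label_components(mask):
    """Label 4-connected foreground regions; returns (labels, count)."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    count = 0
    for r, c in zip(*np.nonzero(mask)):
        if labels[r, c]:
            continue  # already reached from an earlier seed
        count += 1
        labels[r, c] = count
        queue = deque([(r, c)])
        while queue:
            y, x = queue.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                inside = 0 <= ny < mask.shape[0] and 0 <= nx < mask.shape[1]
                if inside and mask[ny, nx] and not labels[ny, nx]:
                    labels[ny, nx] = count
                    queue.append((ny, nx))
    return labels, count

mask = np.zeros((5, 6), dtype=bool)
mask[0:2, 0:2] = True     # first object
mask[3:5, 3:6] = True     # second object
labels, count = label_components(mask)
# Per-object centroids (row, col); matching these between frames gives trajectories.
centroids = [np.argwhere(labels == k).mean(axis=0) for k in range(1, count + 1)]
```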
47
Background Modelling
48
Background Model
49
After Background Filtering…
50
Approaches to Background Modeling
• Background Subtraction
• Statistical Methods (e.g., Gaussian Mixture Model, Stauffer and Grimson 2000)
Background Subtraction:
1. Construct a background image B as average of few images
2. For each actual frame I, classify individual pixels as foreground if |B-I| > T (threshold)
3. Clean noisy pixels
51
Background Subtraction
[Figure: background image; current image]
52
Statistical Methods
Pixel statistics: average and standard deviation of color and gray level values (e.g., W4 by Haritaoglu, Harwood, and Davis 2000)
Gaussian Mixture Model (e.g., Stauffer and Grimson 2000)
53
Proposed Approach: Measuring Texture Change
Classical approaches to motion detection are based on background subtraction, i.e., a model of background image is computed, e.g., Stauffer and Grimson (2000)
Our approach does not model any background image.
We estimate the speed of texture change.
54
In our system we divide the video plane into disjoint blocks (4x4 pixels) and compute a motion measure for each block.
mm(x,y,t) for a given block location (x,y) is a function of t
55
8x8 Blocks
56
Block size relative to image size
Block 24x28
1728 blocks per frame
Image size: 36x48 blocks
57
Motion Measure Computation
• We use spatial-temporal blocks to represent videos
• Each block consists of NBLOCK x NBLOCK pixels from 3 consecutive frames
• Those pixel values are reduced to K principal components using PCA (Karhunen-Loève trans.)
• In our applications, NBLOCK=4, K=10
• Thus, we project 48 gray level values to a texture vector with 10 PCA components
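A minimal sketch of the block projection, assuming the PCA basis is learned from the block vectors themselves via an SVD. The slides do not say how the basis was trained, and the 144 x 192 random frames (36 x 48 blocks of 4 x 4 pixels) are a hypothetical stand-in for real video.

```python
import numpy as np

NBLOCK, K = 4, 10   # values quoted on the slide

def block_vectors(prev_f, curr_f, next_f, nblock=NBLOCK):
    """Flatten every nblock x nblock x 3 spatio-temporal block into one row."""
    stack = np.stack([prev_f, curr_f, next_f]).astype(np.float64)   # (3, H, W)
    h, w = stack.shape[1] // nblock, stack.shape[2] // nblock
    stack = stack[:, :h * nblock, :w * nblock]
    blocks = stack.reshape(3, h, nblock, w, nblock).transpose(1, 3, 0, 2, 4)
    return blocks.reshape(h * w, 3 * nblock * nblock)

def fit_pca(vectors, k=K):
    """Return the mean and the k leading principal directions of the rows."""
    mean = vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(vectors - mean, full_matrices=False)
    return mean, vt[:k]

rng = np.random.default_rng(0)
frames = rng.integers(0, 256, size=(3, 144, 192))   # frames t-1, t, t+1
vectors = block_vectors(*frames)                    # one 48-value row per block
mean, components = fit_pca(vectors)
texture = (vectors - mean) @ components.T           # one 10-value texture vector per block
```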
58
3D Block Projection with PCA (Karhunen-Loève trans.)
[Figure: a 4*4*3 spatial-temporal block at location I=24, J=28 and times t-1, t, t+1 is flattened into a 48-component block vector (4*4*3) and projected onto 10 principal components: -0.5221 -0.0624 -0.1734 -0.2221 -0.2621 -0.4739 -0.4201 -0.4224 -0.0734 -0.1386]
Motion Measure Computation
59
Texture of spatiotemporal blocks works better than color pixel values
• More robust
• Faster
We illustrate this with texture trajectories.
60
[Figure: panels labeled 499, 624, 863, 1477]
61
Detection of Moving Objects Based on Local Variation
For each block location (x,y) in the video plane
• Consider texture vectors in a symmetric window [t-W, t+W] at time t
• Compute the covariance matrix
• Motion measure is defined as the largest eigenvalue of the covariance matrix
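The steps above can be sketched for a single block location as follows. The trajectory is synthetic, and the window half-width and noise levels are assumptions chosen only to show a static and a changing segment.

```python
import numpy as np

def motion_measure(texture_vectors, t, w):
    """Largest eigenvalue of the covariance of texture vectors in [t - w, t + w]."""
    window = texture_vectors[t - w:t + w + 1]     # (2w + 1) texture vectors
    cov = np.cov(window, rowvar=False)            # covariance between components
    return np.linalg.eigvalsh(cov)[-1]            # eigvalsh sorts ascending

# Hypothetical texture trajectory of one block: static, then changing.
rng = np.random.default_rng(1)
static = rng.normal(0.0, 0.01, size=(20, 10))
moving = rng.normal(0.0, 1.0, size=(20, 10))
trajectory = np.vstack([static, moving])
mm_static = motion_measure(trajectory, t=10, w=3)
mm_moving = motion_measure(trajectory, t=30, w=3)
```

The caller is assumed to keep `t - w` and `t + w` inside the trajectory; thresholding the returned value would then mark the block as moving or static.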
Background Subtraction and Matting
Image Stack
• We can look at video data as a spatio-temporal volume
• If the camera is stationary, each line through time corresponds to a single ray in space
• We can look at how each ray behaves
• What are interesting things to ask?
[Figure: intensity of one ray over time; axes: time, intensity from 0 to 255]
Background Subtraction
A largely unsolved problem…
[Figure: one video frame; estimated background; difference image; thresholded foreground on blue]
How does Superman fly?
Super-human powers?
OR Image Matting and Compositing?
Compositing Procedure
1. Extract Sprites (e.g., using Intelligent Scissors in Photoshop)
2. Blend them into the composite (in the right order)
Multiple Alpha Blending
So far we assumed that one image (background) is opaque.
If blending semi-transparent sprites (the “A over B” operation):
$I_{comp} = \alpha_a I_a + (1-\alpha_a)\alpha_b I_b$
$\alpha_{comp} = \alpha_a + (1-\alpha_a)\alpha_b$
Note: sometimes alpha is premultiplied: im($\alpha R$, $\alpha G$, $\alpha B$, $\alpha$):
$I_{comp} = I_a + (1-\alpha_a) I_b$
(same for alpha!)
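A minimal sketch of the “A over B” operation in both forms, assuming colours and alphas are floats in [0, 1]. The function names and the red-over-blue example are illustrative.

```python
import numpy as np

def over(color_a, alpha_a, color_b, alpha_b):
    """A over B with non-premultiplied inputs.

    The returned colour is already weighted by the composite alpha,
    i.e. it is in premultiplied form.
    """
    color_comp = alpha_a * color_a + (1.0 - alpha_a) * alpha_b * color_b
    alpha_comp = alpha_a + (1.0 - alpha_a) * alpha_b
    return color_comp, alpha_comp

def over_premultiplied(pm_a, alpha_a, pm_b, alpha_b):
    """A over B when both colours are already multiplied by their alpha."""
    return pm_a + (1.0 - alpha_a) * pm_b, alpha_a + (1.0 - alpha_a) * alpha_b

red = np.array([1.0, 0.0, 0.0])
blue = np.array([0.0, 0.0, 1.0])
# A half-transparent red sprite over an opaque blue one.
color, alpha = over(red, 0.5, blue, 1.0)
pm_color, pm_alpha = over_premultiplied(0.5 * red, 0.5, 1.0 * blue, 1.0)
```

Both calls give the same composite, which is why the premultiplied form lets colour and alpha share one formula.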
“Pulling a Matte”
Problem definition:
• The separation of an image C into a foreground object image Co, a background image Cb, and an alpha matte $\alpha$
• Co and $\alpha$ can then be used to composite the foreground object into a different image
Hard problem:
• Even if alpha is binary, this is hard to do automatically (background subtraction problem)
• For movies/TV, manual segmentation of each frame is infeasible
• Need to make a simplifying assumption…
Blue Screen
Blue Screen matting
Most common form of matting in TV studios & movies
A form of background subtraction:
• Need a known background
• Compute alpha as SSD(C,Cb) > threshold
• Or use Vlahos’ formula: $\alpha = 1-p_1(B-p_2 G)$
• Hope that foreground object doesn’t look like background (no blue ties!)
Why blue? Why uniform?
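Both ways of computing alpha can be sketched as below, assuming RGB images with channels in [0, 1]. The clamp to [0, 1], the values chosen for p1 and p2, and the threshold are assumptions; in practice p1 and p2 are tuning parameters.

```python
import numpy as np

def alpha_from_ssd(image, background, threshold):
    """Binary alpha: 1 where the pixel differs from the known background."""
    diff = image.astype(np.float64) - background.astype(np.float64)
    ssd = (diff ** 2).sum(axis=-1)      # sum of squared differences over R, G, B
    return (ssd > threshold).astype(np.float64)

def alpha_vlahos(image, p1, p2):
    """Vlahos' formula alpha = 1 - p1 * (B - p2 * G), clamped to [0, 1]."""
    g = image[..., 1].astype(np.float64)
    b = image[..., 2].astype(np.float64)
    return np.clip(1.0 - p1 * (b - p2 * g), 0.0, 1.0)

# Toy 1x2 image: a pure blue backing pixel and a skin-toned pixel.
image = np.array([[[0.0, 0.0, 1.0], [0.8, 0.6, 0.5]]])
backing = np.zeros_like(image)
backing[..., 2] = 1.0
alpha_binary = alpha_from_ssd(image, backing, threshold=0.1)
alpha_soft = alpha_vlahos(image, p1=1.0, p2=1.0)
```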
The Ultimatte
p1 and p2
Blue screen for superman?
Semi-transparent mattes
What we really want is to obtain a true alpha matte, which involves semi-transparency
• Alpha between 0 and 1
Matting Problem: Mathematical Definition
Why is general matting hard?
Solution #1: No Blue!
More examples
Removing Shadows (Weiss, 2001)
How does one detect (subtract away) shadows?
Averaging Derivatives
Recovering Shadows
Compositing with Shadows
Outline
• Applications of segmentation to video
• Motion and perceptual organization
• Motion field
• Optical flow
• Motion segmentation with layers
Video
• A video is a sequence of frames captured over time
• Now our image data is a function of space (x, y) and time (t)
Applications of segmentation to video
• Background subtraction
  • A static camera is observing a scene
  • Goal: separate the static background from the moving foreground
Applications of segmentation to video
• Background subtraction
  • Form an initial background estimate
  • For each frame:
    • Update estimate using a moving average
    • Subtract the background estimate from the frame
    • Label as foreground each pixel where the magnitude of the difference is greater than some threshold
    • Use median filtering to “clean up” the results
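The loop above can be sketched as follows, assuming an exponential moving average with weight `alpha`, the first frame as the initial estimate, and a 3x3 median filter on the binary mask. These choices and the toy frames are assumptions; the slide does not fix them.

```python
import numpy as np

def median_filter3(mask):
    """3x3 median of a binary mask: keep a pixel if at least 5 of 9 cells are set."""
    padded = np.pad(mask.astype(np.int32), 1)
    h, w = mask.shape
    votes = sum(padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3))
    return votes >= 5

def subtract_background(frames, alpha=0.1, threshold=30.0):
    """Yield a cleaned foreground mask for every frame after the first."""
    background = frames[0].astype(np.float64)       # initial background estimate
    for frame in frames[1:]:
        frame = frame.astype(np.float64)
        background = alpha * frame + (1.0 - alpha) * background   # moving average
        mask = np.abs(frame - background) > threshold
        yield median_filter3(mask)

frames = [np.full((8, 8), 100, dtype=np.uint8) for _ in range(3)]
frames[2][2:6, 2:6] = 200     # a 4x4 object appears in the last frame
frames[2][0, 7] = 200         # plus one isolated noisy pixel
masks = list(subtract_background(frames))
```

The median filter removes the isolated pixel. It also rounds off the four corners of the square object, a typical side effect of this kind of cleanup.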
Applications of segmentation to video
• Background subtraction
• Shot boundary detection
  • Commercial video is usually composed of shots or sequences showing the same objects or scene
  • Goal: segment video into shots for summarization and browsing (each shot can be represented by a single keyframe in a user interface)
  • Difference from background subtraction: the camera is not necessarily stationary
Motion and perceptual organization
• Sometimes, motion is the only cue
Motion estimation techniques
• Direct methods
  • Directly recover image motion at each pixel from spatio-temporal image brightness variations
  • Dense motion fields, but sensitive to appearance variations
  • Suitable for video and when image motion is small
• Feature-based methods
  • Extract visual features (corners, textured areas) and track them over multiple frames
  • Sparse motion fields, but more robust tracking
  • Suitable when image motion is large (10s of pixels)