
Page 1:

CS 376b Introduction to Computer Vision

03 / 31 / 2008

Instructor: Michael Eckmann

Page 2:

Today's Topics
• Comments/Questions
• Motion
  – background subtraction scheme
  – definitions

Page 3:

Motion
• Detect/describe motion (of an object, of several objects, or of the scene as a whole) from an image sequence, where each frame is separated by some time t (e.g. a video at 30 fps has t = 1/30 sec.).
• There are several cases that lend themselves to different approaches:
  – a non-moving camera imaging a static scene, where we are to detect one moving object
  – a non-moving camera imaging a static scene, where we are to detect multiple moving objects
  – a moving camera imaging a static scene
  – a moving camera and moving objects

Page 4:

Motion
• a non-moving camera imaging a static scene, where we are to detect one moving object --- we can use background subtraction
• This is from figure 9.1 in Shapiro and Stockman (credit due to S.-W. Chen)

Page 5:

Motion
• let's consider an overall algorithm for the real scene just shown
  – when a frame is subtracted from another frame, will the result be black (0)?
    • most likely not, due to various noise, flickering of fluorescent lights, etc.

Page 6:

Motion
• let's consider an overall algorithm for the real scene just shown
  – when a frame is subtracted from another frame, will the result be black (0)?
    • most likely not, due to various noise, flickering of fluorescent lights, etc.
    • so we'd probably need to determine a threshold over which the subtraction is deemed significant
  – is it possible that this threshold ends up causing pixels on the moving object to not be detected as significant? also, can it still detect possibly insignificant pixels too?

Page 7:

Motion
• let's consider an overall algorithm for the real scene just shown
  – is it possible that this threshold ends up causing pixels on the moving object to not be detected as significant? also, can it still detect possibly insignificant pixels too?
    • yes, so we can perform connected components and afterwards remove all the small regions, which are considered due to noise
  – after connected components, we might still have holes to fill --- so what could we do?

Page 8:

Motion
• let's consider an overall algorithm for the real scene just shown
  – after connected components, we might still have holes to fill --- so what could we do?
    • close (morphological closing); a rough sketch of the whole pipeline follows below
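A minimal sketch of the change-detection pipeline just discussed (subtract, threshold, connected components, discard small regions, morphological close), using Python with NumPy/SciPy. This is not the textbook's Algorithm 9.1 verbatim, and the threshold and minimum region size are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def detect_change(frame, background, diff_thresh=25, min_region_size=50):
    # 1) difference the current frame against the background (reference) frame
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))

    # 2) keep only "significant" differences (noise, flicker, etc. fall below)
    mask = diff > diff_thresh

    # 3) connected components: label the significant pixels into regions
    labels, num_regions = ndimage.label(mask)

    # 4) remove small regions, which we attribute to noise
    sizes = ndimage.sum(mask, labels, range(1, num_regions + 1))
    for region_id, size in enumerate(sizes, start=1):
        if size < min_region_size:
            mask[labels == region_id] = False

    # 5) morphological closing to fill holes inside the remaining regions
    mask = ndimage.binary_closing(mask, structure=np.ones((5, 5)))
    return mask  # boolean image: True where motion was detected
```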

Page 9:

Motion
• algorithm 9.1 from Shapiro and Stockman

Page 10:

Motion
• What if you were concerned with just determining if the scene changed, not where?
  – any ideas? (one simple possibility is sketched below)
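One simple possibility (my suggestion, not necessarily the answer given in lecture): skip the localization steps entirely and just count how many pixels differ significantly between the frame and the background. The threshold and fraction below are illustrative assumptions.

```python
import numpy as np

def scene_changed(frame, background, diff_thresh=25, changed_fraction=0.01):
    # declare a change when enough pixels differ significantly, without locating them
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    significant = np.count_nonzero(diff > diff_thresh)
    return significant > changed_fraction * frame.size
```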

Page 11:

Motion
• The motion field is defined as: a 2d array of 2d vectors representing the motion of 3d scene points. Each vector connects the image of a scene point at time t to its image at time t + dt.
• The focus of expansion (FOE) is defined as: the point in the image plane from which motion field vectors diverge. (This is usually the point toward which the camera is moving.)
• The focus of contraction (FOC) is defined as: the point in the image plane toward which motion field vectors converge. (This is usually the point from which the camera is moving away.)
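As a compact restatement of the motion field definition (the notation here is mine, not taken from the textbook): if a 3d scene point projects to image location (x(t), y(t)) at time t, the motion field assigns that pixel the vector

\[
\mathbf{v}(x, y) \;=\; \bigl(x(t+dt) - x(t),\; y(t+dt) - y(t)\bigr),
\]

and the FOE (FOC) is the image point away from which (toward which) these vectors all point.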

Page 12:

Motion

figure 9.2 from Shapiro and Stockman

Page 13:

Motion
• The text describes a video game that detects the motion field in a video of someone using their hands/arms to either run or jump.

Page 14:

Motion
• To compute motion field vectors we'll need to detect and locate interest points with high accuracy.

Page 15:

Motion
• To correspond points from one image to the next:
  – take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood
• How might you compare neighborhoods?

Page 16:

Motion
• To correspond points from one image to the next:
  – take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood
• How might you compare neighborhoods?
  • Can use cross-correlation of the neighborhoods; the largest value is the best match.
  • Also, the smallest SSD can be used as a match value (or L1 or L2 distance too); a sketch follows below.
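A minimal Python/NumPy sketch of the SSD-based neighborhood search described above. The neighborhood half-size and search radius are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def match_point(frame_t, frame_t1, x, y, half_nbhd=5, search_radius=10):
    """Find where the neighborhood around (x, y) in frame_t best matches
    within a small search window of frame_t1, using the smallest SSD."""
    template = frame_t[y - half_nbhd:y + half_nbhd + 1,
                       x - half_nbhd:x + half_nbhd + 1].astype(np.float64)
    best_ssd, best_xy = np.inf, (x, y)
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            cx, cy = x + dx, y + dy
            candidate = frame_t1[cy - half_nbhd:cy + half_nbhd + 1,
                                 cx - half_nbhd:cx + half_nbhd + 1].astype(np.float64)
            if candidate.shape != template.shape:  # skip windows falling off the image
                continue
            ssd = np.sum((template - candidate) ** 2)
            if ssd < best_ssd:
                best_ssd, best_xy = ssd, (cx, cy)
    return best_xy  # the motion field vector is best_xy minus (x, y)
```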

Page 17:

Motion
• To compute motion field vectors we can detect:
  – high-interest points (have high energy in many directions)
  – edges are problematic here (only one direction)
  – corners (two directions)
  – anything that can be located accurately in a later image
  – centroids of moving regions after segmentation could be tracked as well
• discussion on the board of why corners are more accurately located than edges

Page 18:

Motion
• The text describes an interest operator to detect high-interest points
  – the smallest variance among the vertical, horizontal, and 2 diagonal directions in a neighborhood must be above some threshold
  – how well will this be able to be located? (a rough sketch of the operator follows below)
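A rough Python/NumPy sketch of such a directional-variance interest operator, in the spirit of the description above (and of the textbook's Algorithm 9.2) but not copied from it; the neighborhood size and threshold are my own illustrative assumptions.

```python
import numpy as np

def interest_value(image, x, y, half=2):
    """Smallest variance of pixel values along the horizontal, vertical,
    and two diagonal lines through (x, y)."""
    offsets = range(-half, half + 1)
    directions = [(1, 0), (0, 1), (1, 1), (1, -1)]  # horizontal, vertical, two diagonals
    variances = []
    for dx, dy in directions:
        samples = [float(image[y + k * dy, x + k * dx]) for k in offsets]
        variances.append(np.var(samples))
    return min(variances)

def detect_interest_points(image, threshold=100.0, half=2):
    # keep the pixels whose smallest directional variance exceeds the threshold
    points = []
    h, w = image.shape
    for y in range(half, h - half):
        for x in range(half, w - half):
            if interest_value(image, x, y, half) > threshold:
                points.append((x, y))
    return points
```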

Page 19:

Motion
• from algorithm 9.2 in Shapiro and Stockman

Page 20:

Motion
• Let's consider some places where this kind of detection and matching scheme might break down.
  – What assumptions need to be true to make the scheme work?

Page 21:

Motion
• MPEG
  – to compress video (an image sequence)
  – replaces a 16x16 image block with a motion vector describing the motion of that block
    • so in a later frame, a 16x16 block is represented as a vector
  – only the vector is stored if the blocks are identical (or very close)
    • if they differ by too much, encode the difference
• These 16x16 blocks and motion vectors are computed between, say, a frame f_i and f_{i+3} (a block-matching sketch follows below)
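A simplified Python/NumPy sketch of the block-matching idea behind those motion vectors; real MPEG encoders use much smarter search strategies and operate within the coded bitstream, and the block size and search range here are illustrative assumptions.

```python
import numpy as np

def best_motion_vector(ref, cur, bx, by, block=16, search=8):
    """For the block of `cur` whose top-left corner is (bx, by), find the
    displacement into `ref` with the smallest SSD; return (vector, ssd)."""
    target = cur[by:by + block, bx:bx + block].astype(np.float64)
    best, best_ssd = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = by + dy, bx + dx
            if y0 < 0 or x0 < 0 or y0 + block > ref.shape[0] or x0 + block > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            candidate = ref[y0:y0 + block, x0:x0 + block].astype(np.float64)
            ssd = np.sum((target - candidate) ** 2)
            if ssd < best_ssd:
                best_ssd, best = ssd, (dx, dy)
    # if best_ssd is small, store only the vector; otherwise also encode the difference
    return best, best_ssd
```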

Page 22:

Motion
• from figure 9.7 in Shapiro and Stockman