
Page 1:

CS 376b Introduction to Computer Vision

03 / 31 / 2008

Instructor: Michael Eckmann

Page 2:

Today's Topics
• Comments/Questions
• Motion
  – background subtraction scheme
  – definitions

Page 3:

Motion
• Detect/describe motion (of an object, of several objects, or of the scene as a whole) from an image sequence, where each frame is separated by some time t (e.g. a video at 30 fps has t = 1/30 sec.).
• There are several cases that lend themselves to different approaches:
  – a non-moving camera imaging a static scene, where we are to detect one moving object
  – a non-moving camera imaging a static scene, where we are to detect multiple moving objects
  – a moving camera imaging a static scene
  – a moving camera and moving objects

Page 4:

Motion
• a non-moving camera imaging a static scene, where we are to detect one moving object --- we can use background subtraction
• This is from figure 9.1 in Shapiro and Stockman (credit due to S.-W. Chen)

Page 5:

Motion
• let's consider an overall algorithm for the real scene just shown
  – when a frame is subtracted from another frame, will the result be black (0)?
    • most likely not, due to various noise, flickering of fluorescent lights, etc.

Page 6:

Motion
• let's consider an overall algorithm for the real scene just shown
  – when a frame is subtracted from another frame, will the result be black (0)?
    • most likely not, due to various noise, flickering of fluorescent lights, etc.
    • so we'd probably need to determine a threshold over which the subtraction is deemed significant
  – is it possible that this threshold ends up causing pixels on the moving object to not be detected as significant? also, can it still detect possibly insignificant pixels too?

Page 7:

Motion
• let's consider an overall algorithm for the real scene just shown
  – is it possible that this threshold ends up causing pixels on the moving object to not be detected as significant? also, can it still detect possibly insignificant pixels too?
    • yes, so we can perform connected components and afterwards remove all the small regions, which are considered due to noise
  – after connected components, we might still have holes to fill --- so what could we do?

Page 8:

Motion
• let's consider an overall algorithm for the real scene just shown
  – after connected components, we might still have holes to fill --- so what could we do?
    • close (morphological closing); a rough sketch of the whole pipeline follows below
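A minimal sketch of the change-detection pipeline just discussed (subtract, threshold, connected components, discard small regions, morphological close), using Python with NumPy/SciPy. This is not the textbook's Algorithm 9.1 verbatim, and the threshold and minimum region size are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def detect_change(frame, background, diff_thresh=25, min_region_size=50):
    # 1) difference the current frame against the background (reference) frame
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))

    # 2) keep only "significant" differences (noise, flicker, etc. fall below)
    mask = diff > diff_thresh

    # 3) connected components: label the significant pixels into regions
    labels, num_regions = ndimage.label(mask)

    # 4) remove small regions, which we attribute to noise
    sizes = ndimage.sum(mask, labels, range(1, num_regions + 1))
    for region_id, size in enumerate(sizes, start=1):
        if size < min_region_size:
            mask[labels == region_id] = False

    # 5) morphological closing to fill holes inside the remaining regions
    mask = ndimage.binary_closing(mask, structure=np.ones((5, 5)))
    return mask  # boolean image: True where motion was detected
```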

Page 9:

Motion
• algorithm 9.1 from Shapiro and Stockman

Page 10:

Motion
• What if you were concerned with just determining if the scene changed, not where?
  – any ideas? (one simple possibility is sketched below)
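One simple possibility (my suggestion, not necessarily the answer given in lecture): skip the localization steps entirely and just count how many pixels differ significantly between the frame and the background. The threshold and fraction below are illustrative assumptions.

```python
import numpy as np

def scene_changed(frame, background, diff_thresh=25, changed_fraction=0.01):
    # declare a change when enough pixels differ significantly, without locating them
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    significant = np.count_nonzero(diff > diff_thresh)
    return significant > changed_fraction * frame.size
```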

Page 11:

Motion
• The motion field is defined as: a 2d array of 2d vectors representing the motion of 3d scene points. Each vector connects the image of a scene point at time t to its image at time t + dt.
• The focus of expansion (FOE) is defined as: the point in the image plane from which motion field vectors diverge. (This is usually the point toward which the camera is moving.)
• The focus of contraction (FOC) is defined as: the point in the image plane toward which motion field vectors converge. (This is usually the point from which the camera is moving away.)
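As a compact restatement of the motion field definition (the notation here is mine, not taken from the textbook): if a 3d scene point projects to image location (x(t), y(t)) at time t, the motion field assigns that pixel the vector

\[
\mathbf{v}(x, y) \;=\; \bigl(x(t+dt) - x(t),\; y(t+dt) - y(t)\bigr),
\]

and the FOE (FOC) is the image point away from which (toward which) these vectors all point.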

Page 12:

Motion

figure 9.2 from Shapiro and Stockman

Page 13:

Motion
• The text describes a video game that detects the motion field in a video of someone using their hands/arms to either run or jump.

Page 14:

Motion
• To compute motion field vectors we'll need to detect and locate interest points with high accuracy.

Page 15:

Motion
• To correspond points from one image to the next:
  – take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood
• How might you compare neighborhoods?

Page 16:

Motion
• To correspond points from one image to the next:
  – take the neighborhood of an interesting point and look in some small window of the next frame (assuming motion is limited to some speed/distance) and find the best matching neighborhood
• How might you compare neighborhoods?
  • Can use cross-correlation of the neighborhoods; the largest value is the best match.
  • Also, the smallest SSD can be used as a match value (or L1 or L2 distance too); a sketch follows below.
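A minimal Python/NumPy sketch of the SSD-based neighborhood search described above. The neighborhood half-size and search radius are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def match_point(frame_t, frame_t1, x, y, half_nbhd=5, search_radius=10):
    """Find where the neighborhood around (x, y) in frame_t best matches
    within a small search window of frame_t1, using the smallest SSD."""
    template = frame_t[y - half_nbhd:y + half_nbhd + 1,
                       x - half_nbhd:x + half_nbhd + 1].astype(np.float64)
    best_ssd, best_xy = np.inf, (x, y)
    for dy in range(-search_radius, search_radius + 1):
        for dx in range(-search_radius, search_radius + 1):
            cx, cy = x + dx, y + dy
            candidate = frame_t1[cy - half_nbhd:cy + half_nbhd + 1,
                                 cx - half_nbhd:cx + half_nbhd + 1].astype(np.float64)
            if candidate.shape != template.shape:  # skip windows falling off the image
                continue
            ssd = np.sum((template - candidate) ** 2)
            if ssd < best_ssd:
                best_ssd, best_xy = ssd, (cx, cy)
    return best_xy  # the motion field vector is best_xy minus (x, y)
```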

Page 17:

Motion
• To compute motion field vectors we can detect:
  – high-interest points (have high energy in many directions)
  – edges are problematic here (only one direction)
  – corners (two directions)
  – anything that can be located accurately in a later image
  – centroids of moving regions after segmentation could be tracked as well
• discussion on the board of why corners are more accurately located than edges

Page 18:

Motion
• The text describes an interest operator to detect high-interest points
  – the smallest variance among the vertical, horizontal, and 2 diagonal directions in a neighborhood must be above some threshold
  – how well will this be able to be located? (a rough sketch of the operator follows below)
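A rough Python/NumPy sketch of such a directional-variance interest operator, in the spirit of the description above (and of the textbook's Algorithm 9.2) but not copied from it; the neighborhood size and threshold are my own illustrative assumptions.

```python
import numpy as np

def interest_value(image, x, y, half=2):
    """Smallest variance of pixel values along the horizontal, vertical,
    and two diagonal lines through (x, y)."""
    offsets = range(-half, half + 1)
    directions = [(1, 0), (0, 1), (1, 1), (1, -1)]  # horizontal, vertical, two diagonals
    variances = []
    for dx, dy in directions:
        samples = [float(image[y + k * dy, x + k * dx]) for k in offsets]
        variances.append(np.var(samples))
    return min(variances)

def detect_interest_points(image, threshold=100.0, half=2):
    # keep the pixels whose smallest directional variance exceeds the threshold
    points = []
    h, w = image.shape
    for y in range(half, h - half):
        for x in range(half, w - half):
            if interest_value(image, x, y, half) > threshold:
                points.append((x, y))
    return points
```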

Page 19:

Motion
• from algorithm 9.2 in Shapiro and Stockman

Page 20:

Motion
• Let's consider some places where this kind of detection and matching scheme might break down.
  – What assumptions need to be true to make the scheme work?

Page 21:

Motion
• MPEG
  – to compress video (an image sequence)
  – replaces a 16x16 image block with a motion vector describing the motion of that block
    • so in a later frame, a 16x16 block is represented as a vector
  – only the vector is stored if the blocks are identical (or very close)
    • if they differ by too much, encode the difference
• These 16x16 blocks and motion vectors are computed between, say, a frame f_i and f_{i+3} (a block-matching sketch follows below)
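A simplified Python/NumPy sketch of the block-matching idea behind those motion vectors; real MPEG encoders use much smarter search strategies and operate within the coded bitstream, and the block size and search range here are illustrative assumptions.

```python
import numpy as np

def best_motion_vector(ref, cur, bx, by, block=16, search=8):
    """For the block of `cur` whose top-left corner is (bx, by), find the
    displacement into `ref` with the smallest SSD; return (vector, ssd)."""
    target = cur[by:by + block, bx:bx + block].astype(np.float64)
    best, best_ssd = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y0, x0 = by + dy, bx + dx
            if y0 < 0 or x0 < 0 or y0 + block > ref.shape[0] or x0 + block > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            candidate = ref[y0:y0 + block, x0:x0 + block].astype(np.float64)
            ssd = np.sum((target - candidate) ** 2)
            if ssd < best_ssd:
                best_ssd, best = ssd, (dx, dy)
    # if best_ssd is small, store only the vector; otherwise also encode the difference
    return best, best_ssd
```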

Page 22:

Motion
• from figure 9.7 in Shapiro and Stockman