
Low-level Motion Activity Features for Semantic Characterization of Video

Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu

International Conference on Multimedia and Expo 2000


Introduction

We want to establish connections between low-level motion activity features of video segments and their semantically meaningful characterization.

Two computationally simple descriptors of the motion activity of video content are used.


Motion Activity Descriptors

act0: monotonous (steady) motion activity descriptor

act1: non-monotonous (unsteady) motion activity descriptor


act0 is sensitive to global motion, such as a camera pan, and to objects moving very close to the camera.

act1 filters out the component of motion activity that does not change from frame to frame.

In contrast to act0, act1 is more sensitive to unsteady motion, such as the erratic motion of a non-rigid object in close-up.
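The slides that defined act0 and act1 did not survive in this transcript, so the following is only a minimal sketch of the behavior described above, not the authors' formulas. It assumes that act0 is the mean motion-vector magnitude of a P-frame and that act1 is the mean magnitude of the block-wise change in motion vectors since the previous P-frame. The function names and the toy motion fields are hypothetical.

```python
import math

def act0(frame):
    """Assumed steady-activity measure: mean motion-vector magnitude of a frame.

    `frame` is a list of (dx, dy) motion vectors, one per macroblock.
    """
    return sum(math.hypot(dx, dy) for dx, dy in frame) / len(frame)

def act1(frame, prev_frame):
    """Assumed unsteady-activity measure: mean magnitude of the block-wise
    change in motion vectors between the previous frame and this one.
    Motion that is identical in both frames contributes nothing.
    """
    diffs = [math.hypot(dx - px, dy - py)
             for (dx, dy), (px, py) in zip(frame, prev_frame)]
    return sum(diffs) / len(diffs)

# Toy data: a steady camera pan (every block moves the same way in both frames).
pan_prev = [(4.0, 0.0)] * 4
pan_curr = [(4.0, 0.0)] * 4

# Toy data: erratic non-rigid motion (each block reverses direction).
jitter_prev = [(3.0, 0.0), (0.0, 3.0), (-3.0, 0.0), (0.0, -3.0)]
jitter_curr = [(-3.0, 0.0), (0.0, -3.0), (3.0, 0.0), (0.0, 3.0)]

pan_scores = (act0(pan_curr), act1(pan_curr, pan_prev))            # (4.0, 0.0)
jitter_scores = (act0(jitter_curr), act1(jitter_curr, jitter_prev))  # (3.0, 6.0)
```

Under these assumptions the pan scores high on act0 and zero on act1, while the erratic motion scores higher on act1 than on act0, which matches the qualitative behavior stated on the slide.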

Page 7: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

Results from Application Examples

We use the two descriptors in two different application contexts:

Browsing through a sports video.

Retrieval from a database of shots.

Page 8: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

Detecting Close-ups in Sports Video

We observe that the difference act1(n) - act0(n) is highest for close-up shots, where the irregular motion of the players in view dominates the regular global motion.

Test video: basketball from the MPEG-7 data set (10 minutes, 18000 frames, 4800 P-frames).

Ground truth was prepared manually by segmenting the video into wide-angle and close-up shots (59 segments, 30 of them close-ups).

Page 9: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

We expect m1 (act0(n)) to be high for close-up frames because the motion vectors are larger under zoom or when the action is close to the camera.

We expect that if the non-monotonous activity act1(n) is significantly higher than act0(n) in a frame, then with high probability the frame is a close-up on a highly active object.

Page 10: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

Frame-based Detection

Two thresholds, for m1 and m2, are chosen to select 250 P-frames.

Bounding boxes mark the close-up segments.

Positive impulses mark frames where m2 suggests a close-up.

Negative impulses mark frames where m1 suggests a close-up.
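The frame-based selection can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: m1 is taken to be act0(n), m2 is assumed to be the difference act1(n) - act0(n), and each threshold is assumed to be set implicitly by keeping a fixed number of top-scoring frames. The helper name and the sample values are hypothetical.

```python
def select_top_frames(scores, count):
    """Return the indices of the `count` highest-scoring frames, in frame order.

    Keeping the top `count` frames is equivalent to thresholding at the
    count-th largest score (ties are broken by frame index).
    """
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:count])

# Hypothetical per-P-frame descriptor values.
act0_seq = [1.0, 5.0, 2.0, 1.5, 6.0, 1.0]
act1_seq = [1.2, 1.0, 4.5, 4.0, 2.0, 0.8]

m1 = act0_seq                                          # steady-activity measure
m2 = [a1 - a0 for a0, a1 in zip(act0_seq, act1_seq)]   # assumed difference measure

frames_by_m1 = select_top_frames(m1, 2)   # [1, 4]
frames_by_m2 = select_top_frames(m2, 2)   # [2, 3]
```

In this toy sequence the two measures flag different frames, which is why the two sets of detections are worth plotting separately against the ground-truth segments.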

Page 11: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

Segment-based Detection

Page 12: Low-level Motion Activity Features for Semantic Characterization of Video Kadir A. Peker, A. Aydin Alatan, Ali N. Akansu International Conference on Multimedia

We find the close-up segments by sorting the segments with respect to sm1 and sm2 and choosing the top K.

Retrieval using sm1 (the average of act0 over the segment) is misled by camera motion: the first retrieved segment is a fast pan.

We find sm2 to be a more reliable detector for close-ups.
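A minimal sketch of the segment-based ranking follows. It uses the definition given above, that sm1 is the average of act0 over a segment, and assumes by analogy that sm2 is the average of act1(n) - act0(n) over the segment. The segment boundaries, function names, and values are hypothetical.

```python
def segment_means(values, segments):
    """Average a per-frame measure over each (start, end) segment, end exclusive."""
    return [sum(values[s:e]) / (e - s) for s, e in segments]

def top_k(scores, k):
    """Indices of the k segments with the highest scores, best first."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]

# Hypothetical per-frame values for three segments:
# segment 0 = wide shot, segment 1 = fast pan, segment 2 = close-up.
act0_seq = [1.0, 1.0, 6.0, 6.0, 3.0, 3.0]
act1_seq = [0.5, 0.5, 1.0, 1.0, 5.0, 5.0]
segments = [(0, 2), (2, 4), (4, 6)]

diff_seq = [a1 - a0 for a0, a1 in zip(act0_seq, act1_seq)]
sm1 = segment_means(act0_seq, segments)   # [1.0, 6.0, 3.0]
sm2 = segment_means(diff_seq, segments)   # [-0.5, -5.0, 2.0]

best_by_sm1 = top_k(sm1, 1)   # [1]: the fast pan ranks first
best_by_sm2 = top_k(sm2, 1)   # [2]: the close-up ranks first
```

The toy values are chosen to reproduce the failure mode reported on the slide: a fast pan dominates the sm1 ranking, while the difference-based score ranks the close-up first.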


Retrieval of High Activity Shots

A database of 600 shots from the MPEG-7 test set, including various programs such as news, sports, entertainment, and education.

The 5 highest-activity shots are retrieved using act0, act1, and (act1 - act0).

act0 and act1 retrieve shots that contain fast camera motion or an object that passes very close to the camera, which are not commonly considered high activity.

(act1 - act0) retrieves 5 shots of dancing people.
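To illustrate how the three ranking keys can disagree, here is a small sketch with made-up numbers. The shot labels and their (act0, act1) values are hypothetical and are not taken from the MPEG-7 test set; the `rank` helper is likewise only illustrative.

```python
# Hypothetical per-shot (act0, act1) averages; labels are illustrative only.
shots = {
    "fast_pan": (8.0, 2.0),
    "object_near_camera": (7.0, 6.5),
    "dancing": (3.0, 6.0),
    "news_anchor": (0.5, 0.6),
}

def rank(shots, key, k):
    """Names of the k shots with the largest key(act0, act1), best first."""
    return sorted(shots, key=lambda name: key(*shots[name]), reverse=True)[:k]

top_act0 = rank(shots, lambda a0, a1: a0, 1)        # ['fast_pan']
top_act1 = rank(shots, lambda a0, a1: a1, 1)        # ['object_near_camera']
top_diff = rank(shots, lambda a0, a1: a1 - a0, 1)   # ['dancing']
```

With these toy values, only the difference key places the dancing shot first, mirroring the qualitative result reported above.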


Conclusion

We described two descriptors that indicate whether the activity content is dominated by monotonous, steady motion or by unsteady, irregular motion.

This characterization of the activity content can be used to detect close-up segments in a sports video or to answer activity-based queries against a database of video shots.