![Page 1: Automatic Caption Localization in Compressed Video By Yu Zhong, Hongjiang Zhang, and Anil K. Jain, Fellow,…](https://reader035.vdocument.in/reader035/viewer/2022070615/5a4d1bcb7f8b9ab0599d6f2a/html5/thumbnails/1.jpg)
Automatic Caption Localization in Compressed Video
By Yu Zhong, Hongjiang Zhang, and Anil K. Jain, Fellow, IEEE
IEEE Transactions on Pattern Analysis and Machine Intelligence
Vol 22, No. 4, April 2000
Introduction
Caption text on video
General methods for caption extraction
Proposed Method: how it works
Evaluation
Caption Text on Video
Goal: parse, index, and abstract video
Caption text carries information about the video: it describes the content and catches the “highlights”
General Extraction Methods
Component-based: geometrical arrangement; homogeneous color
Texture-based: contrast with the background; horizontal intensity variation
Most published methods: applied to uncompressed images
Digital video and images: compressed (MPEG and JPEG); DCT (Discrete Cosine Transform) coding; reduced interframe redundancy (for MPEG)
Proposed Method
Steps 1 & 2: Detecting blocks
Step 3: Refinement
Step 4: Segmentation of rows
Proposed Method
Source frame
Steps 1 & 2: Detecting Blocks of High Horizontal Spatial Intensity Variation
Operates in the DCT domain: decompression is not necessary; unit: 8x8 blocks in I-frames (intracoded)
Quantized DCT coefficients: readily extracted; fast
DCT blocks with high horizontal intensity variation
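The block test in Steps 1 & 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides give no formula, so the energy measure (sum of absolute horizontal-harmonic coefficients), the band limits `v_lo` and `v_hi`, and the threshold `factor` are all assumptions. The sketch also computes the DCT from pixels only so that it can run on its own; the method itself reads quantized coefficients directly from the compressed stream.

```python
import math

N = 8  # DCT block size used by JPEG and MPEG


def horizontal_coeffs(block):
    """First row c[0][v] of the orthonormal 2-D DCT-II of an N x N block.

    With vertical frequency 0, only variation across columns remains.
    """
    col_sums = [sum(block[x][y] for x in range(N)) for y in range(N)]
    coeffs = []
    for v in range(N):
        scale = math.sqrt(1.0 / N) if v == 0 else math.sqrt(2.0 / N)
        acc = sum(col_sums[y] * math.cos(math.pi * (2 * y + 1) * v / (2 * N))
                  for y in range(N))
        coeffs.append(math.sqrt(1.0 / N) * scale * acc)
    return coeffs


def horizontal_energy(block, v_lo=2, v_hi=6):
    """Sum of |c[0][v]| over an assumed band of horizontal harmonics."""
    coeffs = horizontal_coeffs(block)
    return sum(abs(coeffs[v]) for v in range(v_lo, v_hi + 1))


def candidate_blocks(frame, factor=1.45):
    """Mark blocks whose energy exceeds `factor` times the frame average.

    `factor` is an illustrative threshold, not a value from the slides.
    """
    rows, cols = len(frame) // N, len(frame[0]) // N
    energy = [[horizontal_energy([line[c * N:(c + 1) * N]
                                  for line in frame[r * N:(r + 1) * N]])
               for c in range(cols)] for r in range(rows)]
    mean = sum(map(sum, energy)) / (rows * cols)
    return [[e > factor * mean for e in row] for row in energy]
```

On a flat frame with one block of vertical stripes, only the striped block is marked, since flat blocks carry essentially no energy in the horizontal harmonics.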
Step 3: Remove noise by applying morphological operations
Steps 1 & 2: pick up some high-contrast nontext blocks; leave some text blocks disconnected (wide spacing, low contrast, large fonts)
Step 3: removes most isolated blocks; merges nearby blocks
Applying morphological operations
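A rough sketch of this refinement on the block mask: a closing merges nearby blocks and an opening removes isolated ones. The horizontal structuring element and its width of 3 blocks are assumptions for illustration; the slides do not state which operations or element sizes are used.

```python
def dilate_h(mask, width=3):
    """Horizontal dilation: a block is set if any block in its window is set."""
    half = width // 2
    cols = len(mask[0])
    return [[any(row[max(0, j - half):j + half + 1]) for j in range(cols)]
            for row in mask]


def erode_h(mask, width=3):
    """Horizontal erosion: a block stays set only if its whole window is set.

    Positions whose window leaves the mask are treated as unset.
    """
    half = width // 2
    cols = len(mask[0])
    return [[j - half >= 0 and j + half < cols
             and all(row[j - half:j + half + 1])
             for j in range(cols)] for row in mask]


def refine(mask, width=3):
    """Closing (merge nearby blocks) followed by opening (drop isolated ones)."""
    # Work on a padded canvas so the frame border does not distort the result.
    pad = [False] * width
    canvas = [pad + [bool(v) for v in row] + pad for row in mask]
    closed = erode_h(dilate_h(canvas, width), width)
    opened = dilate_h(erode_h(closed, width), width)
    return [row[width:-width] for row in opened]
```

With these assumed settings, two runs of blocks separated by a one-block gap are merged into a single run, and a lone block with no neighbors is removed.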
Step 4: Segmentation based on vertical intensity variation
Detected text regions: large vertical intensity variation; local vertical harmonics
Corresponding row of text: high vertical spectrum energy
After horizontal/vertical text energy test
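One way to picture the vertical energy test is sketched below: for each row of candidate blocks, average the energy in the vertical harmonics and keep the row only if it is high enough. The band limits `u_lo` and `u_hi`, the per-row averaging, and the fixed `threshold` are assumptions made for illustration; the slides do not give the actual criterion. As in the earlier sketch, the DCT is computed from pixels only so the example is self-contained.

```python
import math

N = 8  # DCT block size


def vertical_energy(block, u_lo=1, u_hi=6):
    """Sum of |c[u][0]| over an assumed band of vertical harmonics (u >= 1)."""
    row_sums = [sum(row) for row in block]
    total = 0.0
    for u in range(u_lo, u_hi + 1):
        acc = sum(row_sums[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
                  for x in range(N))
        # Orthonormal DCT-II scaling for (u >= 1, v = 0).
        total += abs(math.sqrt(2.0 / N) * math.sqrt(1.0 / N) * acc)
    return total


def keep_text_rows(frame, mask, threshold):
    """Keep a row of candidate blocks only if its mean vertical energy is high."""
    kept = [[False] * len(row) for row in mask]
    for r, row in enumerate(mask):
        energies = [vertical_energy([line[c * N:(c + 1) * N]
                                     for line in frame[r * N:(r + 1) * N]])
                    for c, flag in enumerate(row) if flag]
        if energies and sum(energies) / len(energies) > threshold:
            kept[r] = list(row)
    return kept
```

A block row with strong top-to-bottom intensity changes passes the test, while a flat row of candidates is discarded.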
Dilating the previous result by one block
Evaluation
Does not work properly with: very big characters; too widely spaced text; image texture
Caption Text on Video
Goal: parse, index, and abstract video
Caption text carries information about the video: it describes the content and catches the “highlights”
Evaluation
Commonly used captions: NOT very big characters; NOT too widely spaced text; NOT image texture
Therefore, the important information is still retrieved!
Evaluation
Future work: extend to other transform-based compression schemes; also use color information to improve accuracy; combine DCT blocks to support larger fonts; find a solution for P- and B-frames
Summary
Proposed caption localization method: works on compressed video; fast
Further development is needed to: improve accuracy; support other compression methods