
University of Texas at Arlington Electrical Engineering Department

Multimedia Processing, Spring 2008

Sarnoff Vision Optimized MPEG-2 Encoder

by Jennie Gloria Abraham

Introduction

Technology: digital era

Image/video/audio: huge amounts of data

Compression: lose data selectively

Human Visual System (HVS) and Visual Communication System (VCS)

Exploit the limitations of human vision

=> Perceptual Coding Techniques

Perceptual Coding Technique

(Digital) images

Image processing: based on the end application

Image Processing Tasks

End application -- meant for HVS

Human Visual System (HVS) (end recipient)

HVS -- physiological and psychological limitations.

Basic principle of perceptual coding: treat all data that humans cannot perceive as superfluous, and discard it.

Video Quality Measurement [2]

(1) Objective Measures:

Distortion is calculated by a mathematical (predefined) function.
E.g.: RMSE and PSNR
Consistent, easy to implement, easy to replicate. Mediocre predictor of perceived quality.

(2) Subjective Measures:

Quality is evaluated by humans.
E.g.: DSISM, SSM
Generally inconsistent between individuals, need impractically elaborate setups, and lack objectivity. Best predictor of visual quality.

Objective Picture Quality Measurement

[2]

Two Images with same PSNR
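A minimal sketch of the objective measures named above (RMSE and PSNR), assuming 8-bit samples (peak value 255) and images given as flat lists of pixel values. The function names and the tiny 16-pixel test images are illustrative assumptions, not taken from the slides. It also shows how two very differently distributed distortions can give exactly the same PSNR.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized pixel lists."""
    if len(ref) != len(test) or not ref:
        raise ValueError("images must be non-empty and the same size")
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)

def rmse(ref, test):
    """Root mean squared error."""
    return math.sqrt(mse(ref, test))

def psnr(ref, test, peak=255):
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical 16-pixel reference image with a flat grey level.
reference = [100] * 16
# Distortion A: a small error spread over every pixel.
spread = [p + 2 for p in reference]
# Distortion B: one large error concentrated in a single pixel.
spike = list(reference)
spike[5] += 8

psnr_spread = psnr(reference, spread)
psnr_spike = psnr(reference, spike)
```

Both distortions have an MSE of 4, so `psnr_spread` and `psnr_spike` are equal even though a viewer would probably not judge the two images alike. PSNR only sees the average error energy, not where it falls.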

Subjective Quality Assessment Measures

Best predictor of actual image quality

Subjective quality assessment is carried out by human subjects.

Need for reliable and repeatable tests.

Some internationally accepted test methods:

1. Double Stimulus Impairment Scale Method (DSISM)
2. Double Stimulus Quality Scale Method (DSQSM)
3. Comparison Scale Method (CSM)
4. Single Stimulus Method (SSM)
5. Continuous Quality Evaluations (CQE)

Subjective Quality Measure eg. DSISM

[33]

Subjective Quality Measure eg. DSCQS

[33]

Vision Models

[34]

The Digital Video Consortium (DVC)

Sarnoff Corporation: team leader, MPEG algorithms and Image Quality Metrics (JNDmetrix) technology

Tektronix: board design and MPEG vision-optimized encoder implementation

Bell Atlantic Network Services: ensure applicability to video networks

JND as Quality Metric [2]

correlates well across scene types, unlike MSE ...

Quality Measurement [2]

MSE: 27.10, JND = 0.5

MSE: 21.26, JND = 2.5

... differing quality interpretations

JND Model Architecture Overview

[2]

Sarnoff JND Vision Model

[8]

Sarnoff VDM Algorithm Overview

[8]

Sarnoff VDM Front-end Processing

[8]

Sarnoff VDM Luma Processing

Sarnoff VDM Chroma Processing

JND Model Algorithm Flowchart

Sarnoff VDM Output JND Map

[2]

Image 1

JND Map

Image 2

... quantifying visible differences between two images ...

Just-Noticeable Difference (JND)

[8]

JND: the visibility threshold below which a change cannot be detected by the HVS.

Determining the JND is complex and challenging.

1 JND = 75% probability of seeing a difference.

2 JNDs = 0.75 + 0.75*(1-0.75) = 93.75% probability that the difference is perceptible, and so on.

JND map: values are in units of JNDs and represent the difference between two images.
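The arithmetic above can be sketched in a few lines. This assumes that the "and so on" continues the same rule for every further JND: each additional JND catches 75% of the probability not yet covered. The function names are illustrative, and the closed form 1 - (1 - 0.75)^n is simply that recursion unrolled.

```python
def detection_probability(jnds, p_one_jnd=0.75):
    """Probability of seeing a difference of `jnds` JNDs, built up one JND
    at a time exactly as in the 2-JND example: p + 0.75 * (1 - p)."""
    if jnds < 0:
        raise ValueError("jnds must be non-negative")
    p = 0.0
    for _ in range(jnds):
        p = p + p_one_jnd * (1.0 - p)
    return p

def detection_probability_closed_form(jnds, p_one_jnd=0.75):
    """Same quantity written as 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p_one_jnd) ** jnds

# Probability for 1 to 5 JNDs.
table = {n: detection_probability(n) for n in range(1, 6)}
```

Under this assumption the probability approaches 1 quickly, which fits the reading that differences of a few JNDs are already clearly visible.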

Just-Noticeable Difference (JND)

Practical meaning of JNDs

1 JND: differences are barely visible and cannot be distinguished, even when the exact nature and location of the differences are known in advance.
3 JNDs: differences are visible on detailed observation, but obvious only when an observer knows exactly where to look.
5 JNDs: differences are clearly visible and readily apparent.

MPEG-2: Moving Picture Experts Group

Compression capability of MPEG [11]

Popularity of MPEG-2

Objective: Study the Sarnoff Vision Optimized MPEG-2 Encoder, which uses the Sarnoff JND Vision Model to improve perceived image quality at low bit rates.

Motivation:

Validity of subjective quality assessment
Availability of a computational model of the HVS
Impressive performance of the Sarnoff JND Vision Model
Possibility of increased compression with better visual quality
Popularity of MPEG-2
Performance of the Sarnoff Vision Optimized MPEG-2 Encoder

HVS based Perceptual Video Encoders

[2]

MPEG-2 - Some Concepts

A video sequence has three types of frames:

Intra-coded (I), Forward predicted (P), Bidirectionally predicted (B).

Frame, Slices, Macroblocks (16x16), 4 Blocks (8x8) (Y only)
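As a rough illustration of the macroblock and block sizes listed above, the sketch below splits the luma (Y) plane of a frame into 16x16 macroblocks and returns the four 8x8 luma blocks of one macroblock. It assumes the frame is a list of pixel rows whose dimensions are multiples of 16; the function names are hypothetical, and slices and chroma blocks are left out.

```python
MB_SIZE = 16     # macroblock is 16x16 luma samples
BLOCK_SIZE = 8   # each luma block is 8x8 samples

def macroblock_grid(width, height):
    """Number of macroblocks per row and per column of the frame."""
    if width % MB_SIZE or height % MB_SIZE:
        raise ValueError("this sketch assumes dimensions are multiples of 16")
    return width // MB_SIZE, height // MB_SIZE

def luma_blocks(frame, mb_x, mb_y):
    """Return the four 8x8 luma blocks of macroblock (mb_x, mb_y):
    top-left, top-right, bottom-left, bottom-right."""
    x0, y0 = mb_x * MB_SIZE, mb_y * MB_SIZE
    blocks = []
    for by in (0, BLOCK_SIZE):
        for bx in (0, BLOCK_SIZE):
            rows = frame[y0 + by : y0 + by + BLOCK_SIZE]
            blocks.append([row[x0 + bx : x0 + bx + BLOCK_SIZE] for row in rows])
    return blocks

# Hypothetical 32x32 luma plane where each sample encodes its own position.
frame = [[y * 32 + x for x in range(32)] for y in range(32)]
blocks = luma_blocks(frame, mb_x=1, mb_y=0)
```

For a 720x480 frame, `macroblock_grid(720, 480)` gives 45 macroblocks per row and 30 rows of macroblocks, so 1350 macroblocks and 5400 luma blocks in total.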

Sarnoff VDM Embedded MPEG-2 Decoder

JND Metrics

JND parameters

Sarnoff VDM Embedded MPEG-2 Encoder

[1]

Application of JND Model to MPEG-2

Application of the JND model to MPEG-2 can be classified into the following three categories:

1. Apply JND as an MPEG-friendly prefilter to maskable regions.
2. Macroblock-level multi-pass JND.
3. Frame-level feed-forward JND for real-time IPB quality equalization.

Vision Optimized Encoder (VOE) - Macroblock Level Control

VOE - Picture Level Control

[1]

Other Encoder Enhancements

Preprocessing features: 3:2 pull-down detection, scene change detection, flash frame detection, noise reduction, fade-to-black correction

Fast hierarchical motion estimation: ~1000 times faster than full search with equivalent video quality

Improved mode decision: using motion vector bits, favored skipped MB at low bit rate

Improved rate control: adaptive rate-distortion model

Vision Optimized Preprocessing


Side information is either inserted into the image sequence as it is being encoded, or stored on a storage device and then made available to an encoder.

The encoder uses the side information to best select one or more coding parameters.

Vision Optimized Preprocessing

Extracted side information can be used to select these coding parameters:

Frames until Next Scene Change
Degree of Motion
Anomalous Frame Detection
Fade-Out Detection
Complexity of the Next N Frames
3:2 Pull-down Advice
Bits needed to Encode this Frame at Constant Quantization Scale/Quality
Bits needed to Encode P or B Frame assuming various I (& P) Quality Levels
Noise Filtering/Quantization Matrix
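One way to picture side information that is extracted once, stored, and later handed to an encoder is a small per-frame record that can be serialized. The sketch below is only an assumption about how such a record might look: the class name, field names, and JSON layout are hypothetical, and only a subset of the items listed above is included.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class FrameSideInfo:
    """Hypothetical per-frame side information record (illustrative fields)."""
    frame_index: int
    frames_until_scene_change: Optional[int] = None
    degree_of_motion: Optional[float] = None
    anomalous_frame: bool = False            # e.g. a flash-bulb frame
    fade_out: bool = False
    complexity_next_n_frames: Optional[float] = None
    bits_at_constant_quant: Optional[int] = None

def dump_side_info(records: List[FrameSideInfo]) -> str:
    """Serialize records so they can be stored and read back later."""
    return json.dumps([asdict(r) for r in records])

def load_side_info(text: str) -> List[FrameSideInfo]:
    """Rebuild the records from their serialized form."""
    return [FrameSideInfo(**item) for item in json.loads(text)]

# Two made-up records: a scene change is three frames away from frame 9.
records = [
    FrameSideInfo(frame_index=9, frames_until_scene_change=3, degree_of_motion=0.4),
    FrameSideInfo(frame_index=10, frames_until_scene_change=2, anomalous_frame=True),
]
restored = load_side_info(dump_side_info(records))
```

The round trip through a string stands in for the storage device; an encoder reading `restored` would see the same values the preprocessor wrote.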

Vision Optimized Preprocessing Side Information

Frames until Next Scene Change (SC = scene cut)

Normal order:              I B B P B B P B B I B B P
Without side info:         I B B P B B P B B I B B SC I
Scene cut with side info:  I B B P B B P B B P* B B SC I

Anomalous Frame Detection, e.g. flash-bulb-lit scenes: avoid coding such a frame as a P or I frame.

Fade-Out Detection: brightness fades out over successive frames. Fade-to-black correction in motion estimation.

Noise Filtering
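The "Frames until Next Scene Change" example above can be sketched as a frame-type planner. This is one reading of the three patterns shown, not the encoder's actual logic: it assumes a regular 9-frame GoP with an anchor frame every 3 frames, that a scene cut always forces an I frame, and that a scheduled I frame is demoted to P (marked `P*`) when a cut is known to be at most `lookahead` frames away. The function name, parameters, and the lookahead threshold are all hypothetical.

```python
def plan_frame_types(num_frames, scene_cuts=(), use_side_info=False,
                     gop_size=9, anchor_spacing=3, lookahead=3):
    """Return a list of frame types ('I', 'P', 'P*', 'B') in display order."""
    cuts = set(scene_cuts)
    types = []
    pos = 0  # position of the current frame relative to the last I frame
    for i in range(num_frames):
        if i in cuts:
            pos = 0  # the first frame of a new scene is coded as I
        if pos % gop_size == 0:
            cut_soon = any(0 < c - i <= lookahead for c in cuts)
            if pos > 0 and use_side_info and cut_soon:
                # A scene cut is imminent: skip this periodic I frame and
                # spend the I-frame bits at the cut instead.
                types.append("P*")
            else:
                types.append("I")
                pos = 0
        elif pos % anchor_spacing == 0:
            types.append("P")
        else:
            types.append("B")
        pos += 1
    return types

normal = plan_frame_types(13)
without_info = plan_frame_types(13, scene_cuts=[12])
with_info = plan_frame_types(13, scene_cuts=[12], use_side_info=True)
```

With a cut at frame 12, the planner without side information codes I frames at both frame 9 and frame 12, while the version with side information codes only one. That appears to be the saving the example is meant to show.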

Other Encoder Enhancements

Adaptive Q Matrix

Frame-based linear/non-linear Mquant selection

Graceful degradation for Panic Mode support

Complexity-based GOP structure

Quantization Matrix Adaptation for VOE

Generate DCT Map for previous frame of same type

Generate Q Matrix for current frame

Determine slope of DCT Map diagonal for previous frame

Adjust Q Matrix

Determine slope of Q Matrix diagonal

Quantize blocks of DCT Coeffs. for current frame

The algorithm comprises two parts: (1) the shape adaptation of the quantization matrix and (2) the mean adjustment of the Q matrix.

Quantization Matrix Adaptation - DCT Map Generation

DCT Map

40 (8x8) blocks of DCT Coeff.

Panic Mode Support: Scene Change within a GoP

Fig. illustrates the P pictures (P1 and P3) that are worst affected by scene cuts within an MPEG GoP [ref]

Reducing the bits allocated to the surrounding B pictures increases the P-picture bit allocation while avoiding VBV underflow.

B-picture errors do not carry over to any other pictures within the group of pictures, so the reduction can be safely spread out over several B pictures.
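The bit reallocation described above can be illustrated with a small sketch. It is only an assumption about how such a transfer might be organized: the function name, the `window` parameter, and the equal split among donor B pictures are hypothetical. The total budget is kept constant, which is the property that matters for not draining the VBV buffer further than planned.

```python
def boost_p_picture(types, bits, p_index, extra_bits, window=2):
    """Give a P picture up to `extra_bits` more bits, taken in equal shares
    from the B pictures within `window` positions of it."""
    if types[p_index] != "P":
        raise ValueError("p_index must point at a P picture")
    lo = max(0, p_index - window)
    hi = min(len(types), p_index + window + 1)
    donors = [i for i in range(lo, hi) if types[i] == "B"]
    out = list(bits)
    if not donors:
        return out  # nothing to take from; leave the allocation unchanged
    share = extra_bits // len(donors)
    for i in donors:
        take = min(share, out[i])  # never push a B picture below zero bits
        out[i] -= take
        out[p_index] += take
    return out

# Made-up allocation (arbitrary units) for one GoP in display order.
types = list("IBBPBBPBB")
bits = [400, 80, 80, 200, 80, 80, 200, 80, 80]
boosted = boost_p_picture(types, bits, p_index=3, extra_bits=80)
```

Here the P picture at index 3 gains 80 units, 20 from each of its four neighbouring B pictures, and the total for the GoP is unchanged.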

Panic Mode Support: Dynamic Rate Control [P1]

Neural Network based Encoder Parameter Control [P1]

Motivating Results: Vision Optimized Preprocessing [2]

Motivating Results: Vision Optimized I P B Bit Allocation [2]

Motivating Results: Vision Optimized Mquant Control [2]

Conclusions

Vision based optimization gives significant improvement in image quality below 2 Mbits/sec

[2]

The DVC vision optimized encoder algorithms include:

Adaptive pre-processing to select features for bit allocation
Control quality among frame types - IPB rate control
Optimized bit allocation within a frame - Mquant optimization

Balanced encoder incorporating improvements to all aspects of the encoder

Most approaches used are directly applicable to MPEG-4 encoding

Future Research and Extension

[15]

Extend Sarnoff's VDM to the H.264 encoder

Future Research and Extension

Prototypical H.264 model with JND matrix [15]

References

[1] H.R. Wu and K.R. Rao, Digital Video Image Quality and Perceptual Coding, Boca Raton, FL: CRC Press, 2006.
[2] A. Pica, Making Every bit Count Vision Optimized Encoding, Sarnoff Corp.
[3] K.R. Rao and J.J. Hwang, Techniques and Standards for Image, Video, Audio Coding, Upper Saddle River, NJ: Prentice Hall, 1997.
[4] Y. Jia, W. Lin and A.A. Kassim, Estimating Just-Noticeable Distortion for Video, IEEE Trans. on Circuits and Systems for Video Technology, Vol. 16, pp. 820-829, July 2006.
[5] MPEG-2 reference software ISO/IEC 13818-5:2005, http://standards.iso.org/ittf/licence.html
[6] MPEG homepage - http://www.chiariglione.org/mpeg/
[7] MPEG reference website - http://www.mpeg.org
[8] J. Lubin, A Human Vision System Model For Objective Picture Quality Measurements, International Broadcasting Convention, pp. 498-503, 12-16 Sept., 1997.
[9] J. Lubin, Just Noticeable Difference Analysis: How and Why We Measure and Model the Visibility of Differences between Images, Sarnoff Corp.
[10] Measuring Image Quality: Sarnoff's JNDmetrix Technology, Sarnoff Corp.
[11] MPEG2 Overview - http://www.erg.abdn.ac.uk/research/future-net/digital-video/mpeg2.html
[12] S. Winkler, Issues in vision modeling for perceptual video quality assessment, Signal Processing, Vol. 78, pp. 231-252, 1999.
[13] S. Winkler, infoscience.epfl.ch/record/61769/files/Winkler2000_653.pdf
[14] Human Visual System - http://www.ecs.csun.edu/~dsalomon/DC2advertis/AppendH.pdf
[15] J. Wang, VDM with H.264, a project proposal, UTA.
[16] JND - http://en.wikipedia.org/wiki/Just_noticeable_difference
[17] Human visual system: image formation - http://vision.berkeley.edu/roordalab/Pubs/EISTChapterRoorda.pdf
[18] The human visual system - http://www.dip.ee.uct.ac.za/~nicolls/lectures/eee401f/hvs.pdf
[19] Z. Wang and A. C. Bovik, A Human Visual System-Based Objective Video Distortion Measurement System, http://www.cns.nyu.edu/~zwang/files/papers/icmps.pdf
[20] S. Winkler, Digital Video Quality: Vision Models and Metrics, Hoboken, NJ: John Wiley & Sons, 2005.
[21] G. Westheimer, The eye as an optical instrument, in K. R. Boff, L. Kaufman, J. P. Thomas (eds.), Handbook of Perception and Human Performance, vol. 1, chap. 4, John Wiley & Sons, 1986.
[22] D. C. Hood and M. A. Finkelstein, Sensitivity to light, in K. R. Boff, L. Kaufman, J. P. Thomas (eds.), Handbook of Perception and Human Performance, vol. 1, chap. 5, John Wiley & Sons, 1986.
[23] B. E. Rogowitz, The human visual system: A guide for the display technologist, in Proceedings of the Society for Information Display, vol. 24/3, pp. 235-252, 1983.
[24] Human Visual System - http://www.ecs.csun.edu/~dsalomon/DC2advertis/AppendH.pdf
[25] S. Daly, The visible difference predictor: An algorithm for the assessment of image fidelity, in Digital Images and Human Vision, A. B. Watson, ed., pp. 179-206, MIT Press, 1993.
[26] J. Lubin, A Visual Discrimination Model for Imaging System Design and Evaluation, in Vision Models for Target Detection and Recognition, Eli Peli, ed., World Scientific, New Jersey, pp. 245-283, 1995.
[27] Van den Branden Lambrecht and J. Farrell, Perceptual quality metric for digitally coded color images, Proceedings of the VIII European Signal Processing Conference EUSIPCO, pp. 1175-1178, 1996.
[28] M. Masry and S. S. Hemami, An Analysis of Subjective Quality in Low Bit Rate Video, Proc. IEEE ICIP, Thessaloniki, Greece, pp. 465-468, 2001.
[29] S. Winkler, C. van den Branden Lambrecht and M. Kunt, Vision and Video: Models and Applications, Ecublens: EPFL, 2001.
[30] P. Lindh and C. van den Branden Lambrecht, Efficient spatio-temporal decomposition for perceptual processing of video sequences, IEEE Proceedings of International Conference on Image Processing ICIP'96, Vol. 3, pp. 331-334, 1996.
[31] S. Winkler, Issues in vision modeling for perceptual video quality assessment, Signal Processing, Vol. 78, Nr. 2, pp. 231-252, 1999.
[32] S. Winkler, Visual fidelity and perceived quality: Towards comprehensive metrics, in Proc. SPIE Human Vision and Electronic Imaging Conference, Vol. 4299, pp. 114-125, 2001.
[33] S. Winkler, Quality metric design: A closer look, Proc. SPIE Human Vision and Electronic Imaging Conference, Vol. 3959, pp. 37-44, 2000.
[34] S. Winkler, A perceptual distortion metric for digital color video, Proceedings of the SPIE Conference on Human Vision and Electronic Imaging, Vol. 3644, pp. 175-184, 1999.
[35] S. Winkler, Visual quality assessment using a contrast gain control model, Proceedings of the IEEE Workshop on Multimedia Signal Processing (MMSP), pp. 527-532, 1999.
[36] S. Winkler and P. Vandergheynst, Computing isotropic local contrast from oriented pyramid decompositions, in Proceedings of the 6th IEEE International Conference on Image Processing (ICIP), Vol. 4, pp. 420-424, 1999.
[37] M. Rohaly et al., Video Quality Experts Group: Current results and future directions, Proc. SPIE Visual Communications and Image Processing, Vol. 4067, pp. 742-753, June 2000.
[38] I. D. Basso, F. Tobagi and C. van den Branden Lambrecht, Study of MPEG-2 coding performance based on a perceptual quality metric, Proceedings of the Picture Coding Symposium, pp. 263-268, March 1996.
[39] van den Branden Lambrecht and O. Verscheure, Perceptual quality measure using a spatiotemporal model of the human visual system, Proceedings of the SPIE, Vol. 2668, pp. 450-461, Jan. 1996.
[40] T. Ebrahimi and C. Horne, MPEG-4 natural video coding - An overview, Signal Processing: Image Communication, Vol. 15, Nr. 4-5, pp. 365-385, 2000.
[41] J. Lubin, A human vision system model for objective picture quality measurements, International Broadcasting Convention, pp. 498-503, Sept. 1997.
[42] Sarnoff Corporation, JND Research - http://www.sarnoff.com/research-and-development/videocommunications-networking/video/just-noticeable-difference
[43] M. H. Brill and J. Lubin, Report: Sarnoff JND vision model for flat-panel design, http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/19980151087_1998151455.pdf
[44] J. Watkinson, MPEG Handbook: MPEG-1, MPEG-2, MPEG-4, 2nd ed., Focal Press, 2004.
[45] K. Jack, Video Demystified: A Handbook for the Digital Engineer, 5th ed., Burlington, MA: Newnes, 2007.
[46] I. E. G. Richardson, Video Codec Design: Developing Image and Video Compression Systems, NJ: John Wiley & Sons, 2002.
[47] F. Pereira and T. Ebrahimi, MPEG-4 Book, Prentice Hall, 2002.
[48] I. E. G. Richardson, H.264 and MPEG-4 Video Compression: Video Coding for Next Generation Multimedia, NJ: John Wiley & Sons, 2003.
[49] H.264/AVC reference software JM12.1, http://iphome.hhi.de/suehring/tml/
[50] S.K. Kwon, A. Tamhankar and K.R. Rao, Overview of H.264 / MPEG-4 Part 10, J. VCIR, Vol. 17, pp. 186-216, April 2006, Special Issue on "Emerging H.264/AVC Video Coding Standard".
[51] Test sequence, ftp://ftp.tnt.uni-hannover.de/pub/svc/testsequences/
[52] Decoder Block Diagram, http://www.altera.com/products/ip/ampp/amphion/images/video-decodercs6651-fig1-pop.pdf

List of Patents
