An Early Block Type Decision Method for Intra Prediction in H.264/AVC
Jungho Do, Sangkwon Na and Chong-Min Kyung
VLSI Systems Lab., Korea Advanced Institute of Science and Technology (KAIST)
SiPS 2009 2
Table of contents
• Introduction
• Intra prediction
• Early block type decision
  – Block type decision point
  – Rate-distortion cost
• Experimental results
• Conclusion
• Reference
Introduction
• Two prediction methods in H.264/AVC
  – Inter prediction
    • Uses temporal correlation between the earlier frame and the current frame
  – Intra prediction
    • Uses spatial correlation between neighboring blocks in the current frame
    • 9 prediction modes
Introduction
• Why do we focus on intra prediction (IP)?
  – Rate-distortion optimization (RDO) in intra prediction also improves overall rate-distortion performance.
  – Intra prediction has more block types and prediction modes than inter prediction.
    • Constraint: 560 cycles for 1920x1080@30fps (133 MHz)
    • Total: 640 cycles
[Figure: encoder block diagram with source input S, intra prediction, transform (T), quantization (Q), entropy coder producing the bit stream, inverse quantization (IQ), inverse transform (IT), reconstruction C, and SSD distortion computed between S and C]
[Figure: pipeline for one prediction mode, with the stages IP, DCT, Q, Q^-1, IDCT, and reconstruction repeated for successive prediction modes]
Introduction
• Rate-distortion cost function for RDO
$$J_{RD} = SSD(S, C) + \lambda R$$
  – $SSD$: sum of squared differences
  – $S$: original block, $C$: reconstructed block
  – $\lambda$: Lagrangian multiplier
  – $R$: estimated bit-rate [9]
[Figure: encoder block diagram as before, extended with a rate estimator that supplies the estimated rate R alongside the SSD distortion]
[9] M.G. Sarwer and L.M. Po, “Bit rate estimation for cost function of 4x4 intra mode decision of H.264/AVC,” ICME, 2007, pp. 1579–1582.
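The cost function above can be illustrated with a minimal Python sketch. The block values, the rate figures, and the multiplier `lam` are made-up numbers chosen only for illustration, and the names `ssd` and `rd_cost` are hypothetical helpers, not part of any encoder API.

```python
def ssd(original, reconstructed):
    """Sum of squared differences between two equally sized 2-D blocks."""
    return sum(
        (s - c) ** 2
        for row_s, row_c in zip(original, reconstructed)
        for s, c in zip(row_s, row_c)
    )


def rd_cost(original, reconstructed, rate_bits, lam):
    """J_RD = SSD(S, C) + lambda * R."""
    return ssd(original, reconstructed) + lam * rate_bits


# Hypothetical 4x4 original block S and two candidate reconstructions C.
S = [[10, 12, 14, 16] for _ in range(4)]
C_a = [[10, 12, 14, 15] for _ in range(4)]  # close to S, assumed to cost 20 bits
C_b = [[11, 11, 15, 15] for _ in range(4)]  # coarser, assumed to cost 8 bits

lam = 0.5  # assumed Lagrangian multiplier
candidates = {"mode_a": (C_a, 20), "mode_b": (C_b, 8)}

# Pick the candidate with the smallest R-D cost.
best = min(candidates, key=lambda m: rd_cost(S, candidates[m][0], candidates[m][1], lam))
```

With these numbers the lower-distortion candidate wins even though it spends more bits; a larger `lam` would shift the balance toward the cheaper candidate.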
Intra prediction
• Nine prediction modes
  – Associated with the direction of prediction
[Figure: prediction mode directions for 4x4 luma and 8x8 luma blocks (modes 0 to 8) and for 16x16 luma and 8x8 chroma blocks (modes 0 to 3), with the block numbering inside the macroblock]
Early block type decision
• Motivation
  – Spatial correlation between sub-blocks and the macroblock (MB)
    • The prediction modes of neighboring blocks tend to have similar directions.
  – Based on this hypothesis, we assume that the R-D costs of neighboring blocks are also similar.
    • The R-D cost of the macroblock can therefore be substituted by the R-D cost of its sub-blocks for mode decision.
Encoding order
• Conventional encoding order
  – The 8x8 blocks are processed only after all 4x4 blocks have been processed.
[Figure: conventional order: sixteen 4x4 luma blocks, then four 8x8 luma blocks, then the 16x16 luma block and the 8x8 chroma blocks]
Encoding order
• Proposed encoding order
  – To exploit spatial correlation, each 8x8 block is predicted right after the predictions of its four corresponding 4x4 blocks are finished.
    • Three possible decision points: 1/4 MB, 2/4 MB, and 3/4 MB
[Figure: proposed order: four 4x4 luma blocks followed by one 8x8 luma block, repeated four times, then the 16x16 luma block and the 8x8 chroma blocks; the decision points MB 1/4, MB 2/4, and MB 3/4 are marked after the first, second, and third groups]
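The two luma encoding orders can be sketched as follows. This is an illustration only: it assumes that the 4x4 blocks are indexed so that blocks 4q to 4q+3 lie inside 8x8 block q (as the block numbering in the figure suggests), and it leaves out the chroma blocks. The function names are hypothetical.

```python
def conventional_order():
    """All sixteen 4x4 blocks, then the four 8x8 blocks, then the 16x16 block."""
    order = [("4x4", i) for i in range(16)]
    order += [("8x8", q) for q in range(4)]
    order.append(("16x16", 0))
    return order


def proposed_order():
    """Each 8x8 block directly follows its four co-located 4x4 blocks."""
    order = []
    for q in range(4):  # q indexes the 8x8 quadrant of the macroblock
        order += [("4x4", 4 * q + k) for k in range(4)]
        order.append(("8x8", q))
    order.append(("16x16", 0))
    return order


def decision_point_index(order, quarter):
    """Position in `order` right after the 8x8 block that closes `quarter`/4 MB."""
    return order.index(("8x8", quarter - 1)) + 1
```

Both orders visit the same 21 luma blocks; only the interleaving differs, which is what makes a 4x4 versus 8x8 comparison available after each quarter of the macroblock.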
1st block type culling
• 16x16 block type
  – Probability that the 16x16 block type is decided as the best mode when the 4x4 block type is selected by the proposed early block type decision:

    QP                16     20     24     28     32     36
    Probability (%)   0      0.16   0.28   0.26   0.4    0.2

  – The 16x16 block type is bypassed when the 4x4 block type is selected.
2nd block type culling
• Block type decision at the 1/4 MB point
  – If $J_{RD,4x4}(1/4) < J_{RD,8x8}(1/4)$
    • Only the 4x4 block type is evaluated after the 1/4 MB point.
  – Otherwise
    • The 8x8 block type is taken.
[Figure: decision flow at the 1/4 MB point: after the first four 4x4 blocks and the first 8x8 block, the test $J_{RD,4x4}(1/4) < J_{RD,8x8}(1/4)$ selects the 4x4 path (Yes) or the 8x8 path followed by the 16x16 block (No)]
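2nd block type culling
• Block type decision at the 2/4 MB and 3/4 MB points
  – At 2/4 MB: $J_{RD,4x4}(2/4) < J_{RD,8x8}(2/4)$? Yes: 4x4. No: 8x8.
  – At 3/4 MB: $J_{RD,4x4}(3/4) < J_{RD,8x8}(3/4)$? Yes: 4x4. No: 8x8.
[Figure: timeline of 4x4 and 8x8 block processing up to the 2/4 MB and 3/4 MB decision points, each followed by the Yes/No branch to the 4x4 or 8x8 block type]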
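The comparison at a decision point can be sketched in Python. This sketch assumes that $J_{RD,4x4}(k/4)$ and $J_{RD,8x8}(k/4)$ are the best-mode costs accumulated over the first k quarters of the macroblock; the slides do not spell out the accumulation, so treat that as an assumption. The function name and the cost values are hypothetical.

```python
def early_block_type_decision(cost_4x4, cost_8x8, decision_quarter=2):
    """Compare accumulated 4x4 and 8x8 costs at `decision_quarter`/4 MB.

    cost_4x4: sixteen best-mode R-D costs, four per 8x8 quadrant, in coding order.
    cost_8x8: four best-mode R-D costs, one per 8x8 quadrant.
    Only the blocks coded before the decision point are used.
    """
    if decision_quarter not in (1, 2, 3):
        raise ValueError("decision_quarter must be 1, 2, or 3")
    j_4x4 = sum(cost_4x4[: 4 * decision_quarter])
    j_8x8 = sum(cost_8x8[:decision_quarter])
    # Strict inequality, as in the slide: ties go to the 8x8 block type.
    return "4x4" if j_4x4 < j_8x8 else "8x8"


# Made-up costs for illustration only.
example_4x4 = [10] * 16
example_8x8 = [45, 45, 30, 30]
```

In this example the 4x4 block type wins at the 1/4 and 2/4 MB points, while at 3/4 MB the accumulated costs tie and the 8x8 block type is taken.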
Block type decision point
• Which block type decision point is the best?
  – Compared in terms of R-D performance and the number of reconstruction-loop cycles for each position of the early block type decision.

                                                     1/4MB     2/4MB     3/4MB
    R-D performance             ∆PSNR (dB)           -0.039    -0.016    -0.009
                                ∆BR (%)              0.945     0.536     0.476
    Cycle counts of             4x4 block type       300       360       420
    reconstruction loops        8x8 block type       464       528       592
[Figure: cycle counts (0 to 700) for the 4x4 and 8x8 block types at the 1/4MB, 2/4MB, and 3/4MB decision points, plotted with normalized distortion and normalized bit-rate; the cycle constraint is marked]
Block type decision point
• Proposed early block type decision
  – The 2/4 MB point is effective in terms of distortion, bit-rate, and computational complexity (the number of cycles for MB processing).
[Figure: the 1/4MB, 2/4MB, and 3/4MB decision points compared along the complexity, rate (kbps), and distortion (dB) axes, with the cycle constraint marked]
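One way to read the decision-point table against the 560-cycle constraint is sketched below. The numbers are copied from the table and the constraint from the Introduction; the selection rule (keep the points whose worst-case cycle count fits the budget, then take the one with the smallest bit-rate increase) is an assumption made for illustration, not a procedure stated in the slides.

```python
# Values copied from the decision-point table.
TABLE = {
    "1/4MB": {"dpsnr_db": -0.039, "dbr_pct": 0.945, "cycles_4x4": 300, "cycles_8x8": 464},
    "2/4MB": {"dpsnr_db": -0.016, "dbr_pct": 0.536, "cycles_4x4": 360, "cycles_8x8": 528},
    "3/4MB": {"dpsnr_db": -0.009, "dbr_pct": 0.476, "cycles_4x4": 420, "cycles_8x8": 592},
}
CYCLE_CONSTRAINT = 560  # cycles per macroblock, from the Introduction


def feasible_points(table, budget):
    """Decision points whose worst-case cycle count stays within the budget."""
    return [
        name
        for name, row in table.items()
        if max(row["cycles_4x4"], row["cycles_8x8"]) <= budget
    ]


def pick_point(table, budget):
    """Among feasible points, choose the smallest bit-rate increase (assumed rule)."""
    return min(feasible_points(table, budget), key=lambda name: table[name]["dbr_pct"])
```

Under this reading the 3/4 MB point is excluded because its 8x8 path needs 592 cycles, and the 2/4 MB point is preferred over 1/4 MB for its smaller R-D loss.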
R-D cost function
• Original cost function
$$J_{RD} = SSD(S, C) + \lambda R = (D_{Luma} + D_{Chroma}) + \lambda (R_{header} + R_{Luma} + R_{Chroma})$$
  – $D_{Luma}$, $D_{Chroma}$: distortion of luma and chroma
  – $R_{Luma}$, $R_{Chroma}$: estimated bit-rate of luma and chroma
  – $R_{header}$: bit-rate of the MB header
R-D cost function
• Proposed R-D cost for sub-block
  – Assumption: the best mode of the chroma block is decided independently of the luma mode.
$$J_{RD} = (D_{Luma} + D_{Chroma}) + \lambda (R_{header} + R_{Luma} + R_{Chroma})$$
$$J_{RD}^{Luma} = D_{Luma} + \lambda (R_{header} + R_{Luma})$$
$$R_{header} = R_{CBP} + R_{MB} + R_{Lmode} + R_{Cmode}$$
  – $R_{CBP}$: bit-rate for the coded_block_pattern (CBP) variable
  – $R_{MB}$: bit-rate for the MB type
  – $R_{Lmode}$, $R_{Cmode}$: bit-rate for the decided modes (luma and chroma)
  – The best mode of the luma block is decided independently.
  – The difference of $R_{CBP}$ between the 4x4 block type and the 8x8 block type is frequently very small.
  – The value of $R_{MB}$ is fixed as ‘intra’ for all block types.
R_CBP
• Coded_block_pattern (CBP) variable
  – coded_block_pattern indicates which of the six 8x8 blocks in the MB contain nonzero coefficients.
[Figure: current MB split into 8x8 blocks A, B, C, D for (a) Intra_8x8 and (b) Intra_4x4, where each 8x8 block holds four 4x4 blocks; a 6-bit pattern (000000) with luma and chroma parts]
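The luma part of the pattern can be sketched as below. This covers only the four luma bits and does not model the chroma part or the entropy coding that turns the pattern into $R_{CBP}$ bits. It assumes the sixteen 4x4 blocks are given in an order where blocks 4i to 4i+3 form 8x8 block i; the function name is hypothetical.

```python
def luma_cbp(coeff_blocks_4x4):
    """Return a 4-bit value whose bit i is set when 8x8 luma block i
    contains at least one nonzero quantized coefficient.

    coeff_blocks_4x4: sixteen lists of quantized coefficients, ordered so that
    indices 4*i .. 4*i+3 belong to 8x8 block i (assumption).
    """
    cbp = 0
    for i in range(4):
        quadrant = coeff_blocks_4x4[4 * i : 4 * i + 4]
        if any(c != 0 for block in quadrant for c in block):
            cbp |= 1 << i
    return cbp


# Hypothetical macroblock: everything quantized to zero except two coefficients.
blocks = [[0] * 16 for _ in range(16)]
blocks[5][0] = 3     # lies in 8x8 block 1
blocks[15][2] = -1   # lies in 8x8 block 3
```

Because the bit for an 8x8 block depends only on whether any coefficient inside it is nonzero, the Intra_4x4 and Intra_8x8 block types often yield the same pattern, which fits the observation that their $R_{CBP}$ values are usually close.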
R_CBP
• Comparison of R_CBP
[Figure: Red_kayak in FHD@30fps; probability (%) versus the difference of R_CBP between the 4x4 block type and the 8x8 block type (0 to 12), plotted for QP = 16, 24, 32, and 40]
R-D cost function
• Simplified R-D cost function
  – The proposed R-D cost function allows $R_{header}$ to be evaluated at the early decision point (2/4 MB), because $R_{header}$ can be derived from $R_{Lmode}$.
$$J_{RD}^{Luma} = D_{Luma} + \lambda (R_{Lmode} + R_{Luma})$$
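The simplification can be motivated by looking at the cost difference between the two block types. The derivation below is a sketch under the assumptions stated on the slides ($R_{MB}$ is fixed, the $R_{CBP}$ difference is usually very small, and the chroma mode is decided independently of luma); the symbol $\Delta J$ and the block-type superscripts are introduced here only for illustration.

```latex
\begin{aligned}
\Delta J &= J_{RD}^{Luma,\,4\times4} - J_{RD}^{Luma,\,8\times8} \\
         &= \bigl(D_{Luma}^{4\times4} - D_{Luma}^{8\times8}\bigr)
            + \lambda \Bigl[ \bigl(R_{header}^{4\times4} - R_{header}^{8\times8}\bigr)
            + \bigl(R_{Luma}^{4\times4} - R_{Luma}^{8\times8}\bigr) \Bigr] \\[4pt]
R_{header}^{4\times4} - R_{header}^{8\times8}
         &= \underbrace{\bigl(R_{CBP}^{4\times4} - R_{CBP}^{8\times8}\bigr)}_{\approx\, 0}
            + \underbrace{\bigl(R_{MB} - R_{MB}\bigr)}_{=\, 0}
            + \bigl(R_{Lmode}^{4\times4} - R_{Lmode}^{8\times8}\bigr)
            + \underbrace{\bigl(R_{Cmode} - R_{Cmode}\bigr)}_{=\, 0} \\
         &\approx R_{Lmode}^{4\times4} - R_{Lmode}^{8\times8}
\end{aligned}
```

Only the luma mode bits survive in the header difference, so comparing $D_{Luma} + \lambda (R_{Lmode} + R_{Luma})$ for the two block types gives approximately the same decision as comparing the full luma cost.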
Experimental results
• R-D performance comparison
[Figure: R-D curves comparing full search with the proposed method]
Conclusion
• Computational complexity problem in RDO-enabled intra prediction
• An early decision at the 2/4 MB point for RDO-enabled intra prediction
  – Block type decision based on spatial correlation
  – R-D cost computation is reduced by 90.1%, with a 0.93% bit-rate increase and a 0.039 dB PSNR decrease compared to full search.
Thank you
Reference
• [1] F. Pan, X. Lin, S. Rahardja, K.P. Lim, and Z.G. Li, “A directional field based fast intra mode decision algorithm for H.264 video coding,” ICME, 2004, vol. II, pp. 1147–1150.
• [2] A.C. Tsai, J.F. Yang, and W.G. Lin, “Effective subblock-based and pixel-based fast direction detections for H.264 intra prediction,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 18, pp. 975–982, July 2008.
• [3] Y.-K. Lin and T.-S. Chang, “Fast block type decision algorithm for intra prediction in H.264 FRExt,” ICIP, 2005, vol. I, pp. 585–588.
• [4] T. Zhang, G. Tian, and S. Goto, “A frequency-based fast block type decision algorithm for intra prediction in H.264/AVC high profile,” IAPCCAS, 2008, vol. II, pp. 1292–1295.
• [5] W. Lee, Y. Jung, S. Lee, and J. Kim, “High speed intra prediction scheme for H.264/AVC,” Consumer Electronics, IEEE Transactions on, vol. 53, pp. 1577–1582, Nov 2007.
• [6] G. Jin, J.-S. Jung, and H.-J. Lee, “An efficient pipelined architecture for H.264/AVC intra frame processing,” ISCAS, 2007, vol. II, pp. 1605–1608.
• [7] Y.-K. Lin, C.-W. Ku, D.-W. Li, and T.-S. Chang, “A 140-MHz 94K gates HD1080p 30-frames/s intra-only profile H.264 encoder,” Circuits and Systems for Video Technology, IEEE Transactions on, vol. 19, pp. 432–436, Mar 2009.
• [8] C.K. Huang and L.L. Youn, “An H.264/AVC full-mode intra-frame encoder for 1080HD video,” ICME, 2008, pp. 1037–1040.
• [9] M.G. Sarwer and L.M. Po, “Bit rate estimation for cost function of 4x4 intra mode decision of H.264/AVC,” ICME, 2007, pp. 1579–1582.
• [10] I.E. Richardson, “Draft ITU-T recommendation and final draft international standard of joint video specification (ITU-T Rec. H.264 - ISO/IEC 14496-10 AVC),” Joint Video Team, 2003.
• [11] Z. Kun, Y. Chun, L. Qiang, and Z. Yuzhou, “A fast block type decision method for H.264/AVC intra prediction,” ICACT, 2007, vol. I, pp. 673–676.
• [12] “Joint video team (JVT) reference software version 14.0,” http://iphome.hhi.de/suehring/tml/download/.
• [13] I.E. Richardson, “H.264 and MPEG-4 video compression: video coding for next generation multimedia,” Chichester, U.K.: Wiley, 2003.
Appendix
Introduction
• H.264/AVC coding flow
[Figure: H.264/AVC encoder block diagram: the input video signal is split into macroblocks of 16x16 pixels; coder control; transform, scaling, and quantization; scaling and inverse transform; de-blocking filter; intra-frame prediction; motion compensation; motion estimation; intra/inter selection; entropy coding of control data, quantized transform coefficients, and motion data; the embedded decoder produces the output video signal]
Introduction
• Why we focus on intra prediction
  – Computational bottleneck in a rate-distortion (R-D) optimized encoder due to the many block types and prediction modes
    • Constraint: 560 cycles for 1920x1080@30fps (133 MHz)
    • Total: 640 cycles (= 15*16 + 60*4 + 160)
      – One 4x4 block: 15 cycles
      – One 8x8 block: 60 cycles
      – One 16x16 block: 160 cycles
[Figure: encoder block diagram with source input S, intra prediction, T, Q, entropy coder producing the bit stream, IQ, IT, reconstruction C, SSD distortion, and a rate estimator supplying the estimated rate R]
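The cycle arithmetic above can be checked with a short Python sketch. The per-block cycle counts and the 560-cycle constraint come from the slide; the function `cycles_if_4x4_chosen` encodes an assumed reading of the proposed scheme (all sixteen 4x4 blocks are processed, the 8x8 blocks are processed only up to the decision point, and the 16x16 block is bypassed), and all names are hypothetical.

```python
CYCLES = {"4x4": 15, "8x8": 60, "16x16": 160}      # cycles per block, from the slide
BLOCKS_PER_MB = {"4x4": 16, "8x8": 4, "16x16": 1}  # luma blocks per macroblock
CONSTRAINT = 560                                   # cycles available per macroblock


def full_search_cycles():
    """Cycles needed when every block type is evaluated for the whole MB."""
    return sum(CYCLES[t] * BLOCKS_PER_MB[t] for t in CYCLES)


def cycles_if_4x4_chosen(decision_quarter):
    """Assumed count when the 4x4 block type wins at `decision_quarter`/4 MB:
    all 4x4 blocks, plus the 8x8 blocks coded before the decision."""
    return BLOCKS_PER_MB["4x4"] * CYCLES["4x4"] + decision_quarter * CYCLES["8x8"]


total = full_search_cycles()
over_budget = total - CONSTRAINT
```

Full search needs 640 cycles, 80 more than the budget. The assumed 4x4-path count gives 300, 360, and 420 cycles for the three decision points, which agrees with the 4x4 row of the decision-point table.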
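Early block type decision
• Flow
[Figure: flowchart. Up to the decision point (1/4MB, 2/4MB, or 3/4MB), each group of four 4x4 luma blocks is followed by one 8x8 luma block. Early block type decision: 4x4 cost < 8x8 cost? Yes: the remaining 4x4 luma groups are processed, then the two 8x8 chroma blocks, and the result is a 4x4 block. No: the remaining 8x8 luma blocks are processed, then 16x16 luma, then the two 8x8 chroma blocks, and the result is an 8x8 or 16x16 block.]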