![Page 1: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/1.jpg)
Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures
Presented by: Ahmed Abdel-Hafeez, Ahmed El-Bohy, Ahmed Emam, Ahmed Kandil
Supervised by/Presented to: Prof. Dr. Attalah Hashaad
Published by: Chunjiang J. Duanmu et al. Published in August 2007.
![Page 2: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/2.jpg)
Outline
• Abstract
• Introduction
• 8-bit partial sums
• Multilevel 8-bit partial sums
• Computational complexity
• Simulation results
• Conclusion
ARAB ACADEMY-CAIRO Fast Block Motion Estimation With 8-Bit Partial Sums Using SIMD Architectures spring 2013 slide
![Page 3: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/3.jpg)
Abstract
• Fast block motion estimation algorithms are needed for real-time implementations of video coding standards because the full-search algorithm for block motion estimation has a high computational complexity.
• This paper proposes an algorithm that uses 8-bit partial sums of 16 luminance values for fast block motion estimation. The partial sums reduce the computational complexity not only of the full-search algorithm but also of some of the fast block motion estimation algorithms, while maintaining their accuracy.
• Furthermore, it is shown that byte-type data parallelism on an SIMD architecture can be used to access and process these partial sums concurrently, which accelerates motion estimation.
• Simulation results demonstrate that the partial sums significantly accelerate the execution of the full-search and other search algorithms on an SIMD architecture.
![Page 4: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/4.jpg)
Introduction - Basics - Applications
![Page 5: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/5.jpg)
Chronological Table of Video Coding Standards
The objective of video coding is to compress moving images.
• ITU-T VCEG: H.261 (1990), H.263 (1995/96), H.263+ (1997/98), H.263++ (2000)
• ISO/IEC MPEG: MPEG-1 (1993), MPEG-4 v1 (1998/99), MPEG-4 v2 (1999/00), MPEG-4 v3 (2001)
• Joint ITU-T and ISO/IEC: MPEG-2 (H.262) (1994/95), H.264 (MPEG-4 Part 10) (2002)
![Page 6: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/6.jpg)
Introduction - Basics - Video
[Figure: a video shown as a sequence of frames, Frame 1 to Frame 4]
• Luminance (Y): describes the brightness of the pixel.
• Chrominance (CbCr): describes the color of the pixel.
![Page 7: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/7.jpg)
Introduction - Basics - Video Data Drawback
• Uncompressed video data is large.
– This is due to data redundancy. There are two general types of data redundancy in a video:
Spatial redundancy
In a frame, adjacent pixels are usually correlated, e.g. the grass in the background of a frame is green throughout.
Temporal (time-based) redundancy
In a video, adjacent frames are usually correlated, e.g. the green background persists frame after frame.
![Page 8: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/8.jpg)
Introduction - Basics - Video Compression
• Predict the current frame based on previously coded frames.
• Types of coded frames:
– I-frame: intra-coded frame, coded independently of all other frames
– P-frame: predictively coded frame, coded based on a previously coded frame
– B-frame: bi-directionally predicted frame, coded based on both previous and future coded frames
![Page 9: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/9.jpg)
Block Matching
![Page 10: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/10.jpg)
Motion Estimation
• What is motion estimation?
– Predict the current frame from the previous frame.
– Determine the displacement of an object in the video sequence.
– The amount of data to be coded can be reduced significantly if the previous frame is subtracted from the current frame.
![Page 11: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/11.jpg)
Motion Estimation Classification
• Block Based Motion Estimation Algorithms
– Time-domain Algorithms
   Matching Algorithms: Block-matching, Feature-matching
   Gradient Based Algorithms: Pel-recursive, Block-recursive
– Frequency-domain Algorithms: Phase-correlation (DFT), Matching in DCT domain, Matching in wavelet domain
• Mesh Based Motion Estimation Algorithms
![Page 12: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/12.jpg)
Motion Estimation (ctd)
![Page 13: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/13.jpg)
Motion Estimation (ctd)
![Page 14: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/14.jpg)
Motion Estimation (ctd)
[Figure: the current 16x16 block of the current frame is compared against candidate blocks inside a search window of the reference frame; the displacement of the best match is the motion vector. Matching criterion: Sum of Absolute Difference (SAD).]
![Page 15: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/15.jpg)
Distortion Criterion for measuring distance between previous block and search area block
• CCF (Cross-Correlation Function)
• MSE (Mean Square Error Function)
• MAE (Mean Absolute Error)
• SAD (Sum of Absolute Difference)
• PDC (Pixel Difference Classification)
• MAE (or MAD) and SAD are commonly employed due to their simplicity in hardware implementation.
![Page 16: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/16.jpg)
SAD
$$\mathrm{SAD}(dx,dy) = \sum_{m=x}^{x+N-1} \sum_{n=y}^{y+N-1} \left| I_k(m,n) - I_{k-1}(m+dx,\, n+dy) \right|$$
$$(MV_x, MV_y) = \arg\min_{(dx,dy) \in R^2} \mathrm{SAD}(dx,dy)$$
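The SAD criterion can be written out directly. The following is a minimal sketch, assuming frames are stored as lists of rows of luminance values, that `(x, y)` is the top-left corner of the current block, and that the displaced block stays inside the reference frame; the function name `sad` and the tiny test frames are illustrative only.

```python
def sad(cur, ref, x, y, dx, dy, n=16):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (x, y) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for row in range(y, y + n):
        for col in range(x, x + n):
            total += abs(cur[row][col] - ref[row + dy][col + dx])
    return total


# Tiny 4x4 frames with a 2x2 block at (1, 1): the current block equals the
# reference block one pixel to the right, so SAD(1, 0) is zero.
ref = [[0, 0, 0, 0],
       [0, 0, 10, 20],
       [0, 0, 30, 40],
       [0, 0, 0, 0]]
cur = [[0, 0, 0, 0],
       [0, 10, 20, 0],
       [0, 30, 40, 0],
       [0, 0, 0, 0]]
print(sad(cur, ref, 1, 1, 1, 0, n=2))  # 0
print(sad(cur, ref, 1, 1, 0, 0, n=2))  # 60
```

The motion vector is the displacement with the smallest SAD, here (1, 0).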
![Page 17: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/17.jpg)
Search Algorithms
[Figure: tree of search algorithms. Fast, multistep: 3SS, 4SS, HBS, UDS. Full, exhaustive: SE, MSE, VF, PFGSE.]
![Page 18: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/18.jpg)
Search Algorithms (ctd)
• There is a trade-off between run time and accuracy.
• Full search is the most accurate because it is exhaustive, but it requires the most time.
• Fast search algorithms run faster, but they are less accurate because they examine only some of the candidate locations.
![Page 19: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/19.jpg)
Full-Search
Not suitable for real time.
![Page 20: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/20.jpg)
Exhaustive Search
• Simplest algorithm, but computationally the most expensive
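Exhaustive search evaluates the SAD at every displacement of the search window and keeps the smallest. The sketch below assumes a square window of ±`w` pixels and frames stored as lists of rows; `full_search` and its parameters are hypothetical names, and the test frames are synthetic random data rather than anything from the paper.

```python
import random


def sad(cur, ref, x, y, dx, dy, n):
    """SAD between the n x n current block at (x, y) and the reference
    block displaced by (dx, dy)."""
    return sum(abs(cur[r][c] - ref[r + dy][c + dx])
               for r in range(y, y + n) for c in range(x, x + n))


def full_search(cur, ref, x, y, n=16, w=7):
    """Try every displacement in [-w, w] x [-w, w] that keeps the candidate
    block inside the reference frame; return (motion vector, SAD)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_sad = None, None
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            inside = (0 <= x + dx and x + dx + n <= width and
                      0 <= y + dy and y + dy + n <= height)
            if not inside:
                continue
            s = sad(cur, ref, x, y, dx, dy, n)
            if best_sad is None or s < best_sad:
                best_mv, best_sad = (dx, dy), s
    return best_mv, best_sad


rng = random.Random(0)
H = W = 32
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
# Current frame: reference content moved so that every current block matches
# the reference block displaced by (dx, dy) = (2, -3).
cur = [[ref[(r - 3) % H][(c + 2) % W] for c in range(W)] for r in range(H)]
print(full_search(cur, ref, 12, 12, n=8, w=4))
```

With `w = 4` the loop visits (2w + 1)² = 81 candidates for a single block, which is why the cost grows quickly with the window size.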
![Page 21: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/21.jpg)
Three Step Search (3SSA)
![Page 22: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/22.jpg)
Three Step Search (3SSA) (ctd)
![Page 23: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/23.jpg)
Three Step Search (3SSA) (ctd)
![Page 24: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/24.jpg)
Three Step Search (3SSA) (ctd)
![Page 25: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/25.jpg)
3SSA Block Matching
► Three-Step Search (3SS)
– 9 points: the central point and its 8 surroundings
– Distance: w/2
– Find the best match
– Use the previous best as the new center
– Halve the distance and select 8 new points
– Repeat the algorithm 3 times
– Examines 25 points
– Assumes a uniform distribution of MVs
[Figure: search grid marking the points examined in steps 1, 2 and 3]
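The three-step procedure listed above can be sketched as follows. This is an illustration under assumptions, not the presenters' code: the initial distance is taken as 4 (a ±7 window), the search positions are assumed to stay inside the reference frame, and `three_step_search` is a hypothetical name. A cache keeps the centre point from being evaluated twice, so the number of examined points is 9 + 8 + 8 = 25.

```python
import random


def sad(cur, ref, x, y, dx, dy, n=16):
    """SAD between the n x n current block at (x, y) and the reference
    block displaced by (dx, dy)."""
    return sum(abs(cur[r][c] - ref[r + dy][c + dx])
               for r in range(y, y + n) for c in range(x, x + n))


def three_step_search(cur, ref, x, y, n=16, step=4):
    """Return (motion vector, SAD, number of points examined)."""
    costs = {}  # SAD of every displacement examined so far

    def cost(mv):
        if mv not in costs:
            costs[mv] = sad(cur, ref, x, y, mv[0], mv[1], n)
        return costs[mv]

    center = (0, 0)
    while step >= 1:
        # Centre point plus its 8 neighbours at the current distance.
        candidates = [(center[0] + i * step, center[1] + j * step)
                      for j in (-1, 0, 1) for i in (-1, 0, 1)]
        center = min(candidates, key=cost)  # best match becomes new centre
        step //= 2                          # halve the distance
    return center, costs[center], len(costs)


rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
cur = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
mv, best, examined = three_step_search(cur, ref, 16, 16)
print(mv, best, examined)  # the last value is 25
```

Compared with the 225 candidates of a full ±7 search, only 25 SADs are computed, at the price of possibly missing the global minimum.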
![Page 26: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/26.jpg)
4SSA
![Page 27: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/27.jpg)
Unrestricted Center-Biased Diamond Search Algorithm (UDSA)
![Page 28: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/28.jpg)
Hexagon-Based Search Algorithm (HBSA)
![Page 29: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/29.jpg)
Problem Definition
• The high computational requirement of the full-search (FS) algorithm prevents its use in real-time applications, despite its high accuracy.
• Fast block motion estimation algorithms have lower computational complexity, but also lower accuracy.
• Fast block motion estimation algorithms are therefore chosen for real-time applications, and this paper addresses them as well.
![Page 30: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/30.jpg)
Aim
• To improve the accuracy of some of the fast block motion estimation techniques without increasing the computational complexity.
• To make the best use of the Single Instruction Multiple Data (SIMD) architecture and to exploit byte-type data parallelism to further accelerate the execution of the algorithms.
![Page 31: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/31.jpg)
Limitation
• When the partial sums of an algorithm are wider than 8 bits, the partial sums for a reference block cannot be stored, accessed, and manipulated in a contiguous memory space, because partial sums of other reference blocks lie in between. A large number of CPU cycles is therefore lost in manipulating these data, and such algorithms are not suitable for SIMD implementations.
![Page 32: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/32.jpg)
Procedure
• Devise a scheme that uses only 8-bit partial sums and discards as many SAD computations as possible without excluding the optimal motion vector.
– The proposed partial sums can be used not only in the full-search algorithm but also in some of the fast block motion-estimation algorithms.
• Devise a scheme that generalises the previous scheme to the multilevel case and utilises it optimally.
![Page 33: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/33.jpg)
Partial Sums
  268
+ 483
Add the hundreds (200 + 400): 600
Add the tens (60 + 80): 140
Add the ones (8 + 3): 11
Add the partial sums (600 + 140 + 11): 751
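The place-by-place addition on this slide can be reproduced in a few lines. This is only an illustration of the arithmetic; the helper name `partial_sums` is hypothetical.

```python
def partial_sums(a, b):
    """Add two non-negative integers place by place and return the partial
    sums, most significant place first."""
    sums = []
    place = 1
    while a or b:
        sums.append((a % 10 + b % 10) * place)
        a //= 10
        b //= 10
        place *= 10
    return sums[::-1]


parts = partial_sums(268, 483)
print(parts, sum(parts))  # [600, 140, 11] 751
```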
![Page 34: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/34.jpg)
8-Bit Partial Sums - Objective
• The objective of this paper is to find new partial sums of only eight bits, so that they can be stored as packed bytes on an SIMD architecture.
• In this way, eight additions or subtractions of partial sums can be executed in one SIMD instruction.
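To make the idea of eight byte-wide operations per instruction concrete, the sketch below emulates a packed byte addition on a 64-bit word in plain Python. It is an emulation for illustration only: real SIMD hardware performs the whole operation in one instruction, and the helper names (`pack8`, `unpack8`, `packed_add_bytes`) are hypothetical.

```python
LOW7 = 0x7F7F7F7F7F7F7F7F  # low seven bits of every byte lane
HIGH = 0x8080808080808080  # top bit of every byte lane


def pack8(values):
    """Pack eight 8-bit values into one 64-bit word (lane 0 in the low byte)."""
    word = 0
    for lane, v in enumerate(values):
        word |= (v & 0xFF) << (8 * lane)
    return word


def unpack8(word):
    """Split a 64-bit word back into its eight byte lanes."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(8)]


def packed_add_bytes(a, b):
    """Add eight byte lanes at once. Each lane wraps modulo 256 and no carry
    crosses into the neighbouring lane."""
    low = (a & LOW7) + (b & LOW7)    # add the low 7 bits of every lane
    return low ^ ((a ^ b) & HIGH)    # fold in the top bits without carry


a = pack8([1, 2, 3, 4, 5, 6, 7, 8])
b = pack8([10, 20, 30, 40, 50, 60, 70, 80])
print(unpack8(packed_add_bytes(a, b)))  # [11, 22, 33, 44, 55, 66, 77, 88]
```

Because every partial sum fits in one byte, eight of them occupy a single 64-bit word and are updated together.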
![Page 35: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/35.jpg)
8-bit Partial Sums
[Figure: a 16 x 16 block with indices 0 to 15 and a partial sum ∑(n) for each index n]
![Page 36: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/36.jpg)
Lower Bound
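The bound on this slide was an image and is not reproduced in the text, so the sketch below derives one possible lower bound under explicit assumptions and should not be read as the paper's formula. Assumptions: each partial sum covers one row of 16 luminance values, and the 12-bit row sum is reduced to 8 bits by dropping its four least significant bits. If A and B are the exact row sums and a = A >> 4, b = B >> 4, then A lies in [16a, 16a + 15], so |A − B| ≥ 16|a − b| − 15 whenever a ≠ b, and the SAD of a row is never smaller than |A − B|.

```python
import random

N = 16  # block width: each partial sum covers 16 luminance values


def sad(cur, ref, x, y, dx, dy):
    """SAD between the N x N current block at (x, y) and the reference
    block displaced by (dx, dy)."""
    return sum(abs(cur[r][c] - ref[r + dy][c + dx])
               for r in range(y, y + N) for c in range(x, x + N))


def partial_sums_8bit(frame, x, y):
    """One 8-bit partial sum per row of the block at (x, y): the row sum
    (at most 16 * 255 = 4080) with its four low bits dropped (assumption)."""
    return [sum(frame[r][x:x + N]) >> 4 for r in range(y, y + N)]


def sad_lower_bound(cur_sums, ref_sums):
    """Lower bound on the SAD computed from 8-bit partial sums only."""
    bound = 0
    for a, b in zip(cur_sums, ref_sums):
        d = abs(a - b)
        if d:
            bound += 16 * d - 15  # truncation can hide at most 15 per row
    return bound


rng = random.Random(0)
cur = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
ref = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
lb = sad_lower_bound(partial_sums_8bit(cur, 8, 8), partial_sums_8bit(ref, 9, 7))
print(lb, "<=", sad(cur, ref, 8, 8, 1, -1))
```

A candidate whose lower bound already reaches the best SAD found so far cannot improve the result, which is what allows its full SAD computation to be skipped.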
![Page 37: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/37.jpg)
Scheme One - Algorithm
• Step 1) Initialization
a) Compute all of the 8-bit partial sums of sixteen luminance values for the current frame and save them in a contiguous memory space.
b) Retrieve all of the 8-bit partial sums of sixteen luminance values for the reference frame from a saved contiguous memory space.
![Page 38: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/38.jpg)
Scheme One - Algorithm (ctd)
• Step 2) For every current block, execute the block motion-estimation process.
– Step 2.1) Initialization
![Page 39: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/39.jpg)
Scheme One - Algorithm (ctd)
– Step 2.2) Search
• For (each search location in a motion-estimation algorithm)
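The slides give only the outline of Scheme One, so the following is a hedged sketch of how the steps might fit together with a full search, not the paper's algorithm. It assumes row-wise partial sums truncated to 8 bits by dropping four low bits, a lower bound of 16|a − b| − 15 per row (an assumption derived from that truncation), and a search window that stays inside the frame. All function names are hypothetical.

```python
import random

N = 16  # block size and number of luminance values per partial sum


def sad(cur, ref, x, y, dx, dy):
    return sum(abs(cur[r][c] - ref[r + dy][c + dx])
               for r in range(y, y + N) for c in range(x, x + N))


def partial_sum_table(frame):
    """Step 1: 8-bit partial sums of every horizontal run of 16 pixels,
    stored row-major in one contiguous bytearray."""
    cols = len(frame[0]) - N + 1
    table = bytearray()
    for row in frame:
        for c in range(cols):
            table.append(sum(row[c:c + N]) >> 4)
    return table, cols


def lower_bound(cur_tab, ref_tab, cols, x, y, dx, dy):
    """Lower bound on SAD(dx, dy) from the 8-bit partial sums alone."""
    total = 0
    for r in range(y, y + N):
        d = abs(cur_tab[r * cols + x] - ref_tab[(r + dy) * cols + x + dx])
        if d:
            total += 16 * d - 15
    return total


def search_with_elimination(cur, ref, x, y, w):
    cur_tab, cols = partial_sum_table(cur)   # Step 1a
    ref_tab, _ = partial_sum_table(ref)      # Step 1b
    # Step 2.1: start from the zero displacement.
    best_mv, best_sad = (0, 0), sad(cur, ref, x, y, 0, 0)
    sads_computed = 1
    # Step 2.2: visit every search location, skipping hopeless ones.
    for dy in range(-w, w + 1):
        for dx in range(-w, w + 1):
            if (dx, dy) == (0, 0):
                continue
            if lower_bound(cur_tab, ref_tab, cols, x, y, dx, dy) >= best_sad:
                continue  # cannot beat the current best: SAD not computed
            s = sad(cur, ref, x, y, dx, dy)
            sads_computed += 1
            if s < best_sad:
                best_mv, best_sad = (dx, dy), s
    return best_mv, best_sad, sads_computed


rng = random.Random(0)
H = W = 40
ref = [[rng.randrange(256) for _ in range(W)] for _ in range(H)]
cur = [[ref[(r + 1) % H][(c + 2) % W] for c in range(W)] for r in range(H)]
print(search_with_elimination(cur, ref, 12, 12, 4))
```

Because a skipped candidate has a SAD no smaller than the current best, the minimum SAD is the same as that of a plain full search; only the number of SAD computations changes.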
![Page 40: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/40.jpg)
Scheme One - Flow Chart
![Page 41: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/41.jpg)
Multilevel 8-bit Partial Sums
16 X 16
![Page 42: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/42.jpg)
Multi-level Visualisation
![Page 43: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/43.jpg)
Multi-level Visualisation
![Page 44: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/44.jpg)
Multi-level Visualisation (ctd)
![Page 45: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/45.jpg)
Multi-level Visualisation (ctd)
![Page 46: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/46.jpg)
Multi-level Visualisation (ctd)
![Page 47: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/47.jpg)
Multi-level Visualisation (ctd)
![Page 48: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/48.jpg)
Multi-level Visualisation (ctd)
![Page 49: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/49.jpg)
Partial Sum Pyramid
[Figure: partial sum pyramid with four levels, Level 1 to Level 4; tiers labelled 8 x 16, 4 x 16, 2 x 16 and 1 x 16]
![Page 50: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/50.jpg)
Multilevel 8-bit Partial Sums - Upper Bound (UB)
![Page 51: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/51.jpg)
Scheme Two Algorithm
• Step 1) Initialization
a) Compute all of the 8-bit partial sums of levels one and four for the current frame and save them in a contiguous memory space.
b) Retrieve all of the 8-bit partial sums of levels one and four for the reference frame from a saved contiguous memory space.
![Page 52: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/52.jpg)
Scheme Two Algorithm (ctd)
• Step 2) For every current block, execute the block motion-estimation process.
– Step 2.1) Initialization
![Page 53: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/53.jpg)
Scheme Two Algorithm (ctd)
– Step 2.2) Search
• For (each search location in a motion-estimation algorithm)
![Page 54: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/54.jpg)
Scheme Two- Flow Chart
![Page 55: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/55.jpg)
Possible Conditions
Condition 1:
Condition 2:
Condition 3:
Condition 4:
![Page 56: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/56.jpg)
Possible Combinations
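The deck later states that the four-level case allows 15 possible methods. One reading, which is an assumption here because the slide content is not reproduced in the text, is that each method uses a non-empty subset of the four levels: 2⁴ − 1 = 15. The enumeration below illustrates that count; Scheme Two's "levels one and four" would then be the subset (1, 4).

```python
from itertools import combinations

levels = (1, 2, 3, 4)
# Every non-empty subset of the four partial-sum levels (assumed reading).
methods = [subset
           for size in range(1, len(levels) + 1)
           for subset in combinations(levels, size)]

print(len(methods))  # 15
for subset in methods:
    print(subset)
```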
![Page 57: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/57.jpg)
Results
AVERAGE EXECUTION TIME (IN MILLISECONDS) PER FRAME FOR VARIOUS METHODS
![Page 58: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/58.jpg)
Possible Combinations
![Page 59: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/59.jpg)
SIMD
![Page 60: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/60.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING FSA
![Page 61: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/61.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING SEA
![Page 62: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/62.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING 3SSA
![Page 63: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/63.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING 4SSA
![Page 64: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/64.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING UDSA
![Page 65: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/65.jpg)
COMPUTATIONAL COMPLEXITY AND AVERAGE NUMBER OF CPU CYCLES PER BLOCK USING HBSA
![Page 66: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/66.jpg)
THE PERCENTAGE OF SPEEDUP OFFERED BY SIMD IMPLEMENTATION FOR A MOTION ESTIMATION ALGORITHM WITH SCHEME 2 INCORPORATED
![Page 67: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/67.jpg)
Conclusion
A new technique of 8-bit partial sums was introduced.
The partial sums make the best use of the SIMD architecture and hence improve the speed of motion estimation algorithms.
Since these partial sums have only 8 bits, eight of them can be processed concurrently using a single 64-bit SIMD register.
![Page 68: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/68.jpg)
Conclusion
The notion of the 8-bit partial sums has then been extended to the four-level case, and it has been shown that there are 15 possible methods of utilizing these multilevel partial sums to accelerate the block motion-estimation algorithms without any loss of accuracy.
The full-search algorithm has then been used to determine which one of these 15 methods provides the lowest computational complexity, so that it can be chosen to accelerate various motion-estimation algorithms.
![Page 69: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/69.jpg)
Conclusion
Extensive simulations have been carried out to find the average number of CPU cycles needed per block for various algorithms incorporating the chosen method.
These simulations have shown that the proposed scheme is capable of providing a substantial speed-up for the various existing motion-estimation algorithms through the reduction of their computational complexities.
The simulation results also demonstrate that the implementation on an SIMD architecture can further accelerate the proposed scheme by more than 93%.
![Page 70: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/70.jpg)
![Page 71: Fast block motion estimation with 8 bit partial sums using SIMD architecture](https://reader035.vdocument.in/reader035/viewer/2022062512/554a42a7b4c90582328b52f2/html5/thumbnails/71.jpg)