

1168 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 12, NO. 12, DECEMBER 2002

A Novel Cross-Diamond Search Algorithm for Fast Block Motion Estimation

Chun-Ho Cheung and Lai-Man Po

Abstract: In block motion estimation, search patterns with different shapes or sizes and the center-biased characteristics of motion-vector distribution have a large impact on the searching speed and quality of performance. In this paper, we propose a novel algorithm using a cross-search pattern as the initial step and large/small diamond search (DS) patterns as the subsequent steps for fast block motion estimation. The initial cross-search pattern is designed to fit the cross-center-biased motion-vector distribution characteristics of real-world sequences by evaluating the nine relatively more probable candidates located horizontally and vertically at the center of the search grid. The proposed cross-diamond search (CDS) algorithm employs the halfway-stop technique and finds small motion vectors with fewer search points than the DS algorithm while maintaining similar or even better search quality. The improvement of CDS over DS can be up to a 40% gain in speed. Experimental results show that CDS is much more robust, and provides faster searching speed and smaller distortions than other popular fast block-matching algorithms.

Index Terms: Block-matching motion estimation, cross-center-biased characteristic, cross-diamond search algorithm.

I. INTRODUCTION

Block-matching motion estimation is the cardinal process for many motion-compensated video coding standards [1]–[5], in which temporal redundancy between successive frames is efficiently removed. It divides frames into equally sized rectangular blocks and finds the displacement of the best-matched block from the previous frame as the motion vector to the block in the current frame within a search window. However, motion estimation can be very computationally intensive and can consume up to 80% of the computational power of the encoder if all possible candidate blocks are exhaustively evaluated. In the last two decades, many fast block-matching algorithms (BMAs) have been proposed to alleviate the heavy computations consumed by the brute-force full-search algorithm (FS), such as the three-step search (3SS) [6], the new three-step search (N3SS) [7], the four-step search (4SS) [8], the block-based gradient descent search (BBGDS) [9], and the diamond search (DS) [10], [11] algorithms.

In 3SS, N3SS, 4SS, and BBGDS, rectangular search patterns of different sizes are employed. Owing to the center-biased global minimum motion vector distribution characteristics, more than 80% of the blocks can be regarded as stationary or quasi-stationary blocks and most of the motion vectors are enclosed in the central 5 × 5 area for a ±7 search window (as depicted in Fig. 1). Based on this phenomenon in real-world sequences, N3SS extends the first step of 3SS to evaluate eight extra neighboring candidates and employs a halfway-stop technique to achieve significant speedup on sequences with (quasi-)stationary blocks, while 4SS and BBGDS just use smaller square patterns to fit the center-biased motion-vector distribution characteristics of real-world sequences. Among them, DS employs a diamond-shaped pattern and results in fewer search points with similar distortion performance as compared to N3SS and 4SS. Basically, DS performs block-matching just like 4SS. It rotates the square-shaped search pattern by 45° to form a diamond-shaped one, with its size kept unchanged throughout the search until the new minimum block distortion measure (BDM) reaches the center of the diamond. The faster searching speed of DS can be attributed to: 1) the diamond-shaped pattern, which tries to behave as an ideal circle-shaped coverage for considering all possible directions of the motion vector under investigation and 2) fewer checking points in the final converging step (only four instead of eight, as compared to square-shaped pattern BMAs like N3SS and 4SS).

Fig. 1. Average MVP distribution: (a) over the search window ±7 for the six CIF/SIF sequences and (b) over the search window ±15 for the three CCIR601 sequences.

Without exception, all conventional fast BMAs are based on the assumption of a convex, unimodal error surface of the BDM [12]: the BDM of the matching blocks increases monotonically away from the global minimum distortion. To reduce the chance of being trapped by local minima, DS keeps an unrestricted number of steps, instead of step-size convergence, while advancing to the subsequent optimal point of the search pattern. In this paper, we propose a novel fast BMA called the cross-diamond search (CDS) algorithm by introducing a cross-shaped search pattern (CSP) as the initial step, instead of the diamond-shaped one, to the DS algorithm. Experimental simulations show that it can achieve fewer search points than other fast BMAs and can maintain similar or even smaller distortion error.

The rest of the paper is organized as follows. Section II analyzes the motion vector distribution characteristics. The CDS algorithm is described in Section III. Section IV reports the significant experimental results on CDS, and conclusions are given in Section V.

TABLE I. IMAGE SEQUENCES USED FOR ANALYSIS

Manuscript received August 15, 2001; revised July 16, 2002. This work was supported by a grant from City University of Hong Kong, Hong Kong SAR, China (Project 7001129). This paper was recommended by Associate Editor D. W. Lin.

The authors are with the Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong, China (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TCSVT.2002.806815

1051-8215/02$17.00 © 2002 IEEE

II. CROSS-CENTER BIASED (CCB) MOTION VECTOR DISTRIBUTIONS

For an in-depth analysis of the motion vector probability (MVP) distributions, nine well-known sequences with different motion content, listed in Table I, are exhaustively simulated using FS with spiral block-matching order and mean absolute distortion (MAD) as the BDM. Videoconferencing sequences such as “Miss America”, “Salesman”, and “Claire” consist of gentle, smooth, and low-motion content with a (quasi-)stationary background. Fast motion with camera zooming can be found in the sequence “Tennis”, while camera panning with translation and vigorous motion content can be found in the sequences “Garden” and “Football”, respectively. As the latter three sequences involve higher motion content, both SIF and CCIR601 formats are included. The notations of probabilities used for measuring motion vectors and different patterns are listed in Table II. The average MVP distributions using the six SIF/CIF sequences with a ±7 search window and the three CCIR601 sequences with a ±15 search window are accumulated at the corresponding positions of one-fourth of the search window, and are tabulated in Tables III and IV, respectively.
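The exhaustive matching used to collect these statistics can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the ring-by-ring visiting order (a simple stand-in for spiral order), and the synthetic test frames are all assumptions made for the example.

```python
import random

def mad(cur, ref, bx, by, dx, dy, n):
    """Mean absolute distortion between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)

def full_search(cur, ref, bx, by, n, w):
    """Exhaustively test every displacement within +/-w; return (mv, mad)."""
    height, width = len(ref), len(ref[0])
    # Visit candidates ring by ring from the center outward. This is an
    # assumed approximation of spiral order, not the authors' exact scan.
    candidates = sorted(
        ((dx, dy) for dy in range(-w, w + 1) for dx in range(-w, w + 1)),
        key=lambda v: max(abs(v[0]), abs(v[1])),
    )
    best_mv, best_cost = None, None
    for dx, dy in candidates:
        # Skip candidates whose block would fall outside the reference frame.
        if not (0 <= bx + dx <= width - n and 0 <= by + dy <= height - n):
            continue
        cost = mad(cur, ref, bx, by, dx, dy, n)
        # Strict "<" keeps the earlier (more central) candidate on ties.
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

def make_shifted_pair(size, shift, seed=0):
    """Hypothetical helper: a random reference frame and a current frame
    equal to the reference displaced by `shift` (with wrap-around)."""
    rng = random.Random(seed)
    ref = [[rng.randrange(256) for _ in range(size)] for _ in range(size)]
    sx, sy = shift
    cur = [[ref[(y + sy) % size][(x + sx) % size] for x in range(size)]
           for y in range(size)]
    return cur, ref

if __name__ == "__main__":
    cur, ref = make_shifted_pair(24, (2, -1))
    print(full_search(cur, ref, bx=8, by=8, n=8, w=4))
```

Counting how often each returned vector occurs over all blocks and frames, and dividing by the number of blocks, would give an MVP table of the kind discussed in this section.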

TABLE II. NOTATIONS OF PROBABILITY USED FOR MEASURING MOTION VECTORS AND DIFFERENT PATTERNS

TABLE III. AVERAGE MVP DISTRIBUTION MEASURED AT ABSOLUTE DISTANCE FROM THE CENTER OF THE SEARCH GRID USING SIX CIF/SIF IMAGE SEQUENCES FOR SEARCH WINDOW ±7

TABLE IV. AVERAGE MVP DISTRIBUTION MEASURED AT ABSOLUTE DISTANCE FROM THE CENTER OF THE SEARCH GRID USING THREE CCIR601 IMAGE SEQUENCES FOR SEARCH WINDOW ±15

A square-shaped search grid, which originates at the central zero motion position/vector (ZMP/ZMV), is usually employed as a “square-shaped region” with a radius¹ that covers the search range from negative to positive search points in both horizontal and vertical directions. Thus, with r denoting the radius in pels, the square occurrence refers to the sum of MVP for which the search points are located on the square region r pels away from the ZMP in different directions. In addition, the cumulated square occurrence refers to the total MVP of the central (2r + 1) × (2r + 1) area, i.e., the square-center-biased (SCB) occurrence. Similarly, the diamond occurrence is the sum of MVP located on the diamond-shaped region with orthogonal vertices r pels from the ZMP. The cumulated diamond occurrence then refers to the total MVP of the central diamond-shaped area, i.e., the diamond-center-biased (DCB) occurrence. On the other hand, the cross occurrence refers to the sum of the four MVP located on a cross-shaped region r pels orthogonally away from the ZMP. A cross-shaped region with a radius of 2 pels is shown in Fig. 3(a). The cumulated cross occurrence then refers to the total MVP along the CSP at the center with an r-pel wing from the ZMP, i.e., the CCB occurrence.

¹In the discrete rectangular search grid, “radius” usually refers to the displacement by the same number of pels (checking points) in different directions away from the central zero motion vector position.
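The occurrence definitions above can be illustrated with a short sketch. The function names and the dictionary representation of the MVP table (displacement mapped to probability) are assumptions for illustration; the uniform toy distribution is not data from the paper.

```python
from fractions import Fraction

def square_occurrence(mvp, r):
    """Sum of MVP on the square ring r pels from the ZMP."""
    return sum(p for (dx, dy), p in mvp.items() if max(abs(dx), abs(dy)) == r)

def diamond_occurrence(mvp, r):
    """Sum of MVP on the diamond ring whose vertices are r pels from the ZMP."""
    return sum(p for (dx, dy), p in mvp.items() if abs(dx) + abs(dy) == r)

def cross_occurrence(mvp, r):
    """Sum of the (up to four) MVP r pels orthogonally away from the ZMP."""
    return sum(p for (dx, dy), p in mvp.items()
               if abs(dx) + abs(dy) == r and (dx == 0 or dy == 0))

def cumulated(occurrence, mvp, r):
    """Total MVP of the central region up to radius r (SCB, DCB, or CCB)."""
    return sum(occurrence(mvp, k) for k in range(r + 1))

if __name__ == "__main__":
    # Toy distribution: uniform over the central 5 x 5 area (illustrative only).
    uniform = {(dx, dy): Fraction(1, 25)
               for dx in range(-2, 3) for dy in range(-2, 3)}
    print(cumulated(square_occurrence, uniform, 2))   # 25 of 25 positions
    print(cumulated(diamond_occurrence, uniform, 2))  # 13 of 25 positions
    print(cumulated(cross_occurrence, uniform, 2))    # 9 of 25 positions
```

With radius 2, the three cumulated regions contain 25, 13, and 9 positions, which correspond to the central 5 × 5 square, the diamond-shaped area, and the nine-point cross, respectively.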

In Table III, about 81.80% of the motion vectors are found in the central 5 × 5 area, i.e., within a radius of 2 pels. Moreover, 77.52% and 74.76% of the motion vectors are found in the DCB and CCB distributions, respectively. Thus, most real-world sequences are gentle, smooth, and slowly varying, and can be regarded as stationary or quasi-stationary. Fig. 2 shows the three biased characteristics of the distribution with a radius of 2 pels using a ±7 search window. Similarly, about 66.79% of the motion vectors fall in the SCB distribution with a 4-pel radius for CCIR601 using a ±15 search window, as shown in Table IV. In addition, about 61.72% and 55.89% fall in the DCB and CCB MVP distributions, respectively.

As indicated in Table III, four dominant features of the MVP distribution can be concluded.

1) The global optimum distribution is SCB within a few pels, especially at the ZMV (0, 0).

2) MVP usually decreases² away from the ZMP.

3) The probabilities of optimal motion vectors found along the vertical and horizontal directions are often much higher than those at other locations with the same radius, and are regarded as the CCB MVP distribution. For example, about 15.71% and 7.94% of the motion vectors are found in the vertical and horizontal directions at a radius of 1 pel from the ZMP. These are much higher than the diagonal positions, which in total contribute about 2.76% at the same radius.

4) Inheriting from (2), the CCB/DCB/SCB MVP distribution characteristics drop gradually away from the ZMP.

Fig. 2. Over 96% of the motion vector distribution possesses CCB characteristics in the central 5 × 5 DCB area.

²When using the larger search window ±15, MVP is found to grow dramatically at the boundary of the search window, as indicated in Table IV. This may result from the final selection of hard-to-match blocks, or from a misdirected search for the most likely candidate along the boundary by the spiral block-matching behavior within the search region. To minimize this violation of feature (2), the “smaller than (<)” comparison operator is employed, instead of “smaller than or equal to (≤)”, in the replacement process of minima for each subsequent step. Although it does not induce any numerical error, it keeps the center-biased behavior.

In this section, three conditional probabilities between the different biased behaviors, each evaluated at the same radius, are defined to find the most dominant characteristics, as follows:

$$\text{Probability}(\mathrm{DCB} \mid \mathrm{SCB}) = \frac{\text{Probability}(\mathrm{DCB})}{\text{Probability}(\mathrm{SCB})} \tag{1}$$

$$\text{Probability}(\mathrm{CCB} \mid \mathrm{DCB}) = \frac{\text{Probability}(\mathrm{CCB})}{\text{Probability}(\mathrm{DCB})} \tag{2}$$

$$\text{Probability}(\mathrm{CCB} \mid \mathrm{SCB}) = \frac{\text{Probability}(\mathrm{CCB})}{\text{Probability}(\mathrm{SCB})} \tag{3}$$

Based on the conditional probability of DCB given SCB, as shown in Table III, the MVP distribution possesses 94.76% DCB characteristics in the 5 × 5 SCB area and stays steadily above 90% away from the ZMP. Thus, DS provides promising quality and searching speed, as described in [10] and [11]. Besides, if we further investigate the relationships between the three biased conditional probabilities away from the ZMP, we obtain the following inequalities at the critical radius of 2 pels:

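$$\text{Probability}(\mathrm{CCB} \mid \mathrm{DCB}) > \text{Probability}(\mathrm{DCB} \mid \mathrm{SCB}) > \text{Probability}(\mathrm{CCB} \mid \mathrm{SCB}) \quad \text{for } r \le 2 \tag{4}$$

$$\text{Probability}(\mathrm{DCB} \mid \mathrm{SCB}) > \text{Probability}(\mathrm{CCB} \mid \mathrm{DCB}) > \text{Probability}(\mathrm{CCB} \mid \mathrm{SCB}) \quad \text{for } r > 2 \tag{5}$$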

From (4), within the central 5 × 5 area, the CCB characteristics in the DCB distribution are found to have even higher probabilities than the DCB ones in the SCB distribution. This implies that there is still room for further optimization with different search patterns, such as a CSP over the diamond-shaped one. A central CSP with a radius of up to 2 pels could work more efficiently at finding small motion vectors than the diamond-shaped one. In contrast, the central diamond-shaped pattern with a radius beyond 2 pels works more efficiently on larger motion vectors, as indicated by (5). On the other hand, the CCB characteristics in both the DCB and SCB distributions decrease smoothly away from the ZMP. As the CCB probability within the DCB distribution is always about 5% higher than that within the SCB distribution for the central 5 × 5 area, real-world sequences possess higher CCB motion vector distribution characteristics in the central DCB area than in the SCB one. Thus, the CSP with a radius of 2 pels, as shown in Fig. 3(a), is proposed to further optimize the diamond-shaped pattern of DS, instead of other BMAs with rectangular search patterns. For example, CCB behavior achieves over 96% in the DCB distribution, while it only gives about 91% in the SCB distribution, within a radius of 2 pels. Similar features and CCB distribution characteristics can also be observed in Table IV. Over 98% of the motion vectors are found to be CCB in the DCB distribution with a radius of 2 pels, much higher than in the SCB area (90.24%). In Fig. 1, the CCB motion vector distribution characteristics are evidently seen alongside the DCB and SCB ones. The CCB distribution is observed to be the most dominant center-biased behavior.
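The percentages quoted above can be re-derived from the cumulated occurrences stated earlier in this section. The sketch below assumes that each conditional probability is simply the ratio of the inner region's cumulated occurrence to the outer region's, which is how the quoted figures appear to be obtained; the variable and function names are illustrative.

```python
# Cumulated occurrences (percent) quoted in the text.
# Radius 2, search window +/-7 (Table III):
SCB_2, DCB_2, CCB_2 = 81.80, 77.52, 74.76
# Radius 4, search window +/-15 (Table IV):
SCB_4, DCB_4, CCB_4 = 66.79, 61.72, 55.89

def conditional(inner, outer):
    """P(inner | outer), assuming the inner region lies inside the outer one."""
    return inner / outer

p_dcb_given_scb = conditional(DCB_2, SCB_2)  # DCB share of the 5 x 5 SCB area
p_ccb_given_dcb = conditional(CCB_2, DCB_2)  # CCB share of the DCB area
p_ccb_given_scb = conditional(CCB_2, SCB_2)  # CCB share of the SCB area

if __name__ == "__main__":
    for name, value in [("DCB|SCB", p_dcb_given_scb),
                        ("CCB|DCB", p_ccb_given_dcb),
                        ("CCB|SCB", p_ccb_given_scb)]:
        print(f"{name}: {100 * value:.2f}%")
```

At radius 2 the CCB share of the DCB area exceeds the DCB share of the SCB area, while the radius-4 CCIR601 figures show the opposite ordering, in line with the discussion of small versus larger motion vectors.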

III. CDS ALGORITHM

A. Cross-Diamond Searching Patterns

The DS algorithm uses a large diamond-shaped pattern (LDSP) and a small diamond-shaped pattern (SDSP), as depicted in Fig. 3(b). As the motion vector distribution possesses over 96% CCB characteristics in the central 5 × 5 DCB area, a CSP, as shown in Fig. 3(a), is proposed as the initial step to the DS algorithm, and the result is termed the CDS algorithm.
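As a rough illustration, the three patterns can be written as sets of offsets from the pattern center. The coordinates below follow the nine-point, radius-2 cross described in the text and the usual nine-point and five-point diamonds of DS; the helper names are assumptions for this sketch.

```python
# Offsets (dx, dy) relative to the pattern center.
CSP = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
       (2, 0), (-2, 0), (0, 2), (0, -2)}
LDSP = {(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
        (1, 1), (1, -1), (-1, 1), (-1, -1)}
SDSP = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}

def shift(pattern, center):
    """Place a pattern at `center` and return the absolute positions."""
    cx, cy = center
    return {(cx + dx, cy + dy) for dx, dy in pattern}

def new_points(pattern, center, checked):
    """Positions of the shifted pattern that have not been checked yet."""
    return shift(pattern, center) - checked

if __name__ == "__main__":
    # Points the central LDSP adds on top of the central CSP.
    print(sorted(LDSP - CSP))
    # New points when an LDSP moves to one of its vertices or faces.
    print(len(new_points(LDSP, (2, 0), LDSP)))
    print(len(new_points(LDSP, (1, 1), LDSP)))
```

The central LDSP adds only the four diagonal points to the CSP, and moving an LDSP to a vertex or a face of the previous one requires five or three new points, respectively.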

B. The CDS Algorithm

Fig. 3. Search patterns used in the CDS algorithm. (a) CSP. (b) LDSP and SDSP.

CDS differs from DS by: 1) performing a CCB CSP in the first step and 2) employing a halfway-stop technique for quasi-stationary or stationary candidate blocks. The CDS algorithm is summarized below.

Step (i) (Starting): A minimum BDM is found from the nine search points of the CSP located at the center of the search window. If the minimum BDM point occurs at the center of the CSP, the search stops. [This is called the first-step-stop, as shown in Fig. 5(a).] Otherwise, go to Step (ii).

Step (ii) (Half-diamond Searching): Two additional search points of the central LDSP closest to the current minimum of the central CSP are checked, i.e., two of the four candidate points located at (±1, ±1). If the minimum BDM found in the previous step is located at the middle wing of the CSP, i.e., (±1, 0) or (0, ±1), and the new minimum BDM found in this step still coincides with this point, the search stops. [This is called the second-step-stop, e.g., Fig. 5(b).] Otherwise, go to Step (iii).

Step (iii) (Searching): A new LDSP is formed by repositioning the minimum BDM found in the previous step as the center of the LDSP. If the new minimum BDM point is still at the center of the newly formed LDSP, then go to Step (iv) (Ending); otherwise, this step is repeated.

Step (iv) (Ending): With the minimum BDM point in the previous step as the center, a new SDSP is formed. Identify the new minimum BDM point from the four new candidate points,³ which is the final solution for the motion vector.

Fig. 4. Flowchart of the CDS algorithm.

The flowchart of the CDS algorithm is shown in Fig. 4. Four typical examples are shown in Fig. 5.
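Steps (i) to (iv) can be sketched as follows. This is a minimal reading of the steps rather than the authors' code: the `cost` callback (standing in for the BDM of a candidate displacement), the tie-breaking by strict comparison, and the handling of points outside the ±w window are assumptions of the sketch.

```python
CSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
       (2, 0), (-2, 0), (0, 2), (0, -2)]
LDSP = [(0, 0), (2, 0), (-2, 0), (0, 2), (0, -2),
        (1, 1), (1, -1), (-1, 1), (-1, -1)]
SDSP = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]

def cds(cost, w):
    """Cross-diamond search sketch.

    cost(dx, dy) returns the BDM of displacement (dx, dy); w is the search
    range. Returns (motion_vector, number_of_distinct_points_checked).
    """
    checked = {}

    def bdm(p):
        # Each position is evaluated at most once.
        if p not in checked:
            checked[p] = cost(p[0], p[1])
        return checked[p]

    def best_around(center, offsets):
        best = center
        for ox, oy in offsets:
            p = (center[0] + ox, center[1] + oy)
            if max(abs(p[0]), abs(p[1])) > w:
                continue  # outside the search window
            if bdm(p) < bdm(best):  # strict "<" favors the current minimum
                best = p
        return best

    # Step (i): cross-shaped pattern at the window center.
    first = best_around((0, 0), CSP)
    if first == (0, 0):
        return first, len(checked)  # first-step-stop

    # Step (ii): the two LDSP face points closest to the current minimum.
    bx, by = first
    if by == 0:
        sx = 1 if bx > 0 else -1
        half = [(sx, 1), (sx, -1)]
    else:
        sy = 1 if by > 0 else -1
        half = [(1, sy), (-1, sy)]
    second = first
    for p in half:
        if bdm(p) < bdm(second):
            second = p
    if second == first and abs(bx) + abs(by) == 1:
        return second, len(checked)  # second-step-stop

    # Step (iii): repeat LDSP until the minimum stays at its center.
    center = second
    while True:
        nxt = best_around(center, LDSP)
        if nxt == center:
            break
        center = nxt

    # Step (iv): final SDSP refinement.
    return best_around(center, SDSP), len(checked)

if __name__ == "__main__":
    # Synthetic unimodal error surface with its minimum at (-1, 0).
    print(cds(lambda dx, dy: (dx + 1) ** 2 + dy ** 2, w=7))
```

On a synthetic bowl-shaped surface, a minimum at the origin stops after the nine CSP points and a minimum at (−1, 0) stops after eleven points, matching the first-step-stop and second-step-stop counts given in the analysis that follows.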

C. Analysis of CDS Algorithm

Fig. 5(a) and 5(b) show two halfway-stop examples, in which the CDS algorithm checks only nine (first-step-stop) and 11 (second-step-stop) search points, respectively. They lead to theoretical speedups of 25 and 20.5 times, respectively, as compared to the 225 checking points used in the FS algorithm. By exploiting the CCB motion-vector distribution characteristics, CDS optimizes DS and reduces computations significantly, especially for low bit-rate video applications, by tackling image sequences with: 1) gentle or no motion, such as background information, which is reasonably handled by the first-step-stop and 2) small motion, for which a more accurate estimation is accomplished by the second-step-stop. On the other hand, the only case in which the CDS algorithm performs as the DS algorithm is when the minimum BDM point falls neither on the center nor on any point of the middle wing of the CSP. CDS usually keeps advancing between successive LDSPs by three or five new points, as in the DS algorithm, after it proceeds to the third or fourth step, if necessary. At the beginning of CDS, there are two special cases in which the number of new points used with the LDSP differs from that of the DS algorithm.

1) Only two new points closest to the minimum BDM found in the first step are used to form half of the central LDSP, for example, for the motion vector found by the second-step-stop shown in Fig. 5(b).

2) CDS uses four new points to form the LDSP for completing the third step if the minimum BDM is not located at the CSP center in the first step and is then found at one of the two new points on the diamond face of the half LDSP in the second step, as shown in Fig. 5(c) and 5(d).
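The theoretical speedups quoted in this analysis follow directly from the point counts. A small worked check, assuming a ±7 window so that FS evaluates (2·7 + 1)² candidates (the function names are illustrative):

```python
def full_search_points(w):
    """Number of candidates FS checks in a +/-w search window."""
    return (2 * w + 1) ** 2

def speedup(w, points_checked):
    """Theoretical speedup of a search that checks `points_checked` points."""
    return full_search_points(w) / points_checked

if __name__ == "__main__":
    print(full_search_points(7))      # FS candidates for a +/-7 window
    print(speedup(7, 9))              # first-step-stop
    print(round(speedup(7, 11), 1))   # second-step-stop
```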

Fig. 6(a) shows the maximum number of search points saved by the CDS algorithm within the central region as compared to the DS algorithm. A savings as high as four search points per block is observed at the center position (0, 0), and one search point at the (±1, 0) and (0, ±1) positions. However, extra search points are required outside the central 3 × 3 cross-shaped area. For example, one more search point is required in the larger diamond-shaped region (excluding the central small cross-shaped region), and two and three more search points are required in the orthogonal and diagonal directions, respectively. Although the DS algorithm seems to be more efficient beyond the central 3 × 3 cross-shaped region, both the first-step-stop and second-step-stop halfway techniques employed in the CDS algorithm exploit the highly probable CCB characteristics. The statistical average gain in the number of search points per block of CDS over the DS algorithm is quantified as follows:

$$\text{Gain in } N = \sum_{r} P_r \, S_r \tag{6}$$

where $P_r$ is the square occurrence with a radius of $r$ and $S_r$ is the average number of search points per position saved in this square region. We assume the MVP distribution is highly SCB. Then, from (6), we have

Square-center-biased gain in N = 1.22 search points per block.

In Table III, as over 96% of CCB motion vectors are enclosed in the central 3 × 3 SCB area, the small CSP and DSP with a 1-pel radius can also be regarded as 100% identical. This observation implies that real-world sequences have high probabilities around (±1, 0) and (0, ±1), and should result in a higher average gain in N than 1.22. Using the probabilities in Table III, CDS theoretically saves an average of 1.36 search points per block, as follows:

Gain in N = 1.36 search points per block.

³Some of the new points have already been checked in previous steps. This fact applies to any subsequent steps.

Fig. 5. Examples of CDS: each candidate point is marked with the corresponding step number, in which only one is found to be the minimum BDM point (filled). (a) First-step-stop with MV(0, 0). (b) Second-step-stop with MV(−1, 0). Unrestricted search path of CDS for (c) MV(−5, +2) and (d) MV(+2, −4), respectively. In (d), the best-matched point at step 6 coincides with that at steps 5 and 4.

Similarly, the maximum number of search points saved by CDS over 4SS is shown in Fig. 6(b). Although 4SS seems to perform more efficiently beyond the central 3 × 3 area, the theoretical average gain in N over 4SS in the central 3 × 3 area is about 4.06 search points per block.⁴
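The footnoted sum for the gain over 4SS can be re-evaluated numerically. The sketch below takes the square occurrences and the total points saved per square ring exactly as they appear in the footnote, and assumes, as the footnote's denominators suggest, that each total is averaged over the 8r positions of the ring at radius r (one position at r = 0). The variable names are illustrative.

```python
# Square occurrence for radius r = 0..7, as listed in the footnote.
square_occurrence = [0.454366, 0.264141, 0.09520, 0.077610,
                     0.032712, 0.022562, 0.022259, 0.026841]
# Total search points saved over all positions of each square ring
# (negative values mean CDS needs more points than 4SS there).
total_saved = [8, 24, 28, -32, -20, -120, -264, -452]

def ring_size(r):
    """Number of positions on the square ring of radius r."""
    return 1 if r == 0 else 8 * r

gain = sum(p * s / ring_size(r)
           for r, (p, s) in enumerate(zip(square_occurrence, total_saved)))

if __name__ == "__main__":
    print(f"average gain over 4SS: {gain:.2f} search points per block")
```

The positive contributions come from radii 0 to 2, while the rings beyond that subtract from the total, which is consistent with the remark that 4SS is more efficient away from the center.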

IV. EXPERIMENTAL RESULTS

The proposed CDS algorithm is simulated using the luminance of the popular video sequences listed in Table I. They consist of different degrees and types of motion content. Due to space limitations, we only present five representatives among the nine sets of rigorous simulations. The first two sequences are “Miss America” and “Sales” in CIF format. The next two are “Garden” and “Football” in SIF format, and the last one is “Tennis” in CCIR601. In all simulations, the sum of absolute differences (SAD) as the BDM, a block size of 16, and search windows of ±7 (SIF) and ±15 (CCIR601) are used. For CCIR601, half-pel accurate motion estimation is used for all BMAs.

⁴Gain in N over 4SS = 0.454366(8) + 0.264141(24/8) + 0.09520(28/16) + 0.077610(−32/24) + 0.032712(−20/32) + 0.022562(−120/40) + 0.022259(−264/48) + 0.026841(−452/56) = 4.06 search points per block.

Fig. 6. Maximum number of search points saved by CDS as compared to (a) DS and (b) 4SS.

TABLE V. PERFORMANCE COMPARISON OF CDS

The proposed CDS algorithm is compared against five traditional BMAs: FS, 3SS, 4SS, N3SS, and DS, in four aspects. They are: 1) the average number of search points per block, N, and its speedup ratio with respect to FS; 2) average MAD per pixel; 3) average distance from the true motion vector per block; and 4) probability of finding the true motion vector per block. The “true” motion vectors are regarded as those found by FS. The first two aspects measure the prediction quality and searching speed improvement. The last two show how far the result is from, and how often it finds, the true motion vectors, but they are independent of the first two aspects. That is, a motion vector far away from the optimum could even give better quality within the search area. As the frame sizes of CCIR601 sequences are four times those of SIF/CIF, motion displacements become larger and result in lower performance in the last two aspects. The following two parts present the performance comparison using the SIF/CIF format unless stated otherwise. It is noted that similar performance using CDS can be found for the CCIR601 sequences.
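The last two aspects can be computed from per-block motion vectors as sketched below. The text does not specify the distance measure, so Euclidean distance is an assumption here, as are the function names and the small example data.

```python
import math

def average_distance(true_mvs, found_mvs):
    """Average distance between FS ("true") vectors and those of a fast BMA.
    Euclidean distance is assumed; the paper does not state the measure."""
    total = sum(math.hypot(fx - tx, fy - ty)
                for (tx, ty), (fx, fy) in zip(true_mvs, found_mvs))
    return total / len(true_mvs)

def hit_probability(true_mvs, found_mvs):
    """Fraction of blocks whose vector equals the FS vector."""
    hits = sum(1 for t, f in zip(true_mvs, found_mvs) if t == f)
    return hits / len(true_mvs)

if __name__ == "__main__":
    # Made-up vectors for three blocks, for illustration only.
    fs_vectors = [(0, 0), (1, 0), (3, 4)]
    fast_vectors = [(0, 0), (1, 0), (0, 0)]
    print(average_distance(fs_vectors, fast_vectors))
    print(hit_probability(fs_vectors, fast_vectors))
```

Both measures compare vectors only, which is why they can disagree with the MAD ranking: a vector that differs from the FS result may still give a nearly identical distortion.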

A. Experimental Results on CDS

Table V compares the above four aspects among different BMAs. It shows that the proposed CDS algorithm always consumes the smallest number of search points with smaller or slightly higher MAD as compared to other fast BMAs. As compared to DS, CDS saves about 1.74–4.73 search points per block, which is much more than the theoretical estimation⁵ of 1.36 obtained in the previous section. When compared to 4SS, it also saves about 3.71–6.71 search points per block. This ordering of the average N per block is manifested for sequences with a ±7 search window. With such speed improvement, the experimental results using the first three sequences show that the CDS algorithm still achieves smaller distortion error than 3SS, 4SS, and DS in terms of MAD per pixel, while it keeps comparable results against N3SS. Similarly, CDS usually performs better than 3SS, 4SS, and DS in terms of average distance from, and probability of finding, the true motion vector, and is comparable to N3SS. For the CCIR601 sequence “Tennis”, the CDS algorithm saves about 1.73, and even up to 11.68, search points per block, as compared to DS and 4SS, respectively. In addition, it gives smaller distortion error, smaller distance, and higher probability than 3SS, 4SS, and N3SS, and provides results similar to those of DS. This implies that the CCB characteristics of CDS with its unrestricted-steps feature outperform the SCB N3SS, especially when using a larger search window on sequences with larger dimensions, in which candidate blocks may possess different motion information from different objects inside the search window. Thus, CDS is more robust than other BMAs.

The average speed improvement rate (SIR) and average MAD change, in percent, of the proposed CDS over DS are tabulated in Table VI. For videoconferencing sequences like “Miss America” and “Sales”, which are highly CCB around the ZMP, the CDS algorithm achieves 37%–40% speed improvement over DS. For sequences with a relatively higher degree of motion and similar dimensions, “Garden” and “Football”, it gives about 12%–25% speed improvement. For the CCIR601 sequence “Tennis”, about 11% speed improvement is accomplished. Besides, there is also an improvement in MAD if the sequences contain small motion, for example, about a 3% gain in quality using the sequence “Miss America”; otherwise, vigorous motion content like “Football” or larger sequences with a large search window could reasonably introduce a slight degradation in quality of less than 0.5%. Figs. 7(a) and 8(a) plot the average search points per block using the sequences “Miss America” and “Garden”, respectively, on a frame-by-frame basis. They clearly manifest the superiority of the proposed CDS algorithm to other BMAs in terms of the number of search points used. Figs. 7(b) and 8(b) plot the corresponding frame-wise MAD performance comparison among different BMAs for those two sequences. These figures show that the CDS algorithm always gives smaller distortion error in terms of MAD per pixel than 3SS, 4SS, and DS, and is comparable to N3SS.

B. Further Investigation on Search Quality

⁵This is because truncation of a search window at picture boundaries and truncation of searching patterns at window boundaries can save many search points in practice, regardless of the theoretical estimation, which considers all directions from the ZMP within the search window.

Fig. 7. Frame-wise performance comparison between different BMAs on the CIF sequence “Miss America” by (a) average number of search points and (b) the MAE per pixel.

Fig. 8. Frame-wise performance comparison between different BMAs on the SIF sequence “Garden” by (a) average number of search points and (b) the MAE per pixel.

TABLE VI. AVERAGE SIR AND AVERAGE MAD CHANGE OF CDS OVER DS

Fig. 9. Second-step halfway-stop of CDS(FD).

TABLE VII. PERFORMANCE COMPARISON OF CDS AND CDS(FD)

As the proposed CDS performs block-matching on half of the central LDSP when the minimum BDM found is not located at the center of the CSP in the first step, only three possible directions will result for the next step if no second-step-stop occurs. The search result may be biased toward half of the search window. This may mislead the block-matching or cause a long detour back to the proper direction. In this section, a modified full-diamond searching step is proposed to replace the Step (ii) half-diamond searching of CDS for investigation. The four points located on the diamond faces, i.e., (±1, ±1), of the central LDSP are checked in order to provide two more possible directions for consideration in the next step. The modified CDS is termed the CDS(FD) algorithm. Fig. 9 shows the second-step-stop of CDS(FD). CDS(FD) performs as the DS algorithm when the minimum BDM is found on any point of the LDSP in the first or modified second step if neither a first-step-stop nor a second-step-stop occurs. Table VII shows the quality improvement obtained by using CDS(FD). It shows that CDS(FD) trades off about 0.485–1.662 search points per block for less than 0.004 MAD per pixel. This again manifests the robustness of the originally proposed CDS, which gives faster searching speed.
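The difference between the half-diamond step of CDS and the full-diamond step of CDS(FD) comes down to which diamond-face points are checked in Step (ii). A minimal sketch, with an assumed function name and a `full_diamond` flag introduced only for this illustration:

```python
FACES = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def step2_candidates(first_min, full_diamond=False):
    """Diamond-face points checked in Step (ii).

    first_min is the non-center CSP point with the minimum BDM from Step (i).
    With full_diamond=False (CDS) only the two faces nearest to it are used;
    with full_diamond=True (CDS(FD)) all four faces are checked.
    """
    if full_diamond:
        return list(FACES)
    bx, by = first_min
    # Keep the faces lying on the same side as the current minimum.
    return [(x, y) for x, y in FACES if x * bx > 0 or y * by > 0]

if __name__ == "__main__":
    print(step2_candidates((2, 0)))                     # half diamond
    print(step2_candidates((0, -1)))                    # half diamond
    print(step2_candidates((0, -1), full_diamond=True)) # full diamond
```

After the nine CSP points, a second-step-stop therefore costs two more points in CDS and four more in CDS(FD), which is the source of the extra search points per block reported for CDS(FD).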

V. CONCLUSION

In this paper, the novel CCB characteristics of the motion vector distribution are exploited and compared against the DCB and SCB ones. With the CCB behavior in most real-world sequences, a novel cross-diamond search algorithm is proposed. The proposed algorithm uses a CSP with nine relatively more probable search points located in the central 5 × 5 area as the initial step, and diamond-shaped patterns as the subsequent steps. It also employs a halfway-stop technique to find small motion vectors with fewer search points. Experimental results show that the proposed algorithm not only maintains similar or even smaller distortion error, but also improves the searching speed by up to 40%, as compared to the DS algorithm. The CDS algorithm outperforms other fast BMAs and is more robust, and hence it is suitable for a wide range of video applications such as low-bit-rate videoconferencing.

REFERENCES

[1] Information Technology - Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to About 1.5 Mbit/s - Part 2: Video, ISO/IEC 11172-2 (MPEG-1 Video), 1993.

[2] Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video, ISO/IEC 13818-2 (MPEG-2 Video), 2000.

[3] Information Technology - Coding of Audio Visual Objects - Part 2: Visual, ISO/IEC 14496-2 (MPEG-4 Visual), 1999.

[4] Video Codec for Audiovisual Services at p × 64 kbits/s, ITU-T Recommendation H.261, Mar. 1993.

[5] Video Coding for Low Bit Rate Communication, ITU-T Recommendation H.263, Feb. 1998.

[6] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro, “Motion compensated interframe coding for video conferencing,” in Proc. National Telecommunications Conf., New Orleans, LA, Nov. 1981, pp. G5.3.1–G5.3.5.

[7] R. Li, B. Zeng, and M. L. Liou, “A new three-step search algorithm for block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 4, pp. 438–443, Aug. 1994.

[8] L. M. Po and W. C. Ma, “A novel four-step search algorithm for fast block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, pp. 313–317, June 1996.

[9] L. K. Liu and E. Feig, “A block-based gradient descent search algorithm for block motion estimation in video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 6, pp. 419–423, Aug. 1996.

[10] J. Y. Tham, S. Ranganath, M. Ranganath, and A. A. Kassim, “A novel unrestricted center-biased diamond search algorithm for block motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 8, pp. 369–377, Aug. 1998.

[11] S. Zhu and K. K. Ma, “A new diamond search algorithm for fast block-matching motion estimation,” IEEE Trans. Image Processing, vol. 9, pp. 287–290, Feb. 2000.

[12] J. R. Jain and A. K. Jain, “Displacement measurement and its application in interframe image coding,” IEEE Trans. Commun., vol. COM-29, pp. 1799–1808, Dec. 1981.