검색 전체 메뉴
PDF
맨 위로
OA 학술지
Fast Motion Estimation Based on a Modified Median Operation for Efficient Video Compression
  • 비영리 CC BY-NC
  • 비영리 CC BY-NC
ABSTRACT
Fast Motion Estimation Based on a Modified Median Operation for Efficient Video Compression
KEYWORD
Diamond Search , Fast motion estimation , Median predictor , MVFAST , PMVFAST
  • I. INTRODUCTION

    The block matching algorithm (BMA) is an essential part of many video coding standards, such as MPEG-1, 2, 4, and H.26x, to exploit the temporal correlation and reduce the redundancy that exists between frames of video sequences [1, 2]. In BMA, the current frame is partitioned into nonoverlapped blocks, and the best match for these blocks is found inside the search range of a reference frame with a predefined distortion criterion. The best matching block is then used as a predictor for the block in the current frame, where the displacement between the two blocks defines a motion vector (MV) of the current block. Thus, the encoder sends the MV and a residual block, which is the difference between the current block and the predictor, instead of the current block itself. This can significantly reduce the number of bits involved in video coding procedures. The most common distortion measure is the mean absolute error or mean absolute difference, or equivalently, the sum of the absolute difference (SAD), which requires no multiplication and exhibits a performance similar to that of the mean squared error [3, 4]. Let ft and ft-1 be a current frame and a reference frame, respectively. Then, for an M × N block located at (x, y) in the current frame, the SAD is given by

    where dx and dy denote the displacements in the x- and y-direction, respectively.

    The full search (FS) technique selects the MV, which points to the best matching block that has a minimum SAD, by exhaustive search at all possible points within the search range. Since the calculation of SAD at all possible locations is computationally intensive, it is difficult to directly use FS for various applications, such as digital TV, Internet streaming, and mobile video, in spite of its simple implementation. For this reason, many fast search algorithms have been suggested in an effort to reduce the computational complexity by searching only a subset of the possible search points, such as two-dimensional (2D) log search [5], three-step search (TSS) [6], new three-step search (NTSS) [7], and diamond search (DS) strategies [8, 9]. However, these methods usually incur a significant loss in the PSNR, although they can make the search faster by calculating the SAD less frequently. Moreover, these algorithms are not exactly optimal in consideration of the bits required to encode each MV, since they find MVs just by minimizing the SAD irrespective of the distribution of MVs.

    In particular, the motion vector field adaptive search technique (MVFAST) and the predictive MVFAST (PMVFAST) have been introduced in order to improve the DS technique without sacrificing PSNR [10-12]. The MVFAST considers the MVs of the spatially adjacent blocks and the PMVFAST includes the MVs of the spatially and temporally adjacent blocks as possible candidates. The priority of the initial search point in PMVFAST is given to results of the median operation for the candidates, whereas priority in DS is given to a (0, 0) motion vector, MV0. In addition, the early termination by a thresholding technique in the PMVFAST further reduces the computational complexity of the search procedures. These techniques of PMVFAST allow the video encoders to generate fewer bits for MVs with a negligible PSNR loss as well as significantly reduce the computations for search procedures, since the relatively high correlation between the MVs can be used effectively.

    In this paper, we propose a modified median (MMED) operation and effective early termination conditions for the fast motion estimation. Since the MV of a temporally colocated block is more correlated with the final MV in many cases, our approach uses the MVs of four adjacent blocks, which include the left, top, and top-right blocks of the current block and a temporally co-located block in a reference frame, whereas PMVFAST uses the MVs of three spatially adjacent blocks. To select an appropriate candidate in the initial search procedure, we introduce an MMED operator that can decrease the number of search points, and this reduces the overall computational load. Since the initial search point is chosen more appropriately, thresholds and early termination conditions can be simplified, and consequently, we achieve faster motion estimation without sacrificing PSNR compared with the case of PMVFAST. The rest of this paper is organized as follows: Section II presents a review of MVFAST and PMVFAST. In Section III, the proposed fast motion estimation technique is discussed in detail. Experimental results and conclusions are given in Sections IV and V, respectively.

    II. REVIEW OF MVFAST AND PMVFAST

    Most of the fast motion estimation algorithms first choose several candidates and then, calculate SAD at the candidate points. After the selection of the candidate with the minimum SAD, more search points around the selected candidate are tested. If the best candidate is not found in this procedure, the current candidate becomes the final MV. Therefore, the efficiency of the fast search algorithms depends on the determination of the candidates. In TSS, the number of search points does not change significantly according to the number of moving objects in the video sequences. On the other hand, in the DS that is biased to MV0, better performance can be observed when the video sequences contain a few moving objects. MVFAST and PMVFAST also use DS patterns for fast motion estimation that is adaptive to the characteristics of video sequences. We will now briefly describe MVFAST and PMVFAST, which improve the search speed significantly with a negligible PSNR loss, since the proposed method is based on them.

      >  A. Search Procedures of MVFAST

    MVFAST introduced in [10, 12] improves the speed-up of the DS algorithm significantly by initially considering a small set of candidates, which includes MV0 and the MVs of three spatially adjacent blocks (the left, top, and top-right blocks). Unlike the DS that uses only the large diamond search pattern (LDSP) shown in Fig. 1(a), MVFAST considers the small diamond search pattern (SDSP) shown in Fig. 1(b).

    Both the diamond patterns (LDSP and SDSP) keep on moving until the minimum SAD is found at the center. If the center point is selected using SDSP, the search procedure is terminated at this point. If the center point is selected using LDSP, the search procedure is carried out again using SDSP around the center of LDSP.

    In MVFAST, the selection between SDSP and LDSP depends on the motion activity that the current block will most likely have. This motion activity is characterized by the magnitude of the largest MV among the three adjacent MVs, i.e., MVleft, MVtop, and MVtop-right, as shown in Fig. 2.

    The motion activity is described as low when it is below a threshold L1, medium when it is between L1 and a second threshold L2, and high when it is larger than L2. If the motion activity is considered low, SDSP is applied with MV0 as the center. If the motion activity is medium, MV0 is selected again as the center and LDSP is applied at this time instead of SDSP. This is the only case in which LDSP is selected, whereas the DS algorithm always chooses LDSP. After testing with LDSP, we apply SDSP again at the point with the smallest SAD as the center. Finally, if the motion activity is characterized as high, the center is determined with the position of the MV that yields the minimum SAD among MV0 and the spatially adjacent MVs, and then, SDSP is applied at this point. Moreover, in order to increase the search speed, MVFAST can introduce an early termination step, which is performed initially in the algorithm, with a fixed threshold (e.g., T = 512) at the MV0 position. When the SAD at the MV0 position is lower than T, MVFAST is stopped without examining any other locations.

      >  B. Search Procedures of PMVFAST

    Unlike MVFAST whose search priority is given to MV0, the priority of PMVFAST is given to the median of three spatially adjacent MVs [11, 12]. If SAD at the point obtained by the median operation is sufficiently small (e.g., T = 256), the search is terminated early after examining just one point. Otherwise, another termination condition is considered as follows: If the MV selected by the median operation is equal to the MV of a temporally co-located block (MVt-1) and SAD at this point is smaller than SADt-1, which is SAD between the block in ft-1 and the block pointed by MVt-1 in ft-2, the search is stopped. When the algorithm is not terminated after checking the median operation, the final MV is searched by examining the five candidates, MV0, MVleft, MVtop, MVtop-right, and MVt-1. After the selection of the best candidate from among them, a threshold T1 is applied; that is, the search is stopped if SAD calculated by the best candidate is smaller than T1, which is selected adaptively as the smallest SAD among the SADs of the left, top, and top-right blocks. Otherwise, the best candidate is used as the center of the DS pattern. Furthermore, if the best candidate and MVt-1 are identical and SAD at the best candidate is smaller than SADt-1, the search is terminated.

    In PMVFAST, LDSP is selected less frequently to reduce the total number of search points. To determine which DS pattern should be used, another threshold T2, defined as T1 + 256 is applied. LDSP is chosen when T2 is larger than 1536 and the result of the median operation is MV0, whereas SDSP is used in all other cases. When the smallest SAD is obtained at the center of the pattern, the algorithm is terminated. Additionally, PMVFAST has another early termination condition as follows: If all of the MVs of three spatially adjacent blocks are identical and MVt-1 is equal to the result of the median operation, the candidates (three spatially adjacent MVs and MV0) are highly correlated with each other. In this case, the selected search pattern is applied just once. Fig. 3 shows an example of searching the best matching block where the square shapes represent the points that have the smallest SAD in each step.

    III. PROPOSED MOTION ESTIMATION

    We introduce a MMED operation to predict the appropriate starting search point, which significantly affects the performance of a fast motion estimation technique. Furthermore, the proposed algorithm reduces the number of thresholds and early termination conditions, and it consequently prevents the degradation in PSNR.

      >  A. Modified Median Operation

    PMVFAST feeds three spatially adjacent MVs to the median operator in order to determine the initial search point. However, our investigation presents that a temporally co-located MV, MVt-1 shown in Fig. 4, can be a good candidate for the starting search point, since it is highly correlated with the final MV

    Table 1 describes the average distances in pixel values between each candidate MV and the final MV as obtained by using PMVFAST. We can infer from the table that MVt-1 is more highly correlated to the final MV than the others in most of the test sequences. Therefore, we include MVt-1 in the median operation to determine the starting search point. Let (px, py) be a starting search point; then, the MMED operation is given by

    [Table 1.] Average distances in pixels between each candidate and the final MV in the case of PMVFAST

    label

    Average distances in pixels between each candidate and the final MV in the case of PMVFAST

    where i ∈ {left, top, top-right, t-1}, i.e., denotes the x-component of the starting search point selected among the MVs of the left, top, top-right, and temporally co-located blocks.

    The proposed MMED operation can deal with evennumbered elements, whereas the conventional median operation used in PMVFAST can process only odd-numbered elements. A median operation eliminates the uncorrelated values more effectively than an average operation. On the other hand, the proposed MMED operation reduces the prediction error by averaging the spatially and temporally adjacent MVs and excluding the maximum and minimum values among them, which are considered to be less correlated to the final MV.

    The starting search point for the current block at the typical position is determined as follows: When the current block is located at the origin (i.e., top-leftmost position) of the frame, MVt-1 is chosen directly as the predicted MV, since it is the only available candidate MV. When the current block lies in the first row except at the origin of the frame, two candidates, MVleft and MVt-1, are available. Since the proposed MMED operator cannot deal with two elements, MV0 is added in this case. When the current block lies in the first column, except at the origin of the frame, or in the rightmost column, three candidates are available (MVtop, MVtop-right, and MVt-1 for the first column, or MVleft, MVtop, and MVt-1 for the rightmost column) and the starting search point is selected by the median rule instead of (2).

    After the starting search point is determined, the early termination condition is applied. That is, if SAD at the point from (2) is less than T (e.g., 256), the search procedure is terminated. Furthermore, when the point from (2) is equal to MVt-1 and SAD at this point is smaller than SADt-1, the search procedure is also stopped. Table 2, which describes the frequency of the final MV determined after only one point is examined, presents that the MV from the proposed MMED operation is selected as the final MV more frequently than in the case of PMVFAST. This implies that the proposed algorithm can select the best matching block faster with a lower PSNR loss than PMVFAST.

    [Table 2.] The number of final MVs after the examination of only one point

    label

    The number of final MVs after the examination of only one point

      >  B. Determination of the Final Motion Vector

    PMVFAST examines five candidates including MV0 after the selection of the starting search point by the median operation. In fact, as shown in Table 3, MV0 is rarely selected as the final MV when all the other candidates are not equal to MV0. This implies that there is at least one MV0 among the spatially and temporally adjacent MVs when MV0 is selected as the final MV in most cases. Thus, MV0 does not need to be explicitly included in the search procedure, since it is possibly included in the spatially and temporally adjacent MVs.

    [Table 3.] Frequency and ratio that MV0 is selected when all of the spatially and temporally adjacent MVs are not equal to MV0

    label

    Frequency and ratio that MV0 is selected when all of the spatially and temporally adjacent MVs are not equal to MV0

    Consequently, the proposed algorithm includes just four candidates of MVleft, MVtop, MVtop-right, and MVt-1 in the search procedure for the final MV. Since MV0 is not tested, the total number of search points is reduced and the correlation between MVs can increase. After an examination of these four candidates, the point that has the smallest SAD is selected as the center of SDSP. This further reduces the number of search points, since the proposed algorithm does not use LDSP. When the MVs are more correlated with each other in a video frame, the number of bits needed to encode each MV is reduced, since the video encoders usually encode the difference between the current MV and the adjacent MVs. We apply an early termination condition to the selected MV in order to further increase the correlation between the MVs as follows: A threshold T1, which is limited between 512 and 1024, is set to the smallest SAD among the three spatially adjacent MVs (MVleft, MVtop, and MVtop-right). If SAD at the selected MV is smaller than T1, the search procedure is stopped. When the selected MV is equal to MVt-1 and SAD at this point is smaller than SADt-1, the search is also terminated. For the other cases, the proposed search procedures apply SDSP to find the smallest SAD. In this case, the thresholds and early termination conditions are no longer used.

    IV. EXPERIMENTAL RESULTS

    The performance of the proposed method was evaluated on the basis of [13] by using the MPEG-4 verification model encoder [14]. Test sequences of QCIF (176 × 144), CIF (352 × 288), and SIF (352 × 240) were coded as IPPP; that is, all frames were inter-coded and only the first frame was intra-coded. TM5 was adopted as the rate control algorithm. Thresholds for motion activity in MVFAST, L1 and L2 were set to 1 and 2, respectively, and the threshold for early termination in PMVFAST and the proposed method, T was set to 256. The maximum displacements of the search range were set to ±16, ±32, and ±48 pixels, respectively, in both the horizontal and the vertical direction.

    Table 4 summarizes the performance of the proposed algorithm compared with MVFAST and PMVFAST in terms of both the PSNR and the speed-up ratio on average. The speed-up ratio is obtained by comparing the number of search points of each method with that of FS. Detailed results of the proposed approach and conventional methods for each experimental condition are described in Table 5.

    [Table 4.] Results of the proposed algorithm in terms of average PSNR and speed-up ratio

    label

    Results of the proposed algorithm in terms of average PSNR and speed-up ratio

    It can be inferred from Tables 4 and 5 that the proposed algorithm significantly reduces the number of search points compared with those of conventional methods and exhibits similar or higher PSNRs than those of the FS. When the correlation between adjacent MVs increases, the video encoder can reduce the number of bits needed to encode each MV, and thus, the extra bits can be assigned for increasing PSNR. Fig. 5 shows the distribution of MV fields for the Foreman sequence from the FS and the proposed method. As shown in Fig. 5, the final MVs of the proposed algorithm are distributed more uniformly than those of the FS.

    The proposed scheme is superior to PMVFAST, which is a widely used technique for fast motion estimation, in terms of both the speed-up ratio and the PSNR, as shown in Table 5. The performance of PMVFAST can be limited, since many thresholds without considering the characteristics of MVs possibly search for inappropriate points. The proposed algorithm searches faster than conventional algorithms and its average PSNRs are higher than those of the FS.

    [Table 5.] Experimental results of the proposed algorithm compared with the conventional methods

    label

    Experimental results of the proposed algorithm compared with the conventional methods

    V. CONCLUSIONS

    This paper presents a fast motion estimation method based on a MMED operation and advanced strategies for the selection of the search points. The proposed MMED operation includes the temporally co-located MV for an initial candidate, which is more correlated with the final MV, as observed in our experiments. Moreover, we reduce the number of thresholds and simplify the early termination conditions. With these improvements, the proposed approach achieves a faster search than MVFAST and PMVFAST. The proposed algorithm also improves PSNRs compared with those of the FS, since the MVs obtained from our approach are more correlated with each other and the bit-savings in coding them can be used for improvements in PSNR. Based on the better performance in terms of fast search and PSNR, the proposed algorithm can be employed in many applications, such as digital TV, Internet streaming, and mobile video by simply modifying the existing algorithms.

참고문헌
  • 1. Ghanbari M. 2003 “Principles of video compression,” in Standard Codecs: Image Compression to Advanced Video Coding P.25-61 google
  • 2. Richardson I. E. 2003 H.264 and MPEG-4 Video Compression google
  • 3. Kim J. S., Kim J. G. 2011 “A performance comparison of blockbased matching cost evaluation models for FRUC techniques,” [International Journal of Information and Communication Engineering] Vol.9 P.671-675 google
  • 4. Sorwar G., Murshed M., Dooley L. 2004 “Fast block-based true motion estimation using distance dependent thresholds,” [Journal of Research and Practice in Information Technology] Vol.36 P.157-169 google
  • 5. Jain J. R., Jain A. K. 1981 “Displacement measurement and its application in interframe image coding,” [IEEE Transactions on Communications] Vol.29 P.1799-1808 google
  • 6. Koga T., Hirano K., Iijima Y., Ishiguro T. 1981 “Motioncompensated interframe coding for video conferencing,” [in Proceedings of the National Telecommunications Conference] P.G5.3.1-G5.3.5 google
  • 7. Li R., Zeng B., Liou M. L. 1994 “A new three-step search algorithm for block motion estimation,” [IEEE Transactions on Circuits and Systems for Video Technology] Vol.4 P.438-442 google
  • 8. Zhu S., Ma K. K. 2000 “A new diamond search algorithm for fast block matching motion estimation,” [IEEE Transactions on Image Processing] Vol.9 P.287-290 google
  • 9. Tham J. Y., Ranganath S., Ranganath M., Kassim A. A. 1998 “A novel unrestricted center-based diamond search algorithm for block motion estimation,” [IEEE Transactions on Circuits and Systems for Video Technology] Vol.8 P.369-377 google
  • 10. Hosur P. I., Ma K. K. 1999 “Motion vector field adaptive fast motion estimation,” [in Proceedings of the 2nd International Conference on Information, Communications and Signal Processing] P.7-10 google
  • 11. Tourapis A. M., Au O. C., Liou M. L. 2001 “Predictive motion vector field adaptive search technique (PMVFAST): enhancing block-based motion estimation,” [in Proceedings of the Visual Communication and Image Processing] P.883-892 google
  • 12. 2001 “Optimized visual reference software,” Text of 14496-7 PDTR, ISO/IEC N4057 google
  • 13. 1999 “Experimental conditions for evaluating encoder motion estimation algorithms,” ISO/IEC JTC1/SC29/WG11 N3141 google
  • 14. 2001 “The MPEG-4 video verification model v.18.0,” ISO/IEC JTC1/SC29/WG11 N3908 google
이미지 / 테이블
  • [ Fig. 1. ]  Diamond search patterns: (a) large diamond search pattern (LDSP) and (b) small diamond search pattern (SDSP).
    Diamond search patterns: (a) large diamond search pattern (LDSP) and (b) small diamond search pattern (SDSP).
  • [ Fig. 2. ]  Motion vectors of spatially adjacent blocks. MV: motion vector.
    Motion vectors of spatially adjacent blocks. MV: motion vector.
  • [ Fig. 3. ]  An example of the search procedure for the best matching block in predictive motion vector field adaptive search technique. MV: motion vector.
    An example of the search procedure for the best matching block in predictive motion vector field adaptive search technique. MV: motion vector.
  • [ Fig. 4. ]  Temporally co-located motion vector, where ft and ft-1 represent the current frame and the reference frame, respectively. MV: motion vector.
    Temporally co-located motion vector, where ft and ft-1 represent the current frame and the reference frame, respectively. MV: motion vector.
  • [ Table 1. ]  Average distances in pixels between each candidate and the final MV in the case of PMVFAST
    Average distances in pixels between each candidate and the final MV in the case of PMVFAST
  • [ Table 2. ]  The number of final MVs after the examination of only one point
    The number of final MVs after the examination of only one point
  • [ Table 3. ]  Frequency and ratio that MV0 is selected when all of the spatially and temporally adjacent MVs are not equal to MV0
    Frequency and ratio that MV0 is selected when all of the spatially and temporally adjacent MVs are not equal to MV0
  • [ Table 4. ]  Results of the proposed algorithm in terms of average PSNR and speed-up ratio
    Results of the proposed algorithm in terms of average PSNR and speed-up ratio
  • [ Fig. 5. ]  Distribution of motion vector fields of the Foreman sequence for each search method: (a) full search and (b) proposed algorithm.
    Distribution of motion vector fields of the Foreman sequence for each search method: (a) full search and (b) proposed algorithm.
  • [ Table 5. ]  Experimental results of the proposed algorithm compared with the conventional methods
    Experimental results of the proposed algorithm compared with the conventional methods
(우)06579 서울시 서초구 반포대로 201(반포동)
Tel. 02-537-6389 | Fax. 02-590-0571 | 문의 : oak2014@korea.kr
Copyright(c) National Library of Korea. All rights reserved.