Parallel Full Search Algorithm for Motion Estimation on Graphic Processing Unit

Author(s): Fatma Ezzahra Sayadi*, Marwa Chouchene, Haithem Bahri, Randa Khemiri, Mohamed Atri.

Journal Name: Recent Advances in Electrical & Electronic Engineering

Volume 12, Issue 4, 2019

Background: Advances in video compression technology have been driven by the ever-increasing processing power available in software and hardware.

Methods: The emerging High-Efficiency Video Coding (HEVC) standard aims to provide a doubling in coding efficiency with respect to the H.264/AVC high profile, delivering the same video quality at half the bit rate.

Results: This gain comes at the cost of high computational complexity. In both standards, the motion estimation block presents a significant challenge in clock latency, since it consumes more than 40% of the total encoding time. For these reasons, we proposed an optimized implementation of the full search motion estimation algorithm on a low-cost NVIDIA GPU, developed in the CUDA language.

Conclusion: This optimized implementation can provide a high-performance video encoder, where the speed reaches about 85.
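
For readers unfamiliar with the technique named in the title, the following is a minimal sequential sketch of full search block matching in Python. It is not the authors' CUDA implementation: the function names `sad` and `full_search`, the sum-of-absolute-differences cost, the list-of-rows frame layout, and the block size and search range used in the example are all assumptions made for illustration. Each candidate displacement is evaluated independently of the others, which is the property a GPU implementation can exploit.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the n x n block of `ref` at (rx, ry). Illustrative cost."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range pixels and return
    (best_dx, best_dy, best_sad). Assumes the block at (bx, by) lies
    inside both frames."""
    height, width = len(ref), len(ref[0])
    # Start from the zero displacement so a valid candidate always exists.
    best = (0, 0, sad(cur, ref, bx, by, bx, by, n))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall partly outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if cost < best[2]:
                best = (dx, dy, cost)
    return best


# Hypothetical 8x8 test frames: the current frame shows the reference
# content displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]

print(full_search(cur, ref, 2, 2, 4, 2))  # prints (2, 1, 0)
```

The two nested displacement loops carry no dependency between iterations, so in a parallel version each candidate (or each block) could be assigned to its own thread, with a final reduction to pick the minimum cost.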

Keywords: Compute Unified Device Architecture (CUDA), full search algorithm, GPU, HEVC, motion estimation, NVIDIA optimization method.



Article Details

Year: 2019
Page: [317 - 323]
Pages: 7
DOI: 10.2174/2352096511666180703114137