Recent Advances in Electrical & Electronic Engineering

ISSN (Print): 2352-0965
ISSN (Online): 2352-0973

Research Article

Parallel Full Search Algorithm for Motion Estimation on Graphic Processing Unit

Author(s): Fatma Ezzahra Sayadi*, Marwa Chouchene, Haithem Bahri, Randa Khemiri and Mohamed Atri

Volume 12, Issue 4, 2019

Pages: 317-323 (7 pages)

DOI: 10.2174/2352096511666180703114137

Abstract

Background: Advances in video compression technology have been driven by the ever-increasing processing power available in software and hardware.

Methods: The emerging High-Efficiency Video Coding (HEVC) standard aims to provide a doubling in coding efficiency with respect to the H.264/AVC high profile, delivering the same video quality at half the bit rate.

Results: This gain comes at the cost of high computational complexity. In both standards, the motion estimation block is a significant latency bottleneck, since it consumes more than 40% of the total encoding time. For this reason, we propose an optimized implementation of the full search algorithm on a low-cost NVIDIA GPU, developed in CUDA.

Conclusion: This optimized implementation can provide a high-performance video encoder, with the speed reaching about 85.

Keywords: Compute Unified Device Architecture (CUDA), full search algorithm, GPU, HEVC, motion estimation, NVIDIA, optimization method.
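
The abstract names the full search algorithm without describing it. As an illustration only, the sketch below shows a generic full search block matcher on the CPU, assuming a sum of absolute differences (SAD) cost. It is not the authors' CUDA implementation: the function names `sad` and `full_search`, the frame layout (lists of rows) and the parameters are all hypothetical choices made for this example.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """SAD between the n x n block of cur at (bx, by) and of ref at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range (assumed square window).

    Returns ((dx, dy), cost) for the candidate with the lowest SAD.
    Candidates that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Toy 8 x 8 frames: the current frame is the reference shifted by (2, 1).
ref = [[16 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]
print(full_search(cur, ref, bx=2, by=2, n=2, search_range=2))  # ((2, 1), 0)
```

Every candidate position is evaluated independently of the others, which is why the exhaustive search is costly on a CPU and why it is a natural candidate for a data-parallel GPU mapping.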


© 2024 Bentham Science Publishers