Multi-Timescale Gated Neural Network for Video Recognition

Author(s): Liu Cong*, Ma Longhua, Liu Feng.

Journal Name: Recent Patents on Computer Science

Volume 10, Issue 1, 2017

Background: Deep neural network based methods have made great progress in a variety of computer vision tasks, as described in various patents. However, modeling temporal dependencies remains a challenging task when recognizing object movement in videos.

Method: In this paper, we propose a multi-timescale gated neural network for encoding temporal dependencies in videos. The model stacks multiple gated layers in a recurrent pyramid, which makes it possible to hierarchically model not just pairs of frames but also long-term dependencies across video frames. Additionally, the model incorporates Convolutional Neural Networks into its structure, which exploits the pictorial nature of the frames and reduces the number of model parameters.
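
The pyramid idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a factored multiplicative interaction between two inputs as the gated layer, omits the convolutional component and all training, and uses hypothetical names (gated_pair, pyramid_encode, init_params) and arbitrary sizes. Each level combines adjacent items from the level below, so higher levels cover progressively longer spans of frames.

```python
import numpy as np

def gated_pair(x, y, U, V, W):
    """Factored multiplicative interaction (assumed form of a gated layer):
    project both inputs onto factors, multiply element-wise, then map the
    product to a hidden code through a sigmoid."""
    factors = (U @ x) * (V @ y)
    return 1.0 / (1.0 + np.exp(-(W @ factors)))

def pyramid_encode(frames, params):
    """Apply gated_pair to adjacent items level by level.
    Each level has one item fewer than the level below it."""
    level = list(frames)
    for U, V, W in params:
        level = [gated_pair(a, b, U, V, W) for a, b in zip(level, level[1:])]
    return level

def init_params(dim, n_factors, n_levels, rng):
    """Hypothetical random initialization. Every level keeps `dim` units
    so that its outputs can feed the next level."""
    shapes = ((n_factors, dim), (n_factors, dim), (dim, n_factors))
    return [tuple(rng.normal(scale=0.1, size=s) for s in shapes)
            for _ in range(n_levels)]

# Four flattened "frames" of 16 values each; three levels reduce 4 -> 3 -> 2 -> 1.
rng = np.random.default_rng(0)
frames = [rng.normal(size=16) for _ in range(4)]
params = init_params(dim=16, n_factors=8, n_levels=3, rng=rng)
top = pyramid_encode(frames, params)
```

With T input frames and T - 1 levels, the top of the pyramid is a single code that depends on every frame, while the first level only encodes pairwise relations between neighbouring frames.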

Result: We evaluated the proposed model on the synthetic bouncing-MNIST dataset, the standard action recognition benchmark UCF101, and the facial expression benchmark CK+. The experimental results show that, on all tasks, the proposed model outperforms the existing approach to building deep stacked gated models and achieves superior performance compared with several recent state-of-the-art techniques.

Conclusion: The experimental results indicate that the proposed model is able to adapt its structure to different time scales and can be applied to tasks such as motion estimation, action recognition, and tracking.

Keywords: Deep learning, neural networks, multiplicative interactions, optimization, model temporal dependencies.

Article Details

Year: 2017
Page: [96 - 103]
Pages: 8
DOI: 10.2174/2213275910666170502144924