Background: Finding a region of interest in an image and content-based image analysis have been challenging tasks for the last two decades. With advances in image processing and computer vision, and the generation of huge amounts of image data, the Content-Based Image Retrieval (CBIR) system has attracted many researchers as a common technique to manage this data. It is an approach to searching for images of user interest based on the visual information present in an image. However, the requirement of high computation power and large memory limits the deployment of CBIR techniques in real-time scenarios.
Objective: In this paper, an advanced deep learning model is applied to CBIR on facial image data. We designed a deep convolutional neural network architecture in which the activations of the convolutional layers are used for feature representation, with max-pooling included as a feature-reduction technique. Furthermore, our model uses partial feature mapping as the image descriptor to exploit the property that facial images contain repeated information.
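The max-pooling feature reduction mentioned above can be illustrated with a minimal sketch; the window size of 2 and the toy feature map are illustrative assumptions, not parameters reported in the paper:

```python
import numpy as np

def max_pool2d(fmap, size=2):
    # Non-overlapping max-pooling: each size x size window is reduced to its
    # maximum, shrinking the spatial resolution of the feature map and hence
    # the dimensionality of the resulting descriptor.
    h, w = fmap.shape
    h2, w2 = h // size, w // size
    return fmap[:h2 * size, :w2 * size].reshape(h2, size, w2, size).max(axis=(1, 3))

# Toy 4x4 activation map standing in for a convolutional-layer output.
fmap = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool2d(fmap))  # a 2x2 map of window maxima
```

A 4x4 map reduced with a 2x2 window keeps one value per window, cutting the feature count by a factor of four while retaining the strongest activations.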
Methods: Existing CBIR approaches primarily consider colour, texture and other low-level features for mapping and localizing image segments. While deep learning has shown high performance in numerous fields of research, its application to CBIR is still very limited. The human face contains significant information for content-driven tasks and is applicable to various computer-vision and multimedia applications. In this research work, a deep learning-based model is presented for Content-Based Image Retrieval (CBIR). CBIR involves two important tasks: 1) classification and 2) retrieval of images based on similarity. For classification, a four-convolution-layer model has been proposed. For calculating similarity, the Euclidean distance between images has been used.
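The Euclidean-distance retrieval step can be sketched as follows; the feature vectors, gallery size and `top_k` value are illustrative placeholders for the descriptors the proposed network would produce:

```python
import numpy as np

def euclidean_distance(a, b):
    # L2 distance between two feature vectors.
    return np.sqrt(np.sum((a - b) ** 2))

def retrieve(query_feat, gallery_feats, top_k=3):
    # Rank gallery images by ascending Euclidean distance to the query;
    # the smallest distances correspond to the most similar images.
    dists = [euclidean_distance(query_feat, g) for g in gallery_feats]
    return np.argsort(dists)[:top_k]

# Toy 4-dimensional descriptors standing in for CNN feature vectors.
gallery = np.array([[0.0, 0.0, 0.0, 0.0],
                    [1.0, 1.0, 1.0, 1.0],
                    [0.1, 0.0, 0.1, 0.0]])
query = np.array([0.09, 0.0, 0.09, 0.0])
print(retrieve(query, gallery, top_k=2))  # indices of the 2 nearest images
```

In a full CBIR pipeline the same ranking would be applied to descriptors extracted from the convolutional layers rather than to hand-written vectors.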
Results: The proposed model is completely unsupervised, and it is faster and more accurate than other deep learning models applied to CBIR on the facial dataset. The proposed method provided satisfactory experimental results and outperforms CNN-based models such as VGG16, Inception V3, ResNet50, and MobileNet. Moreover, the performance of the proposed model has been compared with these pre-trained models in terms of accuracy, storage space and inference time.
Conclusion: The experimental analysis of the dataset has shown promising results, with more than 90% classification accuracy.