Title: Salient Features, Data and Algorithms for microRNA Screening from Plants: A Review on the Gains and Pitfalls of Machine Learning Techniques
VOLUME: 15 ISSUE: 10
Author(s): Garima Ayachit, Inayatullah Shaikh, Himanshu Pandya* and Jayashankar Das*
Affiliation: Gujarat Biotechnology Research Centre, Department of Science and Technology, Government of Gujarat, Gandhinagar, Gujarat 382011, Gujarat State Biotechnology Mission, Department of Science and Technology, Government of Gujarat, Gandhinagar, Gujarat 382011, Department of Botany, Bioinformatics and Climate Change, University School of Sciences, Gujarat University, Navrangpura, Ahmedabad – 380009, Gujarat State Biotechnology Mission, Department of Science and Technology, Government of Gujarat, Gandhinagar, Gujarat 382011
Keywords: microRNA, dataset, algorithm, feature selection, plant genomics, machine learning.
Abstract: The era of big data and high-throughput genomic technology has enabled scientists to
have a clear view of plant genomic profiles. However, it has also led to a massive need for
computational tools and strategies to interpret these data. Amid this large inflow of data,
machine learning (ML) approaches are emerging as the most promising for analysing
heterogeneous and unstructured biological datasets. With their application extending to healthcare and
agriculture, ML approaches are proving useful for microRNA (miRNA) screening as well.
Identification of miRNAs is a crucial step towards understanding post-transcriptional gene
regulation and miRNA-related pathology. The use of ML tools is becoming indispensable in
analysing such data and identifying species-specific, non-conserved miRNAs. However, these
techniques have their own benefits and limitations. In this review, we discuss the current state
and pitfalls of ML-based tools for plant miRNA identification and provide insights into the
important features, the need for deep learning models, and the directions in which further studies are needed.