The transport of molecules inside cells is an important topic, especially in drug metabolism. Experimentally
testing new proteins for transporter molecular function is expensive and inefficient because of the large
number of new peptides. Therefore, cheap and fast theoretical models are needed to predict transporter proteins.
In the current work, the primary structure of a protein is represented as a molecular Star graph, characterized by a series of
topological indices. The dataset comprised 2,503 protein chains, of which 413 have transporter molecular function
and 2,090 do not. The topological indices were used as input to several classification techniques to find
the best Quantitative Structure Activity Relationship (QSAR) model that can evaluate the transporter function of a new
protein chain. Among several feature selection techniques, Support Vector Machine Recursive Feature Elimination (SVM-RFE)
yielded a classification model based on 20 attributes, with a true positive rate of 83% and a false positive rate of
Keywords: QSAR, Star Graph, topological indices, transport protein, Support Vector Machine.
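
To make the Star graph representation more concrete, the following is a minimal Python sketch under stated assumptions. It assumes the commonly described construction in which a dummy centre node carries one branch per standard amino acid type and each residue is appended to the branch of its type in sequence order; the optional "embedded" variant additionally links consecutive residues. The function names build_star_graph and wiener_index are hypothetical, and the Wiener index is used only as a familiar example of a topological index. The abstract does not state which indices or which graph variant the authors computed.

```python
from collections import deque

# The 20 standard amino acids, one branch ("ray") of the star per type.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_star_graph(sequence, embedded=False):
    """Build an adjacency map for an assumed Star graph of a protein sequence.

    Node 0 is the dummy centre; node i (1-based) is the i-th residue.
    Each residue is attached to the previous residue of the same type,
    or to the centre if it is the first of its type.
    With embedded=True, consecutive residues are also linked (assumption).
    """
    adjacency = {0: set()}
    last_on_branch = {}  # amino acid letter -> last node placed on its branch
    for position, residue in enumerate(sequence, start=1):
        if residue not in AMINO_ACIDS:
            raise ValueError(f"unsupported residue: {residue!r}")
        adjacency[position] = set()
        parent = last_on_branch.get(residue, 0)
        adjacency[parent].add(position)
        adjacency[position].add(parent)
        last_on_branch[residue] = position
        if embedded and position > 1:
            adjacency[position - 1].add(position)
            adjacency[position].add(position - 1)
    return adjacency


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered node pairs (BFS)."""
    total = 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # each pair was counted from both endpoints


if __name__ == "__main__":
    graph = build_star_graph("ACA")
    print(wiener_index(graph))
    print(wiener_index(build_star_graph("ACA", embedded=True)))
```

In a QSAR workflow of the kind described above, one such index vector would be computed per protein chain and passed to the classifiers; that step is not shown here.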