Identification of Cancerlectins By Using Cascade Linear Discriminant Analysis and Optimal g-gap Tripeptide Composition

Author(s): Liangwei Yang, Hui Gao*, Keyu Wu, Haotian Zhang, Changyu Li, Lixia Tang

Journal Name: Current Bioinformatics

Volume 15, Issue 6, 2020

Background: Lectins are a diverse group of glycoproteins or glycoconjugate proteins that can be extracted from plants, invertebrates and higher animals. Cancerlectins, a class of lectins, play a key role in the interactions between tumor cells and are being employed as therapeutic agents. A full understanding of cancerlectins is important because it can inform the future direction of cancer therapy.

Objective: To develop an accurate, practical and time-saving tool to identify cancerlectins. A novel sequence-based method is proposed, together with an accompanying web server that provides access to the tool.

Methods: First, protein features were extracted using a new feature construction scheme termed g-gap tripeptide composition. A proposed cascade linear discriminant analysis (Cascade LDA) was then used to alleviate the difficulties of high dimensionality, with analysis of variance (ANOVA) as the feature importance criterion. Finally, a support vector machine (SVM) was used as the classifier to identify cancerlectins.
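
The feature extraction step can be illustrated with a minimal sketch. The abstract does not define g-gap tripeptide composition, so the code below assumes one common reading: a tripeptide is formed from three residues that are each separated by g intervening positions, and the feature vector holds the normalized frequency of each of the 20^3 = 8000 possible tripeptides. The function name, the alphabet ordering and the handling of non-standard residues are illustrative assumptions, not details taken from the paper.

```python
from itertools import product

# The 20 standard amino acids; the ordering is an arbitrary choice for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Map each of the 8000 possible tripeptides to a fixed vector position.
TRIPEPTIDE_INDEX = {
    "".join(t): i for i, t in enumerate(product(AMINO_ACIDS, repeat=3))
}


def g_gap_tripeptide_composition(sequence, g):
    """Return an 8000-dimensional frequency vector for a protein sequence.

    Assumption: a g-gap tripeptide consists of the residues at positions
    i, i + g + 1 and i + 2 * (g + 1), so g = 0 gives adjacent tripeptides.
    """
    step = g + 1
    counts = [0] * len(TRIPEPTIDE_INDEX)
    total = 0
    for i in range(len(sequence) - 2 * step):
        tri = sequence[i] + sequence[i + step] + sequence[i + 2 * step]
        # Skip tripeptides that contain non-standard residues such as 'X'.
        if tri in TRIPEPTIDE_INDEX:
            counts[TRIPEPTIDE_INDEX[tri]] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid tripeptide): return an all-zero vector.
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

Calling `g_gap_tripeptide_composition(seq, g)` for several values of g and comparing the downstream classification performance would be one way to choose an "optimal" gap, although the selection procedure used by the authors is not described in the abstract.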

Results: The proposed method achieved an accuracy of 91.34%, a sensitivity of 89.89%, a specificity of 92.48% and a Matthews correlation coefficient of 0.8318 using only 13 fusion features in jackknife cross-validation. This result is superior to those of other published methods in this domain.
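
For readers unfamiliar with the four reported measures, the sketch below shows how accuracy, sensitivity, specificity and the Matthews correlation coefficient are conventionally computed from a binary confusion matrix. The function name and the counts in the usage example are hypothetical and are not the counts behind the figures reported above.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification measures from confusion counts.

    tp/fn: positives (e.g. cancerlectins) predicted correctly/incorrectly.
    tn/fp: negatives predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal sum is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, for illustration only.
example = binary_metrics(tp=45, fn=5, tn=40, fp=10)
```

In jackknife (leave-one-out) cross-validation, each sample is predicted once by a model trained on all the others, and the resulting predictions are pooled into a single confusion matrix before these measures are computed.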

Conclusion: In this study, a new method based only on the primary structure of proteins is proposed, and experimental results show that it could be a promising tool for identifying cancerlectins. An open-access web server is made available to facilitate related work.

Keywords: Cancerlectin, cascade LDA, g-gap tripeptide composition, SVM, protein, ANOVA.

Article Details

Year: 2020
Published on: 11 November, 2020
Page: [528 - 537]
Pages: 10
DOI: 10.2174/1574893614666190730103156