Letters in Organic Chemistry

ISSN (Print): 1570-1786
ISSN (Online): 1875-6255

Research Article

Protein Structural Class Prediction Based on Distance-related Statistical Features from Graphical Representation of Predicted Secondary Structure

Author(s): Liang Kong*, Lichao Zhang, Xiaodong Han and Jinfeng Lv

Volume 16, Issue 4, 2019

Page: [317 - 324] Pages: 8

DOI: 10.2174/1570178615666180914110451

Abstract

Protein structural class prediction supports the analysis of protein structure and function, and an effective feature representation is a key step in this task. Prior work has shown that feature extraction based on secondary structure is effective, especially for low-similarity protein sequences, but prediction accuracy remains limited. To further exploit secondary structure information, a novel feature extraction method based on a generalized chaos game representation of predicted secondary structure is proposed. Each protein sequence is converted into a 20-dimensional vector of distance-related statistical features that characterizes the distribution of secondary structure elements and segments. The feature vectors are then fed into a support vector machine classifier to predict the protein structural class. Experiments on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640) show that the proposed method outperforms state-of-the-art methods. It is anticipated that the method could be extended to other graphical representations of protein sequences and be helpful in future protein research.
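
To make the idea of distance-related features from a chaos game representation more concrete, the following is a minimal sketch and not the authors' method. It assumes a three-state secondary structure string (H, E, C), a hypothetical triangular vertex layout, and a contraction ratio of 1/2; the function names `cgr_points` and `distance_features` are illustrative only, and the six statistics computed here are not the 20 features described in the abstract.

```python
import math

# Hypothetical vertex layout (an assumption, not taken from the paper):
# one corner of an equilateral triangle per secondary structure state.
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}
# Centroid of the triangle, used as the starting point of the trajectory.
CENTROID = (
    sum(v[0] for v in VERTICES.values()) / 3,
    sum(v[1] for v in VERTICES.values()) / 3,
)


def cgr_points(ss, ratio=0.5):
    """Map a secondary structure string (H/E/C) to chaos game points."""
    x, y = CENTROID
    points = []
    for state in ss:
        vx, vy = VERTICES[state]
        # Move a fixed fraction of the way toward the vertex of this state.
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def distance_features(ss, ratio=0.5):
    """Return [mean, std] of the distance to the centroid for H, E and C.

    Illustrative statistics only: 6 values, in the order H, E, C.
    A state that does not occur contributes 0.0 for both values.
    """
    grouped = {state: [] for state in "HEC"}
    for state, (x, y) in zip(ss, cgr_points(ss, ratio)):
        grouped[state].append(math.hypot(x - CENTROID[0], y - CENTROID[1]))

    features = []
    for state in "HEC":
        dists = grouped[state]
        if not dists:
            features.extend([0.0, 0.0])
            continue
        mean = sum(dists) / len(dists)
        # Population standard deviation of the distances.
        std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists))
        features.extend([mean, std])
    return features


if __name__ == "__main__":
    print(distance_features("CCHHHHHHCCEEEEECC"))
```

In a pipeline like the one the abstract outlines, vectors of this kind would be passed to a support vector machine classifier; the classifier is omitted here to keep the sketch dependency-free.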

Keywords: Protein structural class, sequence similarity, secondary protein structure, chaos game representation, support vector machines, DNA.


© 2024 Bentham Science Publishers