iAFP-gap-SMOTE: An Efficient Feature Extraction Scheme Gapped Dipeptide Composition is Coupled with an Oversampling Technique for Identification of Antifreeze Proteins

Author(s): Shahid Akbar, Maqsood Hayat*, Muhammad Kabir, Muhammad Iqbal.

Journal Name: Letters in Organic Chemistry

Volume 16, Issue 4, 2019


Abstract:

Antifreeze proteins (AFPs) play a distinct role in maintaining homeostasis in living organisms, protecting their cells and bodies from freezing in extremely cold conditions. Owing to the high diversity of protein sequences and structures, discriminating AFPs from non-AFPs through experimental approaches is expensive and time-consuming. It is therefore highly desirable to develop an intelligent, high-throughput computational model that identifies AFPs quickly and accurately. To this end, a new predictor called “iAFP-gap-SMOTE” is proposed for the identification of AFPs. Protein sequences are represented using three numerical feature extraction schemes, namely Split Amino Acid Composition, G-gap Dipeptide Composition, and Reduced Amino Acid Alphabet Composition. With an imbalanced dataset, a classifier is usually biased towards the majority class. The Synthetic Minority Over-sampling Technique (SMOTE) is therefore employed to increase the number of minority-class instances and control this bias. A 10-fold cross-validation test is applied to assess the success rates of the “iAFP-gap-SMOTE” model. In the empirical investigation, the “iAFP-gap-SMOTE” model obtained an accuracy of 95.02%. The comparison suggests that the accuracy of the “iAFP-gap-SMOTE” model is higher than that of existing techniques in the literature. The proposed model may therefore be helpful to the research community and academia.
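
As an illustration of the G-gap dipeptide composition named in the abstract, the following is a minimal sketch of how such a feature vector might be computed. The abstract does not give the authors' implementation, so the function name, the choice of the 20 standard amino acids, the skipping of non-standard residues, and the normalization by the number of counted pairs are all assumptions made for this example.

```python
from itertools import product

# The 20 standard amino acids (assumed alphabet for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order: AA, AC, AD, ...
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, gap=0):
    """Return a 400-dimensional frequency vector of residue pairs
    separated by `gap` intervening residues (hypothetical helper)."""
    counts = dict.fromkeys(PAIRS, 0)
    seq = sequence.upper()
    total = 0
    # Position i is paired with position i + gap + 1.
    for i in range(len(seq) - gap - 1):
        pair = seq[i] + seq[i + gap + 1]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(PAIRS)
    # Normalize so the vector sums to 1 (assumed normalization).
    return [counts[p] / total for p in PAIRS]
```

With gap=0 this reduces to the ordinary dipeptide composition; larger gaps capture correlations between residues that are further apart in the sequence. Whatever the sequence length, the vector has a fixed size of 400, which is what makes it usable as classifier input.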

Keywords: Antifreeze proteins, SMOTE, KNN, PNN, SVM, AFPs.
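
The oversampling step mentioned in the abstract can be illustrated with a simplified sketch of the core SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' code. The function name, the default k, the seed parameter, and the use of squared Euclidean distance are assumptions for illustration only.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from a list of minority-class
    feature vectors by neighbour interpolation (hypothetical helper)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        # Choose one of the k nearest neighbours at random.
        neighbour = rng.choice(others[:k])
        # Interpolate at a random position between base and neighbour.
        t = rng.random()
        synthetic.append([a + t * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies between two existing minority samples, the minority class grows without exact duplication of existing instances. In a cross-validation setting, oversampling is generally applied to the training folds only, so that synthetic points do not leak into the test fold; the abstract does not state how this was handled.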




Article Details

VOLUME: 16
ISSUE: 4
Year: 2019
Page: [294 - 302]
Pages: 9
DOI: 10.2174/1570178615666180816101653
