A Sequence-Based Predictor of Zika Virus Proteins Developed by Integration of PseAAC and Statistical Moments

Author(s): Waqar Hussain, Nouman Rasool*, Yaser D. Khan

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 23, Issue 8, 2020

Background: Zika virus (ZIKV) is a well-known global threat that has affected almost all countries in the Americas and posed a serious threat to the entire globe in 2016. The first outbreak of ZIKV was reported in 2007 in the Pacific area, followed by another severe outbreak in 2013/2014, after which ZIKV spread to all other Pacific islands. The broad spectrum of ZIKV-associated neurological malformations in neonates and adults has driven this deadly virus into the limelight. Although tremendous effort has been devoted to understanding the molecular basis of ZIKV, its viral proteins have still not been studied extensively.

Objectives: Herein, we report the first predictor for the identification of ZIKV proteins.

Methods: We have employed Chou’s pseudo amino acid composition (PseAAC), statistical moments and various position-based features.
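
To make the idea of sequence-derived features concrete, the following is a minimal sketch that combines an amino acid composition vector with simple position-weighted raw and central moments of an integer-encoded sequence. It is an illustration only, not the feature set used in ZIKVPred-PseAAC: the integer encoding, the choice of moment orders, and all function names (`encode`, `composition`, `raw_moment`, `central_moment`, `feature_vector`) are assumptions introduced here.

```python
# Illustrative sketch only: a simplified, one-dimensional stand-in for
# composition and moment features. Names and encoding are hypothetical.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # assumed encoding: A=1 ... Y=20


def encode(seq):
    """Map a protein sequence to integer codes, skipping non-standard residues."""
    return [INDEX[aa] for aa in seq.upper() if aa in INDEX]


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    codes = encode(seq)
    if not codes:
        raise ValueError("sequence contains no standard residues")
    return [codes.count(i + 1) / len(codes) for i in range(len(AMINO_ACIDS))]


def raw_moment(values, order):
    """Raw moment: sum over positions p (1-based) of p**order * value."""
    return sum((pos ** order) * v for pos, v in enumerate(values, start=1))


def central_moment(values, order):
    """Moment taken about the value-weighted centroid of the positions."""
    centroid = raw_moment(values, 1) / raw_moment(values, 0)
    return sum(((pos - centroid) ** order) * v for pos, v in enumerate(values, start=1))


def feature_vector(seq):
    """Concatenate composition (20), raw moments (orders 0-2), central moments (orders 2-3)."""
    codes = encode(seq)
    features = composition(seq)
    features += [raw_moment(codes, k) for k in range(3)]
    features += [central_moment(codes, k) for k in (2, 3)]
    return features
```

Calling `feature_vector("MKNPKK")` on a short hypothetical sequence returns a list of 25 numbers that could be passed to any standard classifier; a real predictor would use a much richer representation.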

Results: The predictor was validated through 10-fold cross-validation and jackknife testing. In 10-fold cross-validation, it achieved 94.09% accuracy, 93.48% specificity, 94.20% sensitivity and an MCC of 0.80, while in jackknife testing it achieved 96.62% accuracy, 94.57% specificity, 97.00% sensitivity and an MCC of 0.88.
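
For readers unfamiliar with the reported measures, the sketch below shows how accuracy, sensitivity, specificity and the Matthews correlation coefficient (MCC) follow from confusion-matrix counts, and how jackknife testing can be viewed as k-fold cross-validation with k equal to the number of samples. The function names (`metrics`, `k_fold_indices`, `jackknife_indices`) and the unshuffled interleaved fold assignment are assumptions made for illustration; the counts behind the figures above are not given in the abstract, so none are reproduced here.

```python
import math


def metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k interleaved, unshuffled folds."""
    if not 1 < k <= n:
        raise ValueError("k must satisfy 1 < k <= n")
    for fold in range(k):
        test = list(range(fold, n, k))
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, test


def jackknife_indices(n):
    """Jackknife (leave-one-out): every sample is held out exactly once."""
    return k_fold_indices(n, n)
```

In practice the sample order would normally be shuffled before folds are assigned; the deterministic assignment here just keeps the sketch easy to follow.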

Conclusion: Thus, the proposed predictor, ZIKVPred-PseAAC, can help identify ZIKV proteins efficiently and accurately, and can provide baseline data for the discovery of new drugs and biomarkers against ZIKV.

Keywords: ZIKV, prediction, PseAAC, 5-step rule, statistical moments, jackknife testing.

Article Details

Year: 2020
Published on: 02 November, 2020
Page: [797 - 804]
Pages: 8
DOI: 10.2174/1386207323666200428115449