Remarks on Computational Method for Identifying Acid and Alkaline Enzymes

Author(s): Hongfei Li, Haoze Du, Xianfang Wang*, Peng Gao, Yifeng Liu, Weizhong Lin

Journal Name: Current Pharmaceutical Design

Volume 26, Issue 26, 2020


Enzymes catalyze reactions thousands of times more efficiently than ordinary catalysts and are therefore widely used in industrial and medical fields. However, enzymes with a protein structure can be denatured and inactivated at high temperature or in strongly acidic or strongly alkaline environments. Most enzymes work well at pH 6-8, while some special enzymes remain active only in an alkaline environment (pH > 8) or an acidic environment (pH < 6). Identifying acidic and alkaline enzymes has therefore become a key task for industrial production. Because enzymes are so varied, determining their acidity or alkalinity by experimental methods is laborious and in some cases cannot be achieved at all. Converting protein sequences into numerical features and building computational models can identify the acidity and alkalinity of enzymes efficiently and accurately. This review summarizes progress in the numerical features used to represent proteins and in the computational methods used to identify acidic and alkaline enzymes. We hope that this paper will provide convenience, ideas, and guidance for the computational classification of acidic and alkaline enzymes.

Keywords: Amino acid composition, pseudo amino acid composition, evolutionary information, dipeptide composition, average chemical shift, feature selection techniques.
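
As a rough illustration of what converting a protein sequence into numerical features can look like, the sketch below computes two of the descriptors named in the keywords: amino acid composition (20 residue frequencies) and dipeptide composition (400 adjacent-pair frequencies). The function names, the handling of non-standard residues, and the example sequence are assumptions made for illustration; they are not taken from the review.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return 20 frequencies, one per standard residue (hypothetical helper)."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return 400 frequencies, one per adjacent residue pair (hypothetical helper).

    Pairs that contain a non-standard residue are skipped (an assumption).
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[p] / total for p in DIPEPTIDES]


if __name__ == "__main__":
    example = "MKVLAAGIVGLLLAQ"  # made-up sequence, for illustration only
    features = amino_acid_composition(example) + dipeptide_composition(example)
    print(len(features))  # 420 features: 20 + 400
```

A fixed-length vector of this kind could then be passed to a classifier. The choice of descriptors and model is left open here.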



Article Details

Year: 2020
Published on: 11 August, 2020
Page: [3105 - 3114]
Pages: 10
DOI: 10.2174/1381612826666200617170826