A Review of Recent Advances and Research on Drug Target Identification Methods

Author(s): Yang Hu*, Tianyi Zhao, Ningyi Zhang, Ying Zhang*, Liang Cheng*.

Journal Name: Current Drug Metabolism

Volume 20, Issue 3, 2019

Background: From a therapeutic viewpoint, it is crucial to understand how drugs bind to their target proteins and regulate their functions to protect against disease. The identification of drug targets plays a significant role in drug discovery and in the study of disease mechanisms. Therefore, the development of methods to identify drug targets has become an active research topic.

Methods: We systematically review recent work on identifying drug targets from the perspectives of data and methods. We compiled several of the more comprehensive databases and introduced several commonly used ones. We then divided the methods into two categories, biological experiments and machine learning, each of which is subdivided into subclasses and described in detail.

Results: Most of the new methods are machine learning algorithms. Generally, an optimal set of features is chosen from known drug targets and used to predict new targets with similar properties. The most widely used features include sequence properties, network topological features, structural properties, and subcellular locations. Since various machine learning methods exist, improving their performance requires combining a better subset of features and choosing the appropriate model for the datasets involved.
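As an illustration of the sequence-property features and feature-subset selection mentioned in the Results, the following is a minimal sketch rather than any method from the reviewed papers. It assumes amino acid composition as the sequence feature and a simple filter score (gap between class means divided by the overall spread) as the selection criterion. The function names `amino_acid_composition` and `rank_features` and the four toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

# The 20 standard amino acids, in a fixed order so feature indices are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def rank_features(rows, labels):
    """Rank feature indices, best first, by |mean(pos) - mean(neg)| / spread.

    This is a simple filter-style score used here as an assumption; real
    studies typically use more careful selection procedures.
    """
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    scores = []
    for j in range(len(rows[0])):
        spread = pstdev([r[j] for r in rows]) or 1.0  # avoid division by zero
        gap = abs(mean(r[j] for r in pos) - mean(r[j] for r in neg))
        scores.append((gap / spread, j))
    return [j for _, j in sorted(scores, reverse=True)]


# Hypothetical toy data: label 1 = known target, label 0 = non-target.
sequences = ["KKKKAL", "KKRKAG", "DDEDAL", "EEDDAG"]
labels = [1, 1, 0, 0]

features = [amino_acid_composition(s) for s in sequences]
ranking = rank_features(features, labels)
top_residues = [AMINO_ACIDS[j] for j in ranking[:3]]
print("Most discriminative residues in the toy data:", top_residues)
```

In practice, the top-ranked subset would be passed to whichever classifier suits the dataset; the sketch stops at ranking because the choice of model depends on the data involved.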

Conclusion: Experimental and computational methods have been applied increasingly to protein drug target identification in recent years. However, current biological and computational methods remain limited by unbalanced and incomplete datasets and by imperfect feature selection methods.
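To make the unbalanced-dataset limitation concrete, the sketch below shows random undersampling of the majority class. This is one common mitigation, named here as an assumption; the abstract does not attribute it to any reviewed method. The function name `undersample` and the toy labels (two targets against eight non-targets) are hypothetical.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop samples so that every class keeps as many as the rarest one."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    smallest = min(len(members) for members in by_class.values())
    balanced = []
    for label, members in by_class.items():
        for row in rng.sample(members, smallest):
            balanced.append((row, label))
    rng.shuffle(balanced)  # avoid leaving the output grouped by class
    return [row for row, _ in balanced], [label for _, label in balanced]


# Hypothetical toy set: 2 known targets (label 1) and 8 non-targets (label 0).
toy_rows = list(range(10))
toy_labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

balanced_rows, balanced_labels = undersample(toy_rows, toy_labels)
print(len(balanced_rows), "samples kept after balancing")
```

Undersampling discards data, which matters when the dataset is also incomplete; class weighting or oversampling are alternatives with different trade-offs.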

Keywords: Drug target, machine learning, biological experiment, protein structure, drug database, biological method.


Article Details

Year: 2019
Page: [209 - 216]
Pages: 8
DOI: 10.2174/1389200219666180925091851