A Drug Decision Support System for Developing a Successful Drug Candidate Using Machine Learning Techniques

Author(s): Aytun Onay*, Melih Onay

Journal Name: Current Computer-Aided Drug Design

Volume 16, Issue 4, 2020

Background: Virtual screening of candidate drug molecules using machine learning techniques plays a key role in the pharmaceutical industry in the design and discovery of new drugs. Computational classification methods can determine drug types according to disease groups and distinguish approved drugs from withdrawn ones.

Introduction: The classification models developed in this study can be used as a simple filter in drug modelling to eliminate potentially inappropriate molecules at an early stage. In this work, we developed a Drug Decision Support System (DDSS) to classify each drug candidate molecule as a potential drug or non-drug and to predict its disease group.

Methods: Molecular descriptors were identified in order to determine a number of rules for drug molecules. They were derived using the ADRIANA.Code program and Lipinski's rule of five. We used an Artificial Neural Network (ANN) to classify drug molecules according to disease type. Closed frequent molecular structures in the form of subgraph fragments were also obtained with the Gaston algorithm included in the ParMol package, in order to find common molecular fragments among withdrawn drugs.
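As an illustration of how Lipinski's rule of five can act as a simple early filter, the following is a minimal sketch, not the authors' implementation. It assumes the four descriptor values have already been computed by some external tool. The names `Descriptors`, `lipinski_violations` and `passes_rule_of_five` are hypothetical, and the example values are illustrative rather than taken from the study. The thresholds follow the commonly stated form of the rule: molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, and at most 10 hydrogen-bond acceptors, with at most one violation tolerated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed descriptor values for one molecule (hypothetical container)."""
    mol_weight: float   # molecular weight in daltons
    log_p: float        # octanol-water partition coefficient (e.g. an XlogP estimate)
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Return True if the molecule has no more than max_violations violations."""
    return lipinski_violations(d) <= max_violations


# Illustrative inputs only: a small, polar molecule and a large, lipophilic one.
small_molecule = Descriptors(mol_weight=180.2, log_p=1.2, h_donors=1, h_acceptors=4)
large_molecule = Descriptors(mol_weight=720.0, log_p=6.3, h_donors=6, h_acceptors=12)

for name, d in [("small", small_molecule), ("large", large_molecule)]:
    print(name, lipinski_violations(d), passes_rule_of_five(d))
```

A filter of this kind only screens out unlikely candidates. It does not replace a trained classifier such as the ANN described above, which combines many more descriptors.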

Results: We observed that TPSA, XlogP, Natoms, and HDon_O are the most distinctive features in the pool of molecular descriptors. We evaluated the performance of the classifiers on all datasets and found that classification accuracy was high on every dataset. Neural network models achieved accuracies of 84.6% and 83.3% on test sets comprising cardiac therapy, anti-epileptic, and anti-Parkinson drugs, together with approved and withdrawn drugs, for the drug classification problems.

Conclusion: The experimental evaluation shows that the system is promising for identifying potential drug molecules and for classifying them correctly according to disease type.

Keywords: Drug design, molecular descriptors, artificial neural network, ADRIANA.Code, data mining, frequent subgraph mining.



Article Details

Year: 2020
Published on: 02 September, 2020
Page: [407 - 419]
Pages: 13
DOI: 10.2174/1573409915666190716143601
