Prediction of Drug Combinations with a Network Embedding Method

Author(s): Tianyun Wang, Lei Chen*, Xian Zhao.

Journal Name: Combinatorial Chemistry & High Throughput Screening

Volume 21, Issue 10, 2018


Abstract:

Aim and Objective: Several diseases have complicated mechanisms. A single drug cannot treat such diseases very well, because they involve several targets and single-target drugs cannot modulate these targets simultaneously. Drug combinations are an effective way to treat such diseases. However, determining effective drug combinations with traditional methods is time-consuming and costly, so quick and cheap methods are urgently needed. Designing effective computational methods that incorporate advanced computational techniques to predict drug combinations is a feasible alternative.

Method: In this study, we proposed a novel network embedding method that extracts topological features of each drug combination from a drug network constructed using chemical-chemical interaction information retrieved from STITCH. These topological features were combined with the individual features of drug combinations reported in a previous study. Several advanced computational methods were employed to construct an effective prediction model: the synthetic minority oversampling technique (SMOTE) was used to tackle the imbalanced dataset, and the minimum redundancy maximum relevance (mRMR) and incremental feature selection (IFS) methods were adopted to analyze the features and extract the optimal ones for building an optimal support vector machine (SVM) classifier.
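As an illustration of the oversampling idea behind SMOTE, the following minimal sketch creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy feature vectors are assumptions made for this example, and only the Python standard library is used.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples built from minority-class vectors.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than samples
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance),
        # excluding the base sample itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        # Interpolate at a random position on the segment base -> other.
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class: three 2-D feature vectors (made-up values).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote(minority_samples, n_new=10, k=2)
print(len(new_samples))
```

Each synthetic point lies on a line segment between two existing minority samples, so the minority class is enlarged without simply duplicating samples. In a cross-validated evaluation, oversampling of this kind is normally applied to the training portion only.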

Results and Conclusion: The constructed optimal SVM classifier yielded a Matthews correlation coefficient (MCC) of 0.806, which is superior to that of the classifiers using only individual features, with or without SMOTE. Combining the topological features with the essential features of a drug combination can improve the performance of the classifier.
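For readers unfamiliar with the MCC, the sketch below computes it from the four confusion-matrix counts of a binary classifier using the standard formula. The function name `mcc` and the example counts are assumptions chosen for illustration; they are not results from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Made-up counts for illustration only.
example = mcc(tp=90, tn=80, fp=20, fn=10)
print(round(example, 3))
```

The MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction). Because it uses all four counts, it is often preferred over accuracy for imbalanced datasets such as the one described here.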

Keywords: Drug combination, network embedding method, minimum redundancy maximum relevance, synthetic minority oversampling technique, support vector machine.





Article Details

VOLUME: 21
ISSUE: 10
Year: 2018
Page: [789 - 797]
Pages: 9
DOI: 10.2174/1386207322666181226170140
