
Current Pharmaceutical Design


ISSN (Print): 1381-6128
ISSN (Online): 1873-4286

Research Article

Prediction of K562 Cells Functional Inhibitors Based on Machine Learning Approaches

Author(s): Yuan Zhang, Zhenyan Han, Qian Gao, Xiaoyi Bai, Chi Zhang* and Hongying Hou*

Volume 25, Issue 40, 2019

Page: [4296 - 4302] Pages: 7

DOI: 10.2174/1381612825666191107092214



Background: β-thalassemia is a common monogenic disease that seriously harms human health. The disease arises from deletions of or defects in the β-globin gene, which reduce synthesis of the β-globin chain and leave a relative excess of α-chains. These excess chains form inclusion bodies that deposit on the cell membrane, reducing the ability of red blood cells to deform and leading to their massive destruction in the spleen. The result is a group of hereditary haemolytic diseases.

Methods: In this work, machine learning algorithms were employed to build a prediction model for inhibitors against K562 based on 117 inhibitors and 190 non-inhibitors.

Results: Using AdaBoost, the overall accuracy (ACC) was 83.1% in a 10-fold cross-validation test and 78.0% in an independent set test, surpassing Bayes Net, Random Forest, Random Tree, C4.5, SVM, KNN and Bagging.
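
To make the evaluation protocol concrete, the following is a minimal sketch of how an overall accuracy from a stratified 10-fold cross-validation test can be computed. It is not the authors' pipeline: the function names (`stratified_folds`, `cross_val_accuracy`, `majority_baseline`) are hypothetical, the majority-class predictor is only a stand-in for AdaBoost, and it assumes for illustration that all 117 inhibitors and 190 non-inhibitors enter cross-validation, which the abstract does not state.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class ratios roughly equal."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def accuracy(y_true, y_pred):
    """Overall accuracy (ACC): fraction of samples predicted correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def cross_val_accuracy(labels, fit_predict, k=10, seed=0):
    """Pool the held-out predictions of all k folds and return the ACC."""
    folds = stratified_folds(labels, k, seed)
    pred = [None] * len(labels)
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        # fit_predict trains on train_idx and returns one label per test index.
        for i, p in zip(test_idx, fit_predict(train_idx, test_idx)):
            pred[i] = p
    return accuracy(labels, pred)

# Class sizes from the abstract: 1 = inhibitor, 0 = non-inhibitor.
# ASSUMPTION: all 307 compounds are used in cross-validation.
labels = [1] * 117 + [0] * 190

def majority_baseline(train_idx, test_idx):
    """Placeholder classifier (NOT AdaBoost): predict the majority training class."""
    train = [labels[i] for i in train_idx]
    major = max(set(train), key=train.count)
    return [major] * len(test_idx)

folds = stratified_folds(labels)
acc = cross_val_accuracy(labels, majority_baseline)
print(f"baseline ACC = {acc:.3f}")
```

Under these assumptions the majority-class baseline always predicts "non-inhibitor" and scores 190/307, about 61.9%. A real classifier would replace `majority_baseline` with a function that fits a model on descriptor vectors for `train_idx` and predicts labels for `test_idx`.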

Conclusion: This study indicates that AdaBoost can be used to build a learning model for predicting inhibitors against K562 cells.

Keywords: Machine learning, cross-validation test, independent set test, AdaBoost, feature selection, K562 cells.


© 2022 Bentham Science Publishers