Combinatorial Chemistry & High Throughput Screening

ISSN (Print): 1386-2073
ISSN (Online): 1875-5402

Using Machine Learning Methods to Predict Experimental High Throughput Screening Data

Author(s): Cherif Mballo and Vladimir Makarenkov

Volume 13, Issue 5, 2010

Pages: 430-441 (12 pages)

DOI: 10.2174/138620710791292958

Abstract

High throughput screening (HTS) remains a very costly process despite many recent technological advances in biotechnology. In this study, we consider the application of machine learning methods to predicting experimental HTS measurements. Such a virtual HTS analysis can be based on the results of real HTS campaigns carried out with similar compound libraries and similar drug targets. To this end, we analyzed the Test assay from the McMaster University Data Mining and Docking Competition [1] using binary decision trees, neural networks, support vector machines (SVM), linear discriminant analysis, k-nearest neighbors, and partial least squares. First, we studied the sets of molecular and atomic descriptors separately to establish which of them provides better prediction. Then, we compared the six machine learning methods in terms of false positives, false negatives, sensitivity, and enrichment factor. Finally, a variable selection procedure that improves the sensitivity of the methods was implemented and applied in the framework of polynomial SVM.
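
The comparison criteria named in the abstract (false positives, false negatives, sensitivity, and enrichment factor) can be illustrated with a minimal sketch. This is not the authors' code: the function name `screening_metrics` and the toy labels are hypothetical, and the enrichment factor is assumed to follow the common definition (hit rate among compounds predicted as hits divided by the hit rate in the whole library), which may differ from the exact formula used in the paper.

```python
import math


def screening_metrics(y_true, y_pred):
    """Derive hit-prediction metrics from binary labels (1 = hit, 0 = non-hit).

    Hypothetical helper for illustration only. The enrichment factor uses
    an assumed, commonly used definition:
        EF = (TP / (TP + FP)) / ((TP + FN) / N)
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hits found
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections

    n = len(pairs)
    actual_hits = tp + fn
    predicted_hits = tp + fp

    # Sensitivity: fraction of true hits that the method recovers.
    sensitivity = tp / actual_hits if actual_hits else math.nan

    # Enrichment factor: undefined if nothing is selected or no hits exist.
    if actual_hits and predicted_hits:
        enrichment = (tp / predicted_hits) / (actual_hits / n)
    else:
        enrichment = math.nan

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "sensitivity": sensitivity,
        "enrichment_factor": enrichment,
    }


# Toy data (made up): 10 compounds, 2 true hits, 2 predicted hits.
toy_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_metrics = screening_metrics(toy_true, toy_pred)
print(toy_metrics)
```

Because HTS hits are typically a small fraction of the library, sensitivity and enrichment factor are more informative here than overall accuracy, which a method can inflate by predicting every compound as a non-hit.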

Keywords: CART, decision trees, drug target, hit, k-nearest neighbors (kNN), linear discriminant analysis (LDA), neural networks (NN), partial least squares (PLS), ROC curve, sampling, support vector machines (SVM), virtual high throughput screening


© 2024 Bentham Science Publishers