Virtual High Throughput Screening Using Combined Random Forest and Flexible Docking

Author(s): Dariusz Plewczynski, Marcin von Grotthuss, Leszek Rychlewski, Krzysztof Ginalski.

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 12, Issue 5, 2009


We present the random forest supervised machine learning algorithm applied to flexible docking results from five typical virtual high throughput screening (HTS) studies. Our approach aims to: i) reduce the number of compounds to be tested experimentally against a given protein target, and ii) extend the results of flexible docking experiments performed on only a subset of a chemical library in order to select promising inhibitors from the whole dataset. The random forest (RF) method is applied and tested here on compounds from the MDL Drug Data Report (MDDR). The recall values for the five selected diverse protein targets are over 90%, and the performance reaches 100%. Combined with flexible docking, this machine learning method is capable of finding 60% of the active compounds for most protein targets by docking only 10% of the screened ligands. Our in silico approach can therefore scan very large databases rapidly to predict the biological activity of small-molecule inhibitors, and it provides an effective alternative to more computationally demanding methods in virtual HTS.
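
The abstract reports its results as recall, that is, the share of known active compounds that a selection recovers. As a minimal sketch only, and not the authors' code or evaluation protocol, the snippet below shows one way such a figure could be computed from any per-compound score (for example, a classifier's predicted probability of activity). The function name `recall_at_fraction`, the toy scores, and the toy activity labels are all hypothetical and invented for illustration.

```python
from typing import Sequence


def recall_at_fraction(
    scores: Sequence[float],
    is_active: Sequence[bool],
    fraction: float,
) -> float:
    """Share of all actives found among the top-scoring `fraction` of compounds.

    Hypothetical helper for illustration; higher score = more likely active.
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("no active compounds in the library")

    # Number of compounds selected for follow-up (at least one).
    n_selected = max(1, round(fraction * len(scores)))

    # Rank compound indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Count the actives that fall inside the selected top slice.
    found = sum(1 for i in ranked[:n_selected] if is_active[i])
    return found / total_actives


# Toy library of 20 compounds, 5 of them active (made-up numbers).
toy_scores = [0.95, 0.90, 0.10, 0.20, 0.85, 0.30, 0.15, 0.80, 0.05, 0.25,
              0.40, 0.35, 0.12, 0.18, 0.22, 0.28, 0.32, 0.38, 0.08, 0.02]
toy_active = [True, True, False, False, False, False, False, True, False, False,
              True, False, False, False, False, False, False, False, False, True]

if __name__ == "__main__":
    for frac in (0.1, 0.2, 1.0):
        print(frac, recall_at_fraction(toy_scores, toy_active, frac))
```

In this toy setting, selecting the top 10% of compounds recovers two of the five actives, and selecting the whole library trivially recovers all of them. The values say nothing about the performance reported in the article.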

Keywords: Virtual high throughput screening, compound identification, protein target specificity, MDL drug data report, machine-learning methods, atom pairs, random forest


Article Details

Year: 2009
Page: [484 - 489]
Pages: 6
DOI: 10.2174/138620709788489000
