Background: Virtual Screening (VS) has emerged as an important tool in the drug development process,
as it conducts efficient in silico searches over millions of compounds, ultimately increasing yields of potential
drug leads. As a subset of Artificial Intelligence (AI), Machine Learning (ML) is a powerful way of conducting
VS for drug leads. ML for VS generally involves assembling a filtered training set of compounds, composed
of known actives and inactives. After training, the model is validated and, if sufficiently accurate, used on previously
unseen databases to screen for novel compounds with the desired drug target binding activity.
Objective: The study aims to review ML-based methods used for VS and applications to Alzheimer’s Disease
(AD) drug discovery.
Methods: To update the current knowledge on ML for VS, we present thorough background, explanations, and
VS applications of the following ML techniques: Naïve Bayes (NB), k-Nearest Neighbors (kNN), Support Vector
Machines (SVM), Random Forests (RF), and Artificial Neural Networks (ANN).
Results: All techniques have found success in VS, but the future of VS is likely to lean more heavily toward the
use of neural networks, and more specifically Convolutional Neural Networks (CNN), which are a subset of
ANN that utilize convolution. We additionally conceptualize a workflow for conducting ML-based VS for potential
therapeutics for AD, a complex neurodegenerative disease with no known cure or prevention. This
serves both as an example of how to apply the concepts introduced earlier in the review and as a potential workflow
for future implementation.
Conclusion: Different ML techniques are powerful tools for VS, each with its own advantages and disadvantages.
ML-based VS can be applied to AD drug development.
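
Illustration: The train, validate, and screen sequence outlined in the Background can be sketched in a few lines of Python. The sketch below is not the workflow proposed in the review; it is a minimal example that assumes compounds are encoded as binary fingerprints and uses a small hand-written Bernoulli Naïve Bayes classifier, NB being one of the techniques reviewed. The 4-bit fingerprints, the compound names, and the 0.8 accuracy threshold are all hypothetical toy values chosen only to make the example runnable.

```python
import math

def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model on binary fingerprints (label 1 = active)."""
    n_bits = len(fingerprints[0])
    model = {}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Laplace-smoothed probability that each bit is set within this class
        p_bit = [(sum(fp[j] for fp in rows) + alpha) / (len(rows) + 2 * alpha)
                 for j in range(n_bits)]
        model[cls] = (math.log(len(rows) / len(labels)), p_bit)
    return model

def log_odds_active(model, fp):
    """Log-odds that a compound is active; a positive value favours 'active'."""
    scores = {}
    for cls, (log_prior, p_bit) in model.items():
        scores[cls] = log_prior + sum(
            math.log(p) if bit else math.log(1.0 - p)
            for bit, p in zip(fp, p_bit))
    return scores[1] - scores[0]

def accuracy(model, fingerprints, labels):
    """Fraction of compounds whose predicted class matches the known label."""
    hits = sum((log_odds_active(model, fp) > 0) == bool(y)
               for fp, y in zip(fingerprints, labels))
    return hits / len(labels)

def screen(model, library, top_k=2):
    """Rank an unseen library by predicted activity and return the top names."""
    ranked = sorted(library, key=lambda name: log_odds_active(model, library[name]),
                    reverse=True)
    return ranked[:top_k]

# Hypothetical toy data: 4-bit fingerprints, known actives (1) and inactives (0).
train_fps = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0],
             [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 1, 0]]
train_labels = [1, 1, 1, 0, 0, 0]
valid_fps = [[1, 1, 1, 0], [0, 0, 0, 1]]
valid_labels = [1, 0]
library = {"cmpd_A": [1, 1, 0, 0], "cmpd_B": [0, 0, 1, 1], "cmpd_C": [1, 0, 0, 1]}

model = train_bernoulli_nb(train_fps, train_labels)
val_acc = accuracy(model, valid_fps, valid_labels)
# Screen the unseen library only if the validated model is accurate enough
# (the 0.8 cut-off is an arbitrary illustrative threshold).
hits = screen(model, library, top_k=2) if val_acc >= 0.8 else []
print(val_acc, hits)
```

In practice the fingerprints would come from a cheminformatics toolkit and the classifier from an established ML library; the hand-written model here only keeps the example dependency-free.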