How Wrong Can We Get? A Review of Machine Learning Approaches and Error Bars

Author(s): Anton Schwaighofer, Timon Schroeter, Sebastian Mika, Gilles Blanchard.

Journal Name: Combinatorial Chemistry & High Throughput Screening
Accelerated Technologies for Biotechnology, Bioassays, Medicinal Chemistry and Natural Products Research

Volume 12, Issue 5, 2009

Many machine learning methods can potentially be used for ligand-based virtual screening. In this contribution, we focus on three nonlinear methods: support vector regression, Gaussian process models, and decision trees. For each method, we provide a short, intuitive introduction and discuss how confidence estimates (error bars) can be obtained. We then cover important aspects of model building and evaluation, including model selection, performance criteria, and how the quality of error bar estimates can be verified. We also point to available implementations and discuss important issues for practical application.
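
One common way to verify the quality of error bar estimates is an empirical coverage check: if a model reports Gaussian error bars, roughly 95% of held-out true values should fall inside its 95% intervals. The sketch below is only an illustration of that idea, not the procedure from the article. The function name `interval_coverage` and the synthetic data are assumptions made for the example.

```python
import random
from statistics import NormalDist


def interval_coverage(y_true, y_pred, y_std, confidence=0.95):
    """Fraction of true values inside the symmetric Gaussian interval
    centred on each prediction (hypothetical helper, not from the article)."""
    # Half-width of the interval in units of the predicted standard deviation.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    hits = sum(abs(t - p) <= z * s for t, p, s in zip(y_true, y_pred, y_std))
    return hits / len(y_true)


# Synthetic stand-in for held-out predictions with per-point error bars.
rng = random.Random(0)
n = 5000
y_pred = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y_std = [rng.uniform(0.1, 0.5) for _ in range(n)]
# True values are drawn so that the stated error bars are correct by design.
y_true = [p + rng.gauss(0.0, s) for p, s in zip(y_pred, y_std)]

# Well-calibrated error bars: coverage should be close to the nominal 0.95.
calibrated = interval_coverage(y_true, y_pred, y_std)

# Overconfident error bars (half the true width): coverage falls well below 0.95.
overconfident = interval_coverage(y_true, y_pred, [0.5 * s for s in y_std])

print(f"calibrated coverage:    {calibrated:.3f}")
print(f"overconfident coverage: {overconfident:.3f}")
```

Coverage well below the nominal level suggests error bars that are too narrow, and coverage well above it suggests error bars that are too wide. In practice the three lists would come from a held-out test set rather than from simulated data.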

Keywords: Machine learning, error bars, model building, parameter estimation, decision tree, support vector machine, Gaussian process

Article Details

Year: 2009
Page: [453 - 468]
Pages: 16
DOI: 10.2174/138620709788489064
