Background: Drug-induced Liver Injury (DILI) is a leading cause of drug failure, accounting for nearly 20% of drug withdrawal. Thus, there has been a great demand for in silico DILI prediction models for successful drug discovery. To date, various models have been developed for DILI prediction; however, building an accurate model for practical use in drug discovery remains challenging.
Methods: We constructed an ensemble model composed of three high-performance DILI prediction models to utilize the unique advantage of each machine learning algorithm.
Results: The ensemble model exhibited high predictive performance, with an area under the curve of 0.88, sensitivity of 0.83, specificity of 0.77, F1-score of 0.82, and accuracy of 0.80. When a test dataset collected from the literature was used to compare the performance of our model with publicly available DILI prediction models, our model achieved an accuracy of 0.77, sensitivity of 0.82, specificity of 0.72, and F1-score of 0.79, which were higher than those of the other DILI prediction models. As many published DILI prediction models are not available for public access, which hinders in silico drug discovery, we made our DILI prediction model publicly accessible (http://ssbio.cau.ac.kr/software/dili/).
Conclusion: We expect that our ensemble model may facilitate advancements in drug discovery by providing a highly predictive model and reducing the drug withdrawal rate.
[http://dx.doi.org/10.1093/toxsci/kft189 ] [PMID: 23997115]