In silico Prediction of Inhibitory Constant of Thrombin Inhibitors Using Machine Learning

Author(s): Junnan Zhao, Lu Zhu, Weineng Zhou, Lingfeng Yin, Yuchen Wang, Yuanrong Fan, Yadong Chen*, Haichun Liu*.

Journal Name: Combinatorial Chemistry & High Throughput Screening

Volume 21, Issue 9, 2018


Abstract:

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade and is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors.

Method: This study used machine learning methods to predict the Ki values of thrombin inhibitors from a large data set. Because machine learning can find non-intuitive regularities in high-dimensional data sets, it can be used to build effective predictive models. A total of 6554 descriptors were collected for each compound, and an efficient descriptor selection method was used to identify the appropriate descriptors. Four methods, namely multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT) and Support Vector Machine (SVM), were then used to build prediction models from the selected descriptors.
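The abstract does not say which descriptor selection method was used, so the following is only a minimal sketch of the general idea: filter descriptors by their correlation with the target, then fit one of the named model families (KNN) on the surviving columns. The correlation filter, the `min_abs_r` cutoff, the function names and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def select_descriptors(X, y, min_abs_r=0.3):
    """Keep column indices whose |r| with the target reaches min_abs_r.

    The cutoff is a hypothetical value, not one reported by the authors.
    """
    n_cols = len(X[0])
    return [j for j in range(n_cols)
            if abs(pearson([row[j] for row in X], y)) >= min_abs_r]

def knn_predict(X_train, y_train, x, k=3):
    """Predict by averaging the targets of the k nearest training compounds."""
    dists = sorted((math.dist(row, x), target)
                   for row, target in zip(X_train, y_train))
    nearest = dists[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Synthetic stand-in data (not the thrombin data set):
# column 0 drives the target, column 1 is noise, column 2 is constant.
rng = random.Random(0)
X = [[rng.uniform(0, 10), rng.uniform(0, 10), 1.0] for _ in range(60)]
y = [0.8 * row[0] + rng.gauss(0, 0.3) for row in X]

kept = select_descriptors(X, y)
X_sel = [[row[j] for j in kept] for row in X]
print("kept descriptor columns:", kept)
print("KNN prediction for the held-out first compound:",
      knn_predict(X_sel[1:], y[1:], X_sel[0]))
```

With thousands of descriptors per compound, a filter of this kind mainly serves to discard constant or uninformative columns before model fitting; the other three model families would be trained on the same reduced matrix.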

Results: The SVM model performed best among these methods, with R²=0.84 and MSE=0.55 for the training set and R²=0.83 and MSE=0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
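To illustrate the two reported metrics and the idea behind a y-randomization test, here is a minimal sketch. It assumes a one-descriptor least-squares line as a stand-in for the SVM model, and the data are synthetic; the function names, the number of shuffling rounds and the seeds are illustrative choices, not details from the paper. The test refits the model on shuffled target values: if the shuffled fits score far below the real fit, the real score is unlikely to be a chance correlation.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for one descriptor; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def y_randomization(xs, ys, n_rounds=100, seed=0):
    """Refit on shuffled targets and collect the resulting R² values."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]
        rng.shuffle(shuffled)
        slope, intercept = fit_line(xs, shuffled)
        preds = [slope * x + intercept for x in xs]
        scores.append(r2(shuffled, preds))
    return scores

# Synthetic stand-in data (not the thrombin data set).
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(80)]
ys = [0.5 * x + 4 + rng.gauss(0, 0.5) for x in xs]

slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
real_r2 = r2(ys, preds)
shuffled_scores = y_randomization(xs, ys)

print("real R2:", round(real_r2, 3), "MSE:", round(mse(ys, preds), 3))
print("best R2 on shuffled targets:", round(max(shuffled_scores), 3))
```

Applicability domain evaluation is a separate check (whether a new compound resembles the training compounds closely enough for the prediction to be trusted) and is not covered by this sketch.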

Keywords: Thrombin inhibitors, inhibitory constant, machine learning, model evaluation, descriptor selection, regression model.



Article Details

VOLUME: 21
ISSUE: 9
Year: 2018
Page: [662 - 669]
Pages: 8
DOI: 10.2174/1386207322666181220130232
