Background: Soluble epoxide hydrolase (sEH) is an enzyme expressed ubiquitously
across tissues. Inhibition of sEH has shown promising results in treating
hypertension and alleviating pain and inflammation.
Objective: This study employs machine learning to develop a predictive
QSAR model for a large set of sEH inhibitors.
Methods: The random forest method was used to build a valid model for the
prediction of sEH inhibition. In addition, two new methods (the Treeinterpreter Python package and
LIME, Local Interpretable Model-agnostic Explanations) were used to explain and
interpret the model.
Results: The performance metrics of the model were as follows: R²=0.831, Q²=0.565,
RMSE=0.552 and R²pred=0.595. The model also demonstrated good predictability on the two extra
external test sets, at least in terms of ranking. The Spearman’s rank correlation coefficients for
external test sets 1 and 2 were 0.872 and 0.673, respectively. External test set 2 was diverse
relative to the training set. Therefore, the model could be used in virtual screening to enrich
potential sEH inhibitors from a diverse compound library.
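The distinction between agreement in value and agreement in ranking can be illustrated with a minimal sketch. It computes the conventional coefficient of determination, the RMSE and Spearman's rank correlation from observed and predicted activities using only the Python standard library. The function names and the four activity values are hypothetical and chosen for illustration; they are not data from this study, and the exact definitions of Q² and R²pred used by the authors are not stated here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical activities (e.g. pIC50), for illustration only.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.8, 7.4, 7.2]

print("R2       =", round(r_squared(observed, predicted), 3))
print("RMSE     =", round(rmse(observed, predicted), 3))
print("Spearman =", round(spearman(observed, predicted), 3))
```

Because Spearman's coefficient depends only on the ordering of the predictions, it can stay high when the predicted values are offset from the observed ones. Ordering is the property that matters when a model is used to prioritise compounds in a virtual screen.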
Conclusion: Because the model was developed solely from a set of simple fragmental descriptors,
it was explained by two local interpretation algorithms, which could guide medicinal
chemists in designing new sEH inhibitors. Moreover, the most important general descriptors
(fragments) suggested by the model were consistent with the available crystallographic data. The
model is available as an executable binary at http://www.pharm-sbg.com and