Objective: Drug-induced liver injury (DILI) is a major cause of drug withdrawal. The
chemical properties of a drug, and especially of its metabolites, play key roles in DILI. Our goal is to
construct a QSAR model to predict drug hepatotoxicity based on drug metabolites.
Materials and Methods: A total of 64 hepatotoxic drug metabolites and 3,339 non-hepatotoxic drug
metabolites were gathered from the MDL Metabolite Database. To address the imbalance of the
dataset, we randomly split the negative samples into portions and combined each portion with all the
positive samples, yielding balanced datasets on which independent classifiers were trained. We then
adopted an ensemble approach that makes predictions from the results of all individual classifiers,
and applied the minimum Redundancy Maximum Relevance (mRMR) feature selection method to
select the molecular descriptors. Finally, for the drugs in the external test set, a Bayesian
inference method was used to predict the hepatotoxicity of a drug based on its metabolites.
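
The balancing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the number of portions (chosen here so that each portion is roughly the size of the positive set), the round-robin split, the function names `balanced_subsets` and `majority_vote`, and the tie-breaking rule are all assumptions made for illustration. The abstract does not state how many portions were used or how the individual classifiers were combined.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair every portion of the shuffled negatives with all positives.

    Assumption: the number of portions is chosen so that each portion is
    roughly as large as the positive set.
    """
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    n_parts = max(1, len(shuffled) // len(positives))
    # Round-robin slicing keeps portion sizes within one sample of each other.
    portions = [shuffled[i::n_parts] for i in range(n_parts)]
    return [(list(positives), part) for part in portions]


def majority_vote(predictions):
    """Combine 0/1 votes from the individual classifiers.

    Assumption: simple majority, with ties counted as hepatotoxic (1).
    """
    return int(2 * sum(predictions) >= len(predictions))


# Placeholder integer IDs stand in for the 64 positive and 3,339 negative metabolites.
subsets = balanced_subsets(range(64), range(64, 64 + 3339))
sizes = [len(neg) for _, neg in subsets]
print(len(subsets), min(sizes), max(sizes))  # 52 64 65
```

Under these assumptions, each of the 52 balanced datasets would train one classifier, and `majority_vote` would merge their outputs for a given metabolite.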
Results: The model achieved an average balanced accuracy of 78.47%, a sensitivity of 74.17%, and a
specificity of 82.77%. Five molecular descriptors characterizing molecular polarity, intramolecular
bonding strength, and molecular frontier orbital energy were obtained. When predicting the
hepatotoxicity of a drug based on all its metabolites, the sensitivity, specificity and balanced
accuracy were 60.38%, 70.00% and 65.19%, respectively, indicating that this method is useful for
identifying the hepatotoxicity of drugs.
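
As a quick consistency check on the figures above, balanced accuracy is the arithmetic mean of sensitivity and specificity. The short sketch below recomputes both reported values; the function name `balanced_accuracy` is ours and is used only for illustration.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy: the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Values in percent, taken from the Results above.
metabolite_level = balanced_accuracy(74.17, 82.77)
drug_level = balanced_accuracy(60.38, 70.00)
print(f"{metabolite_level:.2f} {drug_level:.2f}")  # 78.47 65.19
```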
Conclusions: We developed an in silico model to predict the hepatotoxicity of drug metabolites.
Moreover, Bayesian inference was applied to predict the hepatotoxicity of a drug based on its
metabolites, which achieved a sensitivity of 60.38% and a specificity of 70.00%.