Background: Chemical toxicity is a major cause of late-stage failure in drug R&D.
Identifying the multiple toxicities of compounds with traditional experiments is time-consuming and
expensive, so it is attractive to build an accurate prediction model for the toxicity profile of
compounds.
Materials and Methods: In this study, we investigated six types of toxicity: (I) Acute
Toxicity; (II) Mutagenicity; (III) Tumorigenicity; (IV) Skin and Eye Irritation; (V) Reproductive
Effects; and (VI) Multiple Dose Effects, using the local lazy learning (LLL) method for multi-label learning.
A total of 17,120 compounds were split into a training set and a test set at a ratio of 4:1 using the
Kennard-Stone algorithm. Four types of properties (the molecular fingerprints ECFP_4 and
FCFP_4, descriptors, and chemical-chemical interactions) were adopted for model building.
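
The Kennard-Stone split mentioned above can be illustrated with a minimal sketch. It assumes Euclidean distance on small numeric feature vectors; the function name `kennard_stone_split`, the toy points, and the distance choice are illustrative assumptions, not the study's actual implementation or feature space. The rule is: seed the training set with the two most distant samples, then repeatedly add the sample whose distance to its nearest already-selected sample is largest.

```python
import math


def kennard_stone_split(points, n_train):
    """Return (train_idx, test_idx) chosen by the Kennard-Stone max-min rule.

    `points` is a sequence of equal-length numeric vectors (an assumption;
    real fingerprints would need a suitable distance measure).
    """
    n = len(points)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(points)")

    # Full pairwise Euclidean distance matrix (fine for a small illustration).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Seed the training set with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist[ij[0]][ij[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # min_d[k] = distance from candidate k to its nearest selected sample.
    min_d = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}

    while len(selected) < n_train:
        # Pick the candidate that is farthest from the selected set.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_d[nxt]
        for k in remaining:
            min_d[k] = min(min_d[k], dist[k][nxt])

    return selected, remaining


# Toy usage: five one-dimensional samples split 4:1.
toy_points = [(0.0,), (1.0,), (2.0,), (10.0,), (5.0,)]
train_idx, test_idx = kennard_stone_split(toy_points, n_train=4)
print(train_idx, test_idx)
```

Because the selected samples are spread across the feature space, the held-out samples tend to lie inside the region covered by the training set. The full distance matrix used here is quadratic in memory, so a data set of the size reported above would need a more economical implementation.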
Results: The ‘ECFP_4+LLL’ model yielded the best performance on the test set, with balanced
accuracy (BACC) values of 0.692, 0.691, 0.666, 0.680, 0.631, and 0.599 for toxicity types (I) to (VI),
respectively. Furthermore, some essential toxicophores for the six types of toxicity were identified
using the Laplacian-modified Bayesian model.
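
For reference, balanced accuracy can be computed per toxicity label as sketched below. The sketch assumes the commonly used definition, BACC = (sensitivity + specificity) / 2, with binary labels coded as 1 (toxic) and 0 (non-toxic); the function name and the toy labels are illustrative and are not data from this study.

```python
def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy for one binary label: mean of sensitivity and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy usage for a single label (made-up values).
toy_true = [1, 1, 0, 0]
toy_pred = [1, 0, 0, 0]
toy_bacc = balanced_accuracy(toy_true, toy_pred)
print(toy_bacc)
```

In a multi-label setting the same function would be applied to each toxicity type separately, which yields one BACC value per type. Unlike plain accuracy, the score is not inflated when one class dominates a label.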
Conclusion: The accurate prediction model and the identified toxicophores can provide
guidance for designing drugs with low toxicity.