Title:Predicting Drug-Target Interactions Based on Small Positive Samples
VOLUME: 19 ISSUE: 5
Author(s):Pengwei Hu, Keith C.C. Chan* and Yanxing Hu
Affiliation:Department of Computing, Hong Kong Polytechnic University, Hung Hom, Kowloon
Keywords:Protein and compound representations, one-class classification, positive samples, drug target interaction (DTI),
drug discovery, computational methods.
Abstract:Background: A basic task in drug discovery is to find new medication in the form of candidate compounds
that act on a target protein. In other words, a drug has to interact with a target, and such drug-target interactions
(DTIs) are not expected to be random. Significant and interesting patterns are expected to be hidden in them. If
these patterns can be discovered, new drugs should be easier to find.
Objective: A number of computational methods have been proposed to predict DTIs based on
similarity. However, such an approach does not allow biochemical features to be directly considered. As a result,
some methods have been proposed to discover patterns in physicochemical interactions. Since the number
of potential negative DTIs is very high, both in absolute terms and in comparison to the number of known ones, these
methods are computationally expensive and can rely only on subsets, rather than the full set, of negative
DTIs for training and validation. As there is always a relatively high chance that negative DTIs are falsely
identified, and as only a subset of such DTIs is considered, existing approaches can be further improved to
better predict DTIs.
Method: In this paper, we present a novel approach, called ODT (one-class drug-target interaction prediction), for
this purpose. One main task of ODT is to discover association patterns between interacting drugs and proteins
from the chemical structure of the former and the protein sequence network of the latter. ODT does so in two
phases. First, the DTI network is transformed into a representation based on structural properties. Second, it applies
a one-class classification algorithm to build a prediction model based only on known positive interactions.
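The idea of training on known positive interactions only can be illustrated with a minimal sketch. It is not the ODT algorithm: the abstract does not name the one-class classifier used, so a simple centroid-and-radius model is swapped in here. The feature vectors, the function names (`fit_one_class`, `score`, `predict`) and the `quantile` parameter are all hypothetical, and the data are synthetic.

```python
import math
import random


def fit_one_class(positives, quantile=0.95):
    """Fit a centroid-and-radius one-class model on positive vectors only.

    Assumption: each vector is a hypothetical drug-target pair representation
    (e.g. drug descriptors concatenated with protein descriptors).
    """
    dim = len(positives[0])
    centroid = [sum(v[i] for v in positives) / len(positives) for i in range(dim)]
    dists = sorted(math.dist(v, centroid) for v in positives)
    # The radius is the distance that covers `quantile` of the training positives.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def score(model, pair_vector):
    """Higher score means the pair looks more like the known positives."""
    centroid, radius = model
    return radius - math.dist(pair_vector, centroid)


def predict(model, pair_vector):
    """Return True if the pair falls inside the learned positive region."""
    return score(model, pair_vector) >= 0


# Synthetic stand-in for known interacting pairs: points clustered around (1, 1, 1, 1).
random.seed(0)
known_positives = [[random.gauss(1.0, 0.1) for _ in range(4)] for _ in range(50)]

# No negative samples are used at any point during fitting.
model = fit_one_class(known_positives)

candidate_near = [1.0, 1.0, 1.0, 1.0]  # resembles the known interactions
candidate_far = [5.0, 5.0, 5.0, 5.0]   # unlike anything seen in training
print(predict(model, candidate_near), predict(model, candidate_far))
```

Because the model is fitted without any negative pairs, it avoids both the sampling of a negative subset and the risk of mislabeled negatives described in the Objective; the trade-off is that the decision boundary depends entirely on how well the positives cluster in the chosen representation.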
Results: We compared the best AUROC scores of ODT with those of several state-of-the-art approaches on gold standard
data. ODT achieved higher prediction accuracy than all the other methods on the GPCRs dataset
and the ion channels dataset.
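For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen positive pair is scored higher than a randomly chosen negative pair. The sketch below computes it directly from that definition. The function name `auroc` and the scores and labels are illustrative assumptions, not results from the paper.

```python
def auroc(scores, labels):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical prediction scores for six held-out drug-target pairs.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]  # 1 = known interaction, 0 = assumed non-interaction
example_auroc = auroc(example_scores, example_labels)
print(round(example_auroc, 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based computation would be preferable for large test sets.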
Conclusion: Performance evaluation of ODT shows that it is potentially useful. It confirms that predicting
potential or missing DTIs based on the known interactions is a promising direction to solve problems related to
the use of uncertain and unreliable negative samples and those related to the high demand for computational resources.