Background: Numerous recent studies have shown that long non-coding RNAs (lncRNAs) play a vital role in regulating various biological processes and provide a basis for understanding the causes of human diseases. Many researchers have therefore developed matrix completion approaches that infer lncRNA–disease associations and use similarity information to improve prediction performance.
Objective: Most matrix completion approaches rely solely on the first-order or second-order similarity between nodes and rarely consider higher-order similarity. We therefore developed a computational method, DHOSGR, that incorporates higher-order similarity information into the similarity network with different weights, using a decay function based on random walk with restart.
Methods: First, because information decays as distance increases during network propagation, we defined a novel decay high-order similarity that combines the similarity matrix with its high-order similarity information through a decay function, and used it to construct a similarity network. Then, we introduced the similarity network into the objective function as a graph regularization term. Finally, we used a proximal splitting algorithm to perform matrix completion and infer associations between diseases and lncRNAs.
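The first step can be illustrated with a minimal sketch. The abstract does not give the exact decay function, so the weights below are an assumption: the k-th power of the row-normalized similarity matrix is weighted by `restart * (1 - restart) ** (k - 1)`, in the spirit of random walk with restart, and the weights are rescaled to sum to 1. The names `row_normalize`, `matmul`, `decay_high_order_similarity`, `restart`, and `max_order` are hypothetical and are not taken from the DHOSGR implementation.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def matmul(a, b):
    """Plain-Python product of two list-of-lists matrices."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def decay_high_order_similarity(similarity, restart=0.5, max_order=3):
    """Combine orders 1..max_order of a similarity matrix with decaying weights.

    Assumed decay (not from the paper): w_k = restart * (1 - restart)**(k - 1),
    rescaled so that the weights sum to 1.
    """
    transition = row_normalize(similarity)
    n = len(transition)
    weights = [restart * (1 - restart) ** (k - 1) for k in range(1, max_order + 1)]
    total = sum(weights)
    weights = [w / total for w in weights]

    combined = [[0.0] * n for _ in range(n)]
    power = transition  # holds transition**k for the current order k
    for w in weights:
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * power[i][j]
        power = matmul(power, transition)
    return combined


# Toy 3-node similarity matrix: nodes 0 and 2 share no first-order similarity.
S = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.5],
     [0.0, 0.5, 1.0]]
H = decay_high_order_similarity(S, restart=0.5, max_order=3)
```

In this toy example, nodes 0 and 2 have zero first-order similarity, but `H[0][2]` becomes positive because they share neighbor 1. Higher orders contribute with smaller weights, which reflects the decay of information with propagation distance.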
Results: In our experiments, DHOSGR achieved superior performance in leave-one-out cross-validation (LOOCV) and in 100 repetitions of 5-fold cross-validation (5-fold-CV), with AUC values of 0.9459 and 0.9334 ± 0.0016, respectively, outperforming five previous models. Moreover, case studies of three diseases (leukemia, lymphoma, and squamous cell carcinoma) demonstrated that DHOSGR can reliably predict associated lncRNAs.
Conclusion: DHOSGR can serve as an efficient computational model for predicting lncRNA–disease associations.