High-throughput protein-protein interaction (PPI) datasets make it possible to exploit the interaction relationships between proteins to predict functions for proteins that are still functionally unannotated. Although the clustering-based approach has proved effective for protein function prediction in some cases, in most cases the prediction results are unsatisfactory. How to define a better similarity/distance measure between proteins, how to choose proper clustering methods, and how to select feature functions from clusters for better predictions remain open challenges for improving the clustering-based prediction approach. In addition, predicting functions at different functional layers for unannotated proteins, which would provide more meaningful information about protein functions, has rarely been investigated by existing algorithms.
Results: In this paper, we propose algorithms that address the selection of feature functions from clusters to increase the prediction quality of clustering-based prediction methods. Moreover, when they incorporate our cluster feature function selection algorithms, clustering-based methods can effectively predict protein functions at different functional layers. Evaluations on real PPI datasets demonstrate the effectiveness of the proposed algorithms.
Conclusion: The proposed cluster feature function selection algorithms reasonably reflect the intrinsic relationships among proteins. The multi-layered function prediction supported by our proposed algorithms provides more meaningful information for better understanding protein functions.
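To make the general idea of clustering-based function prediction concrete, the following is a minimal Python sketch of a generic frequency-threshold baseline. It is not the selection algorithm proposed in this paper. The function names, the `min_fraction` threshold, and the toy proteins and annotations are all hypothetical and chosen for illustration only. Given clusters of proteins and known annotations, it selects as a cluster's feature functions those carried by at least a given fraction of the annotated members, and assigns them to the unannotated members.

```python
from collections import Counter

def select_feature_functions(cluster, annotations, min_fraction=0.5):
    """Return functions carried by at least min_fraction of the
    annotated proteins in a cluster (hypothetical baseline rule)."""
    annotated = [p for p in cluster if annotations.get(p)]
    if not annotated:
        return set()
    counts = Counter(f for p in annotated for f in annotations[p])
    return {f for f, c in counts.items() if c / len(annotated) >= min_fraction}

def predict_functions(clusters, annotations, min_fraction=0.5):
    """Assign each cluster's feature functions to its unannotated members."""
    predictions = {}
    for cluster in clusters:
        features = select_feature_functions(cluster, annotations, min_fraction)
        for protein in cluster:
            if not annotations.get(protein):
                predictions[protein] = set(features)
    return predictions

# Toy data (hypothetical): two clusters, with P4 and P6 unannotated.
clusters = [{"P1", "P2", "P3", "P4"}, {"P5", "P6"}]
annotations = {
    "P1": {"transport", "binding"},
    "P2": {"transport"},
    "P3": {"binding", "kinase"},
    "P5": {"repair"},
}
predictions = predict_functions(clusters, annotations)
```

In this toy example, "kinase" is carried by only one of the three annotated proteins in the first cluster, so it falls below the threshold and is not assigned to P4. How the threshold is chosen is exactly the kind of feature function selection question the abstract identifies as a challenge.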