Protein-protein interactions (PPIs) are involved in almost all cellular processes,
and understanding their structural basis remains an important endeavor. The
identification of interface residues may shed light on many important problems, such as drug
development, elucidation of molecular pathways, generation of protein mimetics, and understanding
of disease mechanisms, as well as the development of docking methodologies for building
structural models of protein complexes. Over the past few years, advances in high-throughput
PPI identification techniques, such as yeast two-hybrid analysis and affinity purification coupled
with mass spectrometry, have enabled researchers to identify sets of interacting proteins in yeast, Drosophila,
and other organisms. Unfortunately, these experimental methods do not provide residue-level insight
into the structure of the interactions between the proteins. X-ray crystallography and nuclear
magnetic resonance (NMR) spectroscopy can determine the structural basis of an interaction, but they are time-consuming
and expensive. In response to these difficulties, a number of bioinformatics algorithms
with varying degrees of accuracy have been developed that use a wide variety of data sources to predict PPIs
and modes of binding between proteins. Machine learning techniques such as Support Vector Machines
(SVMs) and Random Forests (RFs) have recently been applied to problems such as the prediction of catalytic
residues and the prediction and analysis of structure-based PPI interfaces. Previous machine learning approaches
to PPI interface prediction used features pertaining to evolutionary amino acid sequence conservation,
phylogeny, Gene Ontology (GO) protein annotation and, in most cases, protein structure.
To date, very few computational methods are available that are based solely on protein sequences.
Keywords: Docking, NMR, Protein-protein interactions, RF, SVM, X-ray crystallography.