A Divide-and-Conquer Strategy for the Prediction of Protein Contact Map
Cosme E. Santiesteban-Toca,
Gerardo M. Casanola-Martin,
Jesus S. Aguilar-Ruiz.
The prediction of protein contact maps is a challenging problem in the determination of three-dimensional protein
structures. In this paper, we introduce Forest of Decision Trees, a methodology for the prediction of protein contact
maps based on (1) a divide-and-conquer approach to analyze the prediction problem; (2) a codification vector that combines
the information obtained from the neighborhoods of the target amino acids and the sub-sequence between them; (3) an ensemble
of classifiers that employs a hybrid of Genetic Algorithms and Decision Trees as base classifiers; and (4) a rule-based
interpretation mechanism. A comparison against the top sequence-based methods in CASP10 showed that our
predictor is very competitive and highly reliable. Its main advantage is its capability to generate a human-comprehensible,
rule-based interpretation, giving the specialist some clues toward a simpler and more interpretable
solution for protein-folding recognition and the prediction of unknown structures.
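As an illustration of item (2), the following is a minimal sketch of how a residue pair might be turned into a codification vector. The abstract does not specify the encoding, so everything here is an assumption made for illustration: the function names `window` and `encode_pair`, the window half-width of 2, the padding symbol, and the choice to summarize the sub-sequence between the two residues by its length and amino acid composition.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "-"  # hypothetical padding symbol for positions outside the sequence


def window(seq, center, half_width):
    """Return the residues around `center`, padded where the window leaves the sequence."""
    return [
        seq[k] if 0 <= k < len(seq) else PAD
        for k in range(center - half_width, center + half_width + 1)
    ]


def encode_pair(seq, i, j, half_width=2):
    """Encode the residue pair (i, j) as one flat feature vector (illustrative only).

    Layout (an assumption, not the paper's definition):
      neighborhood of i + neighborhood of j + length of the sub-sequence
      between them + that sub-sequence's amino acid composition.
    """
    if not (0 <= i < j < len(seq)):
        raise ValueError("expected 0 <= i < j < len(seq)")
    between = seq[i + 1 : j]
    counts = Counter(between)
    # Relative frequency of each residue type in the intervening sub-sequence.
    composition = [
        counts[a] / len(between) if between else 0.0 for a in AMINO_ACIDS
    ]
    return (
        window(seq, i, half_width)
        + window(seq, j, half_width)
        + [len(between)]
        + composition
    )
```

With the defaults above, `encode_pair("ACDEFGHIK", 1, 6)` yields a vector of 31 entries: two 5-residue windows, one length value, and 20 composition values. Vectors of this kind could then be passed, after encoding the categorical residues numerically, to any base classifier in an ensemble.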
Keywords: CASP10, contact maps prediction, decision trees, genetic algorithms, multiple classifier systems, protein structure