Knowledge of the topology of G protein-coupled receptors (GPCRs) can be very useful in predicting a
diverse range of properties of these proteins, such as function, three-dimensional structure, and ligand binding site.
Because only a few GPCRs have known structures, many computational efforts have been made to develop
methods for predicting their topology.
A novel method to predict the location and length of transmembrane helices in GPCRs was proposed. The method
uses a “one by one” amino acid feature extraction window, which allows it to learn the
amino acid distribution in helical segments of GPCR proteins. It is based on a hidden Markov model (HMM) with a specific
architecture that takes advantage of the Viterbi decoding algorithm and the observed frequency values for adjusting the
The prediction capability of the method was evaluated in terms of per-protein, per-segment and per-residue accuracy on two
datasets consisting of 649 (at least one GPCR from each family) and 2898 (all GPCRs) sequences extracted from the UniProt
database, and was compared with other commonly used existing methods. In all three assessments, the
prediction accuracies of the new method on the larger dataset, i.e., 2898 GPCRs, were higher than those obtained by the other
methods. The results showed that the method was able to predict the topology of GPCR proteins without any sequence
length limitation, with accuracies of 88.9% and 87.4% for the small (i.e., 649 GPCRs) and large (i.e., 2898 GPCRs)
datasets, respectively. (Availability status: The source code is available upon request from the authors)
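
To make the role of Viterbi decoding more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes a deliberately simplified two-state HMM (H for a helical residue, L for a non-helical residue) rather than the specific architecture used in the method, and all probabilities below (TOY_START, TOY_TRANS, TOY_EMIT) are made-up illustrative values, not frequencies observed in any GPCR dataset. The helper names viterbi, helix_segments and per_residue_accuracy are likewise hypothetical.

```python
import math

STATES = ("H", "L")  # H = transmembrane helix residue, L = non-helical residue

# Toy parameters (assumptions for illustration only, not taken from the paper).
TOY_START = {"H": 0.1, "L": 0.9}
TOY_TRANS = {"H": {"H": 0.9, "L": 0.1}, "L": {"H": 0.1, "L": 0.9}}
TOY_EMIT = {
    "H": {"L": 0.3, "I": 0.3, "V": 0.3, "K": 0.05, "D": 0.05},
    "L": {"K": 0.4, "D": 0.4, "L": 0.07, "I": 0.07, "V": 0.06},
}


def viterbi(seq, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable state path for seq as a string of H/L labels."""
    if not seq:
        return ""

    def lg(p):
        # Log-probability with a small floor so unseen residues do not give log(0).
        return math.log(max(p, floor))

    # scores[s] = best log-probability of any path that ends in state s.
    scores = {s: lg(start_p[s]) + lg(emit_p[s].get(seq[0], 0.0)) for s in STATES}
    backpointers = []
    for aa in seq[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + lg(trans_p[p][s]))
            pointers[s] = best_prev
            new_scores[s] = (scores[best_prev] + lg(trans_p[best_prev][s])
                             + lg(emit_p[s].get(aa, 0.0)))
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full path.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


def helix_segments(path):
    """Return (start, end) pairs (0-based, inclusive) for each run of H labels."""
    segments, start = [], None
    for i, s in enumerate(path):
        if s == "H" and start is None:
            start = i
        elif s != "H" and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(path) - 1))
    return segments


def per_residue_accuracy(predicted, observed):
    """Fraction of positions where predicted and observed labels agree."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    toy_sequence = "KKDLLIVLIVKDK"  # hypothetical fragment, not a real GPCR
    toy_path = viterbi(toy_sequence, TOY_START, TOY_TRANS, TOY_EMIT)
    print(toy_path, helix_segments(toy_path))
```

In this sketch, the location and length of each predicted helix follow directly from the decoded path via helix_segments, and per_residue_accuracy corresponds to the simplest of the three assessment levels mentioned above. Per-segment and per-protein scoring would additionally need an overlap criterion, which is not specified here.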