iRSpotH-TNCPseAAC: Identifying Recombination Spots in Human by Using Pseudo Trinucleotide Composition With an Ensemble of Support Vector Machine Classifiers

Author(s): Zhao-Chun Xu, Wang-Ren Qiu*, Xuan Xiao*.

Journal Name: Letters in Organic Chemistry

Volume 14 , Issue 9 , 2017

Background: Meiotic recombination is crucial to the formation of human gametes. As a defining event in the formation of sperm and eggs, it also plays an important role in generating genetic diversity. Recombination does not occur at random across the genome: it occurs with higher probability in some genomic regions, the so-called “hotspots”, and with lower probability in others, the so-called “coldspots”. Research has shown that recombination provides new combinations of genetic variations. Information about coldspots and hotspots can therefore provide useful insights for in-depth study of genome evolution and of the mechanism of recombination. Currently, recombination regions are determined experimentally, which is tedious, generally requires expensive instruments, and takes a long time. This study therefore develops a computational prediction model to address these problems.

Method: In this paper, a new predictor called ‘iRSpotH-TNCPseAAC’ was developed to identify human recombination coldspots and hotspots. In this discrete predictive model, a feature vector called ‘pseudo trinucleotide composition’ (PseTNC) is proposed to represent a given DNA segment while retaining as much of its sequence-order information as possible.
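
The abstract does not give the PseTNC formula, so the following is only a minimal sketch of how a pseudo trinucleotide composition vector is commonly assembled: 64 trinucleotide frequencies followed by a few sequence-order correlation terms, jointly normalized. The function names, the parameters `lam` and `weight`, and the per-nucleotide property values are all assumptions made for illustration and are not taken from the paper.

```python
from itertools import product

# All 64 trinucleotides in a fixed lexicographic order (AAA, AAC, ..., TTT).
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]

# HYPOTHETICAL per-nucleotide property values, for illustration only.
# A real predictor would use measured physicochemical properties.
PLACEHOLDER_PROPERTY = {"A": 0.1, "C": 0.4, "G": 0.7, "T": 1.0}


def trinucleotide_property(tri):
    """Placeholder property of a trinucleotide: mean of its base values."""
    return sum(PLACEHOLDER_PROPERTY[b] for b in tri) / 3.0


def pse_tnc(seq, lam=2, weight=0.5):
    """Return a (64 + lam)-dimensional PseTNC-style feature vector.

    lam    -- number of sequence-order correlation tiers (assumed parameter)
    weight -- weight of the correlation terms (assumed parameter)
    """
    seq = seq.upper()
    if any(b not in "ACGT" for b in seq):
        raise ValueError("sequence must contain only A, C, G, T")
    n_tri = len(seq) - 2  # number of overlapping trinucleotides
    if n_tri - lam < 1:
        raise ValueError("sequence too short for the requested lam")

    # Local composition: normalized occurrence frequency of each trinucleotide.
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    for i in range(n_tri):
        counts[seq[i:i + 3]] += 1
    freqs = [counts[t] / n_tri for t in TRINUCLEOTIDES]

    # Sequence-order information: tier-j correlation between trinucleotides
    # that are j positions apart along the segment.
    props = [trinucleotide_property(seq[i:i + 3]) for i in range(n_tri)]
    thetas = []
    for j in range(1, lam + 1):
        pairs = n_tri - j
        total = sum((props[i] - props[i + j]) ** 2 for i in range(pairs))
        thetas.append(total / pairs)

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Calling `pse_tnc("ACGTACGTAC", lam=2)` yields a 66-dimensional vector: the first 64 entries describe local composition, and the last two carry the longer-range order information that a plain trinucleotide count would lose.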

Results: In the rigorous jackknife test, the overall success rate achieved by iRSpotH-TNCPseAAC in identifying human recombination spots is higher than 93%, and the mean success rate over the 18 chromosomes concerned is 76.07%. This indicates that the predictor can become a useful complementary tool in this area. In addition, the PseTNC method can be used to explore many other DNA-related problems. Finally, an easy-to-use web server, also called iRSpotH-TNCPseAAC, has been built and is freely accessible at
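
For readers unfamiliar with the jackknife test mentioned above, it is a leave-one-out procedure: each sample is held out in turn, the model is trained on the rest, and the held-out sample is predicted. The sketch below illustrates only that evaluation loop. The nearest-centroid classifier is a deliberately simple stand-in, not the SVM ensemble used by the predictor, and the function names are hypothetical.

```python
def jackknife_accuracy(samples, labels, train_and_predict):
    """Leave-one-out success rate for any train-and-predict callable."""
    correct = 0
    for i in range(len(samples)):
        # Hold out sample i and train on all remaining samples.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if train_and_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_centroid(train_x, train_y, query):
    """Stand-in classifier (NOT an SVM): pick the label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        members = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda lab: sqdist(centroids[lab], query))
```

In practice the feature vectors would be PseTNC-style encodings of hotspot and coldspot segments, and `train_and_predict` would wrap whatever classifier is being assessed.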

Conclusion: Timely acquisition of information about recombination spots in DNA sequences is important for in-depth study of epigenetic inheritance and for the analysis of human diseases. It will also facilitate drug development. The iRSpotH-TNCPseAAC predictor may become a practical online high-throughput tool for identifying recombination spots.

Keywords: Pseudo amino acid composition, support vector machine, web-server, iRSpotH-TNCPseAAC, meiosis, coldspots, hotspots.


Article Details

Year: 2017
Page: [703 - 713]
Pages: 11
DOI: 10.2174/1570178614666170608125909
