Comparative Analysis of DNA Motif Discovery Algorithms: A Systemic Review

Author(s): Fatma A. Hashim, Mai S. Mabrouk*, Walid A.L. Atabany

Journal Name: Current Cancer Therapy Reviews

Volume 15, Issue 1, 2019

Background: Bioinformatics is an interdisciplinary field that combines biology and information technology to manage and analyze biological data. DNA motif discovery is a central challenge in genome biology, and its importance grows as sequencing technologies produce ever larger amounts of data. A DNA motif is a repeated segment of DNA sequence of major biological interest, with important structural and functional features. Motif discovery plays a vital role in antibody-biomarker identification, which is useful for disease diagnosis, and in identifying Transcription Factor Binding Sites (TFBSs), which helps in understanding the mechanisms that regulate gene expression. Recently, scientists discovered that the TFs have a mutation rate five times higher than that of the flanking sequences, so motif discovery also has a crucial role in cancer discovery.

Methods: Over the past decades, many attempts have been made to design fast and accurate motif discovery tools using different algorithms. These algorithms are generally classified into consensus and probabilistic approaches.
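The two families named above can be contrasted with a minimal sketch. It is an illustration under stated assumptions, not a method taken from the review: the function names (`consensus_search`, `build_pwm`, `best_site`), the uniform background frequency of 0.25, and the pseudocount of 1 are all hypothetical choices. The consensus side enumerates every candidate string of length l and keeps those that occur in each sequence with at most d mismatches; the probabilistic side summarizes aligned sites as a log-odds position weight matrix and scores windows against it.

```python
from itertools import product
from math import log2


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def consensus_search(seqs, l, d):
    """Consensus (enumerative) sketch: try all 4**l candidates and keep
    those found in every sequence with at most d mismatches."""
    hits = []
    for cand in map("".join, product("ACGT", repeat=l)):
        if all(
            any(hamming(cand, s[i:i + l]) <= d for i in range(len(s) - l + 1))
            for s in seqs
        ):
            hits.append(cand)
    return hits


def build_pwm(sites, pseudocount=1.0):
    """Probabilistic sketch: log-odds position weight matrix built from
    aligned sites, assuming a uniform background of 0.25 per base."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for j in range(len(sites[0])):
        column = [s[j] for s in sites]
        pwm.append(
            {b: log2(((column.count(b) + pseudocount) / total) / 0.25)
             for b in "ACGT"}
        )
    return pwm


def best_site(seq, pwm):
    """Return (score, start) of the highest-scoring window in seq."""
    l = len(pwm)
    return max(
        (sum(pwm[j][seq[i + j]] for j in range(l)), i)
        for i in range(len(seq) - l + 1)
    )


if __name__ == "__main__":
    seqs = ["TTACGTAA", "GGACGTCC", "CACGTTTT"]
    print(consensus_search(seqs, l=4, d=0))
    pwm = build_pwm(["ACGT", "ACGT", "ACGA"])
    print(best_site("TTACGTAA", pwm))
```

The exhaustive loop visits 4^l candidates, so its cost grows quickly with motif length. The matrix side only scores windows here; a full probabilistic tool would also have to learn the matrix from unaligned sequences.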

Results: Many DNA motif discovery algorithms are time-consuming and easily trapped in local optima.

Conclusion: Nature-inspired algorithms and many combinatorial algorithms have recently been proposed to overcome the problems of the consensus and probabilistic approaches. This paper presents a general classification of motif discovery algorithms with new sub-categories, together with a summary comparison of them.

Keywords: Bioinformatics, motif, enumerative approach, probabilistic approach, nature-inspired, metaheuristic.

Article Details

Year: 2019
Page: [4 - 26]
Pages: 23
DOI: 10.2174/1573394714666180417161728