An Overview of Computational Tools of Nucleic Acid Binding Site Prediction for Site-specific Proteins and Nucleases

Author(s): Hua Wan, Jian-ming Li, Huang Ding, Shuo-xin Lin, Shu-qin Tu, Xu-hong Tian, Jian-ping Hu*, Shan Chang*.

Journal Name: Protein & Peptide Letters

Volume 27, Issue 5, 2020

Abstract:

Understanding how proteins interact with nucleic acids is one of the most fundamental problems in genome editing with engineered nucleases. Because experimental investigation has practical limitations, computational methods play an important role in characterizing protein-nucleic acid interactions. Over the past few years, dozens of computational tools have been used to identify nucleic acid binding sites of site-specific proteins and to design site-specific nucleases, owing to the significant advantages of such nucleases in genome editing. Here, we review widely used computational tools for target prediction of site-specific proteins and for off-target prediction of site-specific nucleases. This article lists online prediction tools according to their features and then describes the computational methods they use, which range from sequence mapping algorithms (such as Bowtie, FetchGWI and BLAST) to machine learning methods (such as support vector machines, hidden Markov models, random forests, elastic net and deep neural networks). We also suggest directions for further development to improve the accuracy of prediction methods. This survey is intended as a reference guide for computational biologists working in the field of genome editing.

Keywords: Site-specific protein, engineered nuclease, target prediction, machine learning, genome editing, protein-nucleic acid interaction.
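
To make the idea of sequence-mapping-based off-target prediction concrete, the following is a minimal sketch that scans a sequence for windows differing from an intended target by at most a given number of mismatches. It is an illustration only: the function names (`hamming`, `find_candidate_sites`) and the toy sequences are assumptions introduced here, and the naive sliding-window scan stands in for the indexed search that tools such as Bowtie, FetchGWI and BLAST perform. It does not reproduce the algorithm of any tool covered in the review.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_candidate_sites(
    genome: str, target: str, max_mismatches: int = 2
) -> List[Tuple[int, str, int]]:
    """Return (position, site, mismatches) for each window within the budget.

    Hypothetical helper: a brute-force scan of the forward strand only,
    with no PAM check, no gaps, and no scoring of mismatch positions.
    """
    genome = genome.upper()
    target = target.upper()
    k = len(target)
    hits = []
    for i in range(len(genome) - k + 1):
        site = genome[i:i + k]
        d = hamming(site, target)
        if d <= max_mismatches:
            hits.append((i, site, d))
    return hits


if __name__ == "__main__":
    toy_genome = "TTGACGTACGTTTGACGAACGTT"  # made-up sequence
    toy_target = "GACGTACG"                 # made-up target site
    for pos, site, d in find_candidate_sites(toy_genome, toy_target, 1):
        print(pos, site, d)
```

A window with zero mismatches corresponds to the intended target, while windows with a small nonzero count are candidate off-target sites. A practical tool would also search the reverse strand, apply nuclease-specific constraints, and rank candidates with a scoring model.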

Article Details

VOLUME: 27
ISSUE: 5
Year: 2020
Page: [370 - 384]
Pages: 15
DOI: 10.2174/0929866526666191028162302
