Codon usage, codon context, rare codons, nucleotide repetition and mRNA-destabilizing sequences are only a few of the many factors that influence the efficiency of protein synthesis. Gene redesign for heterologous expression is therefore a multi-objective optimization problem in which the factors to be considered often conflict. Evolutionary approaches have already been shown to be able to evolve a sequence under specific constraints. However, it is unclear what advantages a slower algorithm such as a genetic algorithm (GA) offers over faster algorithms in the gene redesign context.
Here, genetic algorithms combined with a Pareto archive are applied to the synthetic gene redesign problem. The different redesign parameters are merged using an adapted genetic algorithm strategy. From the resulting model, the best possible synonymous gene sequence is generated. This makes it possible to tackle the gene redesign problem by exploring the large search space of possible synonymous sequences. It is then shown that genetic algorithms have several advantages over other heuristics in the gene redesign problem, for instance the ability to return the best solutions constituting the main part of the Pareto front, even in non-convex or non-continuous spaces. This allows researchers to select, among the optimal solutions, the synonymous genes that best suit their purpose, instead of accepting a single solution that might represent an unwanted trade-off between the objectives.
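The role of the Pareto archive can be illustrated with a minimal sketch. It is not the implementation described here: the function names (`dominates`, `update_archive`), the two objectives and all score values are hypothetical, and both objectives are assumed to be maximized. The four candidates are synonymous sequences encoding Met-Leu-Lys. The sketch contrasts the archive with a weighted-sum scalarization, a common single-solution alternative that cannot reach points in a non-convex region of the front.

```python
from typing import List, Tuple

Scores = Tuple[float, ...]   # objective values, all maximized (assumption)
Entry = Tuple[str, Scores]   # (synonymous sequence, its scores)


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def update_archive(archive: List[Entry], seq: str, scores: Scores) -> List[Entry]:
    """Offer one candidate to the archive and keep only non-dominated entries."""
    if any(dominates(s, scores) or s == scores for _, s in archive):
        return archive  # candidate is dominated or duplicates an archived score
    kept = [(q, s) for q, s in archive if not dominates(scores, s)]
    kept.append((seq, scores))
    return kept


# Hypothetical scores: (codon usage score, closeness to a target GC content).
candidates: List[Entry] = [
    ("ATGCTGAAA", (1.0, 0.0)),  # extreme on objective 1
    ("ATGTTAAAG", (0.0, 1.0)),  # extreme on objective 2
    ("ATGCTCAAA", (0.4, 0.4)),  # balanced trade-off in a non-convex region
    ("ATGCTTAAG", (0.3, 0.3)),  # dominated by the previous candidate
]

archive: List[Entry] = []
for seq, scores in candidates:
    archive = update_archive(archive, seq, scores)

# Weighted-sum baseline: for each weight, keep the single best-scoring candidate.
weighted_winners = set()
for step in range(11):
    w = step / 10
    best = max(candidates, key=lambda c: w * c[1][0] + (1 - w) * c[1][1])
    weighted_winners.add(best[0])

print("Pareto archive:      ", sorted(s for s, _ in archive))
print("Weighted-sum winners:", sorted(weighted_winners))
```

With these toy values the archive retains the balanced candidate `ATGCTCAAA` alongside the two extremes, whereas the weighted sum only ever selects one of the extremes, whatever the weight. In a GA, `update_archive` would typically be called on every evaluated individual in each generation.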
Keywords: genetic algorithms, multi-objective optimization, Pareto front, simulated annealing, synthetic gene design, codon, optimal solutions, heterologous expression, GC content.