Of the 20 common amino acids, 18 are encoded by multiple synonymous codons. These synonymous codons are not redundant: all of them contribute to protein expression, structure, and function. It is therefore useful to know the rules governing synonymous codon selection in a species and to design heterologous genes that are expressed efficiently in the host.
In this study, a machine-learning method named Presyncodon was proposed to predict synonymous codon selection in E. coli and to design two reporter genes (egfp and mApple) with appropriate synonymous codons, based on the codon usage pattern of a residue within a specific fragment. The results indicate that the method can predict synonymous codon selection in a gene, and that the two designed reporter genes, which contain both low- and high-frequency-usage codons, were expressed more efficiently in E. coli than genes containing only high-frequency-usage codons. Therefore, both low- and high-frequency-usage codons contribute positively to the functional expression of heterologous proteins. This method could be used to design synthetic genes for heterologous expression in biotechnology.
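As a minimal illustration of the terms used above, the following Python sketch builds the standard genetic code, groups codons by amino acid, and computes how often each codon is used relative to its synonymous alternatives in a coding sequence. It is not the Presyncodon method, which the abstract describes only as a machine-learning predictor working on fragment-level codon usage patterns. The function names `synonymous_codons` and `relative_codon_usage` and the short example sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1); '*' marks a stop codon.
# Codons are ordered TTT, TTC, TTA, TTG, TCT, ... to match AMINO_ACIDS.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons():
    """Map each amino acid to the list of codons that encode it."""
    groups = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            groups[aa].append(codon)
    return dict(groups)


def relative_codon_usage(cds):
    """Fraction of each codon among the synonymous codons observed for its amino acid.

    Illustrative helper (an assumption, not from the paper): a high fraction
    corresponds to a high-frequency-usage codon in the given sequence.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Count sense codons only; stops and unrecognized triplets are skipped.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n
    return {codon: n / totals[CODON_TABLE[codon]] for codon, n in counts.items()}


if __name__ == "__main__":
    groups = synonymous_codons()
    multi = sorted(aa for aa, codons in groups.items() if len(codons) > 1)
    print(len(groups), "amino acids,", len(multi), "with synonymous codons")
    # Toy sequence: Met, Leu (CTG), Leu (CTG), Leu (CTT), Lys
    print(relative_codon_usage("ATGCTGCTGCTTAAA"))
```

In practice, usage fractions would be computed over a large set of host genes rather than a single short sequence. A design that always picks the codon with the highest fraction corresponds to the "only high-frequency-usage codons" baseline that the abstract compares against.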
Jian Tian, Yaru Yan, Qingxia Yue, Xiaoqing Liu, Xiaoyu Chu, Ningfeng Wu, Yunliu Fan. Predicting the synonymous codon selection and optimizing the heterologous gene for expression in E. coli. Scientific Reports. 2017.