The catalytic activity of an enzyme differs from that of an inorganic catalyst. In a
high-temperature, strongly acidic, or strongly alkaline environment, the structure of the enzyme is destroyed and
the enzyme loses its activity. Although biochemical experiments can measure the optimal pH environment
of an enzyme, these methods are inefficient and costly. To solve these problems,
a computational model can be established to determine the optimal acidic or alkaline environment of
the enzyme. In this paper, we first introduced a new feature, called dual g-gap dipeptide composition,
to represent enzyme samples. Subsequently, the best feature was selected using the F value calculated
from analysis of variance. Finally, a support vector machine was used to build a prediction model
for distinguishing acidic from alkaline enzymes. An overall accuracy of 95.9% was achieved with
jackknife cross-validation, which indicates that our method is reliable and efficient for
acidic and alkaline enzyme prediction. The feature proposed in this paper could also be applied in other
fields of bioinformatics.
Keywords: Acidic enzyme, alkaline enzyme, support vector machine, dipeptide composition, feature selection, cross-validation.
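
As an illustration of the kind of feature the abstract refers to, the following is a minimal sketch of the standard g-gap dipeptide composition, in which each of the 400 possible residue pairs is counted when the two residues are separated by g intervening positions. The abstract does not define the "dual" variant, so it is not reproduced here. The function name, the 20-letter alphabet ordering, and the choice to normalize by the number of valid pairs are assumptions made for this sketch and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids; the ordering is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g=0):
    """Return a 400-dimensional frequency vector of residue pairs
    separated by g intervening residues (g=0 gives adjacent dipeptides)."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(pairs)
    # Normalize by the number of valid pairs so the vector sums to 1.
    return [counts[p] / total for p in pairs]
```

For example, calling `g_gap_dipeptide_composition("ACAC", g=1)` counts the pairs AA and CC once each. Vectors of this kind could then be ranked by an analysis-of-variance F value and passed to a support vector machine, as the abstract describes.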