Introduction: Hydroxylation is one of the most important post-translational modifications
(PTMs) in cellular functions and is linked to various diseases. The addition of a hydroxyl
group (OH) to a lysine residue is the chemical modification that produces hydroxylysine.
Methods: The method used in this study identifies hydroxylysine sites with a mathematical and
statistical methodology that incorporates the sequence-order effect and the composition of each
residue within protein sequences. The predictor is called "iHyd-LysSite (EPSV)"
(identifying hydroxylysine sites by extracting enhanced position and sequence variant technique).
Identifying hydroxylysine sites by experimental methods is difficult, laborious and highly expensive.
In silico techniques offer an alternative approach to identifying hydroxylysine sites in proteins.
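As a rough illustration of what "composition" and "sequence-order" information can mean for a
peptide window centred on a candidate lysine, the sketch below computes per-residue counts and a
simple position-weighted sum. It is not the EPSV feature set, which is not specified here; the
function name `window_features`, the window half-width and the padding character `X` are all
assumptions made for this example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def window_features(sequence, site, half_width=3, pad="X"):
    """Return (window, composition, position_sums) for the lysine at 0-based `site`.

    Hypothetical illustration only; this is not the EPSV technique.
    """
    if sequence[site] != "K":
        raise ValueError("site must point at a lysine (K)")

    # Pad both ends so that sites near a terminus still give a full-length window.
    padded = pad * half_width + sequence + pad * half_width
    centre = site + half_width
    window = padded[centre - half_width: centre + half_width + 1]

    # Composition: how often each residue occurs in the window (order is ignored).
    composition = [window.count(aa) for aa in AMINO_ACIDS]

    # Sequence order: sum of the 1-based window positions at which each residue occurs.
    position_sums = [
        sum(i + 1 for i, residue in enumerate(window) if residue == aa)
        for aa in AMINO_ACIDS
    ]
    return window, composition, position_sums


if __name__ == "__main__":
    window, composition, position_sums = window_features("GPKGDA", site=2, half_width=2)
    print(window, composition, position_sums)
```

Two windows with the same residues in a different order share the same composition vector but
differ in their position sums, which is the kind of distinction a sequence-order term is meant to capture.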
Results: A useful predictive model should have high sensitivity and specificity values and high
overall accuracy. The self-consistency, independent, 10-fold cross-validation
and jackknife tests were performed for validation purposes. These tests were carried out using
three well-known classifiers, Neural Networks (NN), Random Forest (RF) and Support Vector Machine
(SVM), and reached the required prediction rate. The overall predictive outcomes are substantially
superior to the results obtained by previous predictors. The proposed model achieved an excellent
prediction rate with the NN, RF, and SVM classifiers. The sensitivity and specificity results
using all these classifiers for the jackknife test are 96.08%, 94.99%, 98.16% and 97.52%, 98.52%,
Conclusion: The results obtained by the proposed tool show that this method may meet the future
demand for hydroxylysine site prediction with a better prediction rate than the existing methods.
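For readers unfamiliar with the evaluation terms, the following minimal sketch shows how sensitivity
and specificity can be computed under a jackknife (leave-one-out) test: each sample is held out in
turn, the classifier is fitted on the rest, and the held-out prediction is tallied. The
1-nearest-neighbour rule is only a stand-in for the NN, RF and SVM classifiers named above, and the
function names are hypothetical.

```python
def nearest_neighbour_predict(train_X, train_y, x):
    """Stand-in classifier: return the label of the closest training sample."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def jackknife_sn_sp(X, y, predict=nearest_neighbour_predict):
    """Leave-one-out sensitivity and specificity for binary labels (1 = positive site).

    Assumes both classes are present in `y`; otherwise a denominator is zero.
    """
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        # Hold out sample i and predict it from all remaining samples.
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predicted = predict(train_X, train_y, X[i])
        if y[i] == 1:
            if predicted == 1:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == 0:
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn)  # fraction of true sites that were recovered
    specificity = tn / (tn + fp)  # fraction of non-sites that were correctly rejected
    return sensitivity, specificity


if __name__ == "__main__":
    # Toy one-dimensional feature vectors, invented for illustration.
    X = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]]
    y = [0, 0, 0, 1, 1, 1]
    print(jackknife_sn_sp(X, y))
```

Any classifier exposing the same `predict(train_X, train_y, x)` shape could be passed in place of the
nearest-neighbour rule.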