Background: Neuropeptides are a class of bioactive peptides produced from
neuropeptide precursors through a series of highly complex processes, and they mediate many
aspects of neuronal regulation. Accurate identification of the cleavage sites of neuropeptide
precursors is of great significance for the development of neuroscience and brain science.
Objective: With the explosive growth of neuropeptide precursor data, there is a pressing need to
develop bioinformatics methods for predicting neuropeptide precursors’ cleavage sites quickly and
Methods: We first processed the neuropeptide precursor data from SwissProt and
NeuroPedia into two datasets, a training dataset and a testing dataset. Subsequently, six feature
extraction schemes were applied to generate different feature sets, and feature selection
methods were used to find the optimal feature subset of each. Thereafter, the support vector machine
was used to build a model for each feature type. Finally, the performance of the models was
evaluated with the independent testing dataset.
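
As an illustration of what a feature extraction scheme does, the sketch below computes an enhanced amino acid composition (EAAC) style encoding for a sequence fragment around a candidate cleavage site: the amino acid frequencies are computed in every sliding sub-window and concatenated. It is a minimal sketch, not the implementation used in this work. The function name `eaac_features`, the sub-window size of 5, and the normalization by sub-window length are assumptions made for illustration.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def eaac_features(fragment, window=5):
    """Concatenate amino acid frequencies of every sliding sub-window.

    `fragment` is a peptide fragment around a candidate cleavage site.
    `window` is a hypothetical sub-window size chosen for illustration.
    """
    if window < 1 or window > len(fragment):
        raise ValueError("window must be between 1 and len(fragment)")
    features = []
    for start in range(len(fragment) - window + 1):
        counts = Counter(fragment[start:start + window])
        # Assumption: frequencies are normalized by the sub-window length.
        features.extend(counts[aa] / window for aa in AMINO_ACIDS)
    return features


if __name__ == "__main__":
    vector = eaac_features("AAAAAK", window=5)
    print(len(vector))  # 2 sub-windows x 20 amino acids
```

A fragment of length L with sub-window size w yields (L - w + 1) x 20 features, which is why a feature selection step is useful before training a classifier.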
Results: Six models were built with the support vector machine. Among them, the enhanced amino
acid composition-based model reached the highest accuracy, 91.60%, in 5-fold cross
validation. When evaluated with the independent testing dataset, it also performed well,
with an accuracy of 90.37% and an area under the receiver operating characteristic
curve (AUC) of 0.9576.
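
For readers unfamiliar with the two reported metrics, the following sketch shows how accuracy and AUC can be computed from true labels and classifier scores. It is a generic illustration under stated assumptions, not the evaluation code of this work: the function names, the toy labels and scores, and the 0.5 decision threshold are hypothetical. AUC is computed here as the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data for illustration only; 1 = cleavage site, 0 = non-cleavage site.
    labels = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    predictions = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold
    print(accuracy(labels, predictions), roc_auc(labels, scores))
```

The pairwise formulation is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.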
Conclusion: The developed model performed well in both cross validation and independent
testing. Moreover, for users’ convenience, an online web server called NeuroCS was built, which
is freely available at http://i.uestc.edu.cn/NeuroCS/dist/index.html#/. NeuroCS can be used to
predict neuropeptide precursors’ cleavage sites effectively.