Background: Cytokines, small signaling proteins, play critical roles in biological functions
and are closely associated with human diseases. Accurate identification of cytokines is the first step
toward understanding the relationship between cytokines and human diseases. In recent years, considerable
effort has gone into developing computational methods, especially machine-learning-based methods,
to identify cytokines quickly and accurately. A major remaining challenge for existing
machine-learning-based methods is improving the performance of cytokine identification.
Method: In this study, we attempt to improve the performance of cytokine identification from
two angles: (1) feature representation and (2) classifier selection. For feature representation,
we fuse multiple feature types that perform well at distinguishing cytokines from non-cytokines,
and apply two feature selection techniques, Max-Relevance-Max-Distance (MRMD) and
Principal Component Analysis (PCA), to obtain the optimal feature representations. For classifier selection,
we evaluate several powerful classifiers and use the one with the highest performance
to build the classification model for our method.
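The two steps described in the Method can be illustrated with a minimal, standard-library-only sketch. It is not the implementation used in this study. The feature score below is only loosely modeled on the relevance-plus-distance idea behind MRMD (absolute Pearson correlation with the label plus a normalized mean Euclidean distance to the other features), and the actual MRMD scoring may differ. The two toy classifiers are stand-ins for LIBSVM and LIBD3C, which are not reproduced here. All function names (rank_features, cv_accuracy, select_classifier) are hypothetical.

```python
import math
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def zscore(col):
    """Standardize one feature column so distances between features are comparable."""
    m = mean(col)
    sd = math.sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / sd if sd else 0.0 for v in col]


def rank_features(X, y):
    """Rank feature indices, best first, by relevance + distance (assumed scoring)."""
    cols = [zscore(list(c)) for c in zip(*X)]
    # Relevance: how strongly each feature correlates with the class label.
    relevance = [abs(pearson(c, y)) for c in cols]
    # Distance: how far each feature is, on average, from every other feature.
    dist = []
    for i, ci in enumerate(cols):
        others = [math.dist(ci, cj) for j, cj in enumerate(cols) if j != i]
        dist.append(mean(others) if others else 0.0)
    top = max(dist) or 1.0
    scores = [r + d / top for r, d in zip(relevance, dist)]
    return sorted(range(len(cols)), key=lambda i: scores[i], reverse=True)


def nearest_centroid(train_X, train_y, x):
    """Toy classifier: predict the label whose class centroid is closest to x."""
    best_label, best_d = None, float("inf")
    for label in sorted(set(train_y)):
        pts = [p for p, l in zip(train_X, train_y) if l == label]
        centroid = [mean(c) for c in zip(*pts)]
        d = math.dist(centroid, x)
        if d < best_d:
            best_label, best_d = label, d
    return best_label


def one_nn(train_X, train_y, x):
    """Toy classifier: predict the label of the single nearest training sample."""
    i = min(range(len(train_X)), key=lambda j: math.dist(train_X[j], x))
    return train_y[i]


def cv_accuracy(predict, X, y, k=3):
    """k-fold cross-validated accuracy; sample i is assigned to fold i % k."""
    hits = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        hits += sum(predict(tX, ty, X[i]) == y[i] for i in test)
    return hits / len(X)


def select_classifier(candidates, X, y, k=3):
    """Return the name of the candidate with the highest CV accuracy, plus all scores."""
    scores = {name: cv_accuracy(fn, X, y, k) for name, fn in candidates.items()}
    return max(scores, key=scores.get), scores
```

In use, one would call rank_features on the fused feature matrix, keep the top-ranked columns, and pass the reduced matrix to select_classifier with a dictionary of candidate classifiers. The fold assignment by index is a simplification; a real evaluation would normally shuffle or stratify the samples first.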
Results: Our analysis shows that the feature sets maintain stably high performance with
every classifier we tested. The overall performance of the combinations, ranked
from best to worst, was: 473D+LIBSVM, MRMD+LIBD3C, and PCA+LIBSVM.
Conclusion: Comparative studies demonstrate that the proposed strategy effectively improves
the performance of cytokine identification.