Objective: Cancer is one of the most serious diseases affecting human health. Across current
cancer treatments, early diagnosis and control significantly increase the chances of a cure. Detecting
cancer biomarkers in body fluids is therefore attracting growing attention among oncologists. In-silico
prediction of body fluid-related proteins, which can serve as cancer biomarkers, offers a way to guide
labor-intensive and time-consuming biochemical experiments.
Methods: In this work, we propose a novel method for high-throughput identification of cancer biomarkers
in human body fluids. We incorporate physicochemical properties into the weighted observed
percentage (WOP) and position-specific scoring matrix (PSSM) profiles to enhance the attributes
that reflect the evolutionary conservation of body fluid-related proteins. The least absolute shrinkage
and selection operator (LASSO) feature selection strategy is then used to generate the optimal feature
subset.
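As an illustration of how LASSO-based feature selection works in general, the following is a minimal sketch and not the implementation used in this work. It assumes centered features, a real-valued target, and a hand-picked penalty `alpha`; the synthetic data and the helper names (`soft_threshold`, `lasso_coordinate_descent`, `select_features`) are hypothetical. Features whose coefficients are shrunk exactly to zero are discarded, and the remaining ones form the selected subset.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    Assumption: no intercept term, so features and target should be centered.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero column carries no information
            # Correlation of feature j with the residual that excludes its own term.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def select_features(X, y, alpha):
    """Return indices of features with a non-zero LASSO coefficient."""
    weights = lasso_coordinate_descent(X, y, alpha)
    return [j for j, coef in enumerate(weights) if coef != 0.0]


# Hypothetical synthetic data: only features 0 and 3 influence the target.
rng = random.Random(0)
n_samples, n_features = 200, 8
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [2.0 * row[0] - 3.0 * row[3] + rng.gauss(0, 0.1) for row in X]

selected = select_features(X, y, alpha=0.3)
print("selected feature indices:", selected)
```

In practice the penalty strength would normally be tuned, for example by cross-validation, and a library implementation would be preferred over this hand-written loop.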
Results: The ten-fold cross-validation results on training datasets demonstrate the accuracy of the proposed
model. We also test our proposed method on independent testing datasets and apply it to the identification
of potential cancer biomarkers in human body fluids.
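The ten-fold cross-validation protocol mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the evaluation code of this work: the nearest-centroid classifier is only a stand-in for whatever model is actually trained, the two-class data are synthetic, and all function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_centroids(X, y):
    """Stand-in classifier: the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the label of the closest class centroid (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_validate(X, y, k=10):
    """Train on k-1 folds, test on the held-out fold, and return per-fold accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        model = fit_centroids(train_X, train_y)
        correct = sum(predict(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return accuracies


# Hypothetical two-class data: 50 samples around (-2, -2) and 50 around (2, 2).
rng = random.Random(1)
X, y = [], []
for label, center in ((0, -2.0), (1, 2.0)):
    for _ in range(50):
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)

fold_accuracies = cross_validate(X, y, k=10)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
print("mean ten-fold accuracy:", round(mean_accuracy, 3))
```

Every sample is used for testing exactly once, so the mean of the fold accuracies estimates performance on unseen data more reliably than a single split. An independent testing dataset, as used in this work, remains a separate check.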
Conclusion: The testing results suggest that our approach generalizes well.