Background: Cancer poses a serious threat to human health, and diagnosing cancer through gene expression analysis is an active topic in cancer research.
Objective: To accurately diagnose the type of lung cancer and to identify pathogenic genes.
Method: In this study, affinity propagation (AP) clustering with a similarity score is applied to each type of lung cancer and to normal lung tissue. After the genes are grouped, sparse group lasso is used to construct four binary classifiers, which are integrated by a voting strategy.
Results: This study screens out six gene groups, among 73 gene groups, that may be associated with different lung cancer subtypes, and identifies three possible key pathogenic genes: KRAS, BRAF and VDR. Furthermore, it achieves improved classification accuracy on the minority classes SQ and COID compared with four other methods.
Conclusion: We propose AP clustering based sparse group lasso (AP-SGL), which provides an alternative approach for simultaneous diagnosis and gene selection in lung cancer.
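
As an illustration of the two ingredients named in the Method, the sketch below computes a sparse group lasso penalty over gene groups and integrates binary classifier outputs by majority vote. It is a minimal sketch under stated assumptions, not the study's implementation: the penalty uses the common form lam * (alpha * ||beta||_1 + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2), and the function names, the parameters `lam` and `alpha`, the tie-breaking rule, and the example votes are all hypothetical. The abstract does not specify how the four classifiers are defined or how ties are resolved.

```python
import math
from collections import Counter


def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5):
    """Penalty for coefficients `beta` given gene groups (lists of indices).

    Assumed form: lam * (alpha * L1 + (1 - alpha) * weighted group L2),
    where each group's L2 norm is weighted by sqrt(group size).
    """
    # Within-group sparsity: plain L1 norm over all coefficients.
    l1_term = sum(abs(b) for b in beta)
    # Group-level sparsity: size-weighted L2 norm of each group.
    group_term = 0.0
    for idx in groups:
        l2_norm = math.sqrt(sum(beta[i] ** 2 for i in idx))
        group_term += math.sqrt(len(idx)) * l2_norm
    return lam * (alpha * l1_term + (1 - alpha) * group_term)


def majority_vote(votes):
    """Return the most frequent label; ties go to the earliest such vote.

    The tie-breaking rule is an assumption made for this sketch.
    """
    counts = Counter(votes)
    top = max(counts.values())
    for label in votes:
        if counts[label] == top:
            return label


# Hypothetical example: two gene groups, one of them entirely zeroed out.
beta = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]
penalty = sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5)

# Hypothetical outputs of four binary classifiers for one sample.
predicted = majority_vote(["AD", "SQ", "AD", "NL"])
```

In this form, a group whose coefficients are all zero contributes nothing to the penalty, which is what lets the group term discard whole gene groups while the L1 term selects individual genes within the retained groups.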