Data mining techniques are applied in bioinformatics to analyze biomedical data. When many of the features in a dataset are irrelevant, classifiers produce unsatisfactory results, so the data must be analyzed to extract the relevant features. A number of feature selection algorithms have been developed for medical data. In this paper, an intelligent hybrid optimal feature selection algorithm is proposed for Type-2 diabetes with improved classification accuracy. The algorithm, Hybrid Binary Cuckoo Search with Genetic Algorithm (HBCS-GA), combines binary Cuckoo Search (CS) with a Genetic Algorithm (GA) to select the important features of Type-2 diabetes. In HBCS-GA, the exploration and exploitation of CS are improved using genetic operators to select relevant features with better accuracy. To validate the model, a 10-fold cross-validation strategy is used. The proposed algorithm achieves 99.31% accuracy in diagnosing the disease. The performance of HBCS-GA is also compared with other approaches. In addition, model validation with the reduced feature set is performed with Decision Tree (DT), Bayesian Network (BN), Support Vector Machine (SVM) and k-Nearest Neighbor (k-NN) classifiers, which obtain accuracies of 94.46%, 96.07%, 98.84% and 96.79%, respectively. The results show that HBCS-GA achieves higher classification accuracy than the other approaches.
Keywords: Diabetes, normalization, classification, instance selection, wrapper feature selection, cuckoo search, genetic algorithm.
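To make the idea of a wrapper search that mixes cuckoo moves with genetic operators concrete, the following is a minimal illustrative sketch and not the HBCS-GA implementation evaluated in the paper. Every name in it (`kfold_accuracy`, `levy_step`, `cuckoo_move`, `crossover`, `mutate`, `hybrid_search`) is hypothetical, and the specific choices are assumptions: a 1-NN classifier as the wrapper fitness, interleaved folds, Mantegna's algorithm for the Lévy step, a sigmoid transfer function, and the parameter defaults.

```python
import math
import random


def kfold_accuracy(X, y, mask, k=10):
    """Pooled k-fold accuracy of a 1-NN classifier using only the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    n, correct = len(X), 0
    for i in range(n):
        fold = i % k  # assumption: interleaved fold assignment
        best_d, best_label = float("inf"), None
        for t in range(n):
            if t % k == fold:
                continue  # sample t is held out together with sample i
            d = sum((X[i][j] - X[t][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[t]
        correct += best_label == y[i]
    return correct / n


def levy_step(rng, beta=1.5):
    """Heavy-tailed step length drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u, v = rng.gauss(0, sigma), rng.gauss(0, 1)
    return u / max(abs(v), 1e-12) ** (1 / beta)


def cuckoo_move(mask, rng, alpha=1.0):
    """Binary cuckoo update: perturb each bit by a Levy step, then apply a sigmoid."""
    new = []
    for bit in mask:
        x = max(-30.0, min(30.0, bit + alpha * levy_step(rng)))  # clamp to avoid overflow
        new.append(1 if rng.random() < 1 / (1 + math.exp(-x)) else 0)
    return new


def crossover(a, b, rng):
    """GA uniform crossover: each bit comes from either parent with equal chance."""
    return [p if rng.random() < 0.5 else q for p, q in zip(a, b)]


def mutate(mask, rng, rate=0.05):
    """GA bit-flip mutation."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def hybrid_search(X, y, n_nests=8, n_iter=15, pa=0.25, seed=0):
    """Search for a feature mask; returns (mask, cross-validated accuracy)."""
    rng = random.Random(seed)
    d = len(X[0])
    nests = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_nests)]
    fit = [kfold_accuracy(X, y, m) for m in nests]
    for _ in range(n_iter):
        best = nests[max(range(n_nests), key=fit.__getitem__)]
        for i in range(n_nests):
            # Cuckoo move first, then genetic operators against the current best nest.
            cand = mutate(crossover(cuckoo_move(nests[i], rng), best, rng), rng)
            f = kfold_accuracy(X, y, cand)
            if f >= fit[i]:  # greedy replacement
                nests[i], fit[i] = cand, f
        # Abandon the worst fraction pa of nests and re-seed them at random.
        worst = sorted(range(n_nests), key=fit.__getitem__)[:int(pa * n_nests)]
        for i in worst:
            nests[i] = [rng.randint(0, 1) for _ in range(d)]
            fit[i] = kfold_accuracy(X, y, nests[i])
    b = max(range(n_nests), key=fit.__getitem__)
    return nests[b], fit[b]
```

Calling `hybrid_search(X, y)` on a list of numeric feature rows `X` and labels `y` returns a 0/1 mask over the columns together with its cross-validated accuracy. The fitness here rewards accuracy only; a penalty on the number of selected features could be added to the fitness if smaller subsets are preferred.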