Background: Modern society is prone to many life-threatening diseases that can be
controlled, and often cured, if diagnosed at an early stage. The development and implementation
of disease diagnostic systems have gained considerable popularity over the years. Factors such as
environment, sedentary lifestyle, and genetics (heredity) are major contributors
to life-threatening diseases such as diabetes, which has become
the leading chronic disease of modern life. A prime need of this generation is therefore a
state-of-the-art expert system that can predict diabetes at a very early stage, quickly and with
minimal complexity. The primary objective of this work is to develop an indigenous and
efficient diagnostic technique for the detection of diabetes.
Method & Discussion: The proposed methodology comprises two phases. In the first phase, the Pima
Indian Diabetes Dataset (PIDD) was collected from the UCI Machine Learning Repository,
and a Localized Diabetes Dataset (LDD) was gathered from Bombay Medical Hall, Upper Bazar,
Ranchi, Jharkhand, India. In the second phase, the datasets were processed through two different approaches.
The first approach applies AdaBoost, Classification via Regression
(CVR), Radial Basis Function Network (RBFN), and K-Nearest Neighbor (KNN) classifiers directly to
PIDD and LDD. In the second approach, Principal Component Analysis (PCA)
and Linear Discriminant Analysis (LDA) are applied as feature reduction methods, followed by
the same set of classification methods used in the first approach. Among all of the implemented
classification methods, PCA_CVR achieves the best performance on both of the above-mentioned datasets.
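The two approaches described above (classification on the full feature set versus classification after feature reduction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses synthetic two-class data rather than PIDD or LDD, only KNN out of the four classifiers, and a plain power-iteration PCA. All function names (fit_scaler, top_components, knn_accuracy, and so on) and the choice of three components are hypothetical.

```python
import math
import random

def fit_scaler(X):
    """Per-column mean and standard deviation, fitted on training rows only."""
    n, d = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X) / n) or 1.0
            for j in range(d)]
    return means, stds

def scale(X, means, stds):
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in X]

def covariance(X):
    """Covariance matrix of rows that are already centred."""
    n, d = len(X), len(X[0])
    return [[sum(r[i] * r[j] for r in X) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_components(C, k, iters=500, seed=0):
    """First k principal directions of a symmetric matrix C, found by
    power iteration with deflation (assumes k does not exceed the rank of C)."""
    d = len(C)
    C = [row[:] for row in C]
    rng = random.Random(seed)
    comps = []
    for _comp in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        for _step in range(iters):
            w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        Cv = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        lam = sum(a * b for a, b in zip(v, Cv))
        comps.append(v)
        # Deflate: remove the component just found before searching for the next.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return comps

def project(X, comps):
    return [[sum(a * b for a, b in zip(r, c)) for c in comps] for r in X]

def knn_accuracy(train_X, train_y, test_X, test_y, k=5):
    """Hold-out accuracy of a majority-vote k-nearest-neighbour classifier."""
    correct = 0
    for x, y in zip(test_X, test_y):
        order = sorted(range(len(train_X)),
                       key=lambda i: sum((a - b) ** 2
                                         for a, b in zip(train_X[i], x)))
        votes = sum(train_y[i] for i in order[:k])
        correct += int((2 * votes > k) == bool(y))
    return correct / len(test_X)

# Synthetic stand-in data (assumption): 2 informative features, 6 noise features.
rng = random.Random(42)
X, y = [], []
for i in range(200):
    label = i % 2
    X.append([rng.gauss(3.0 * label, 1.0) for _ in range(2)]
             + [rng.gauss(0.0, 1.0) for _ in range(6)])
    y.append(label)
train_X, test_X, train_y, test_y = X[:150], X[150:], y[:150], y[150:]

means, stds = fit_scaler(train_X)
train_s, test_s = scale(train_X, means, stds), scale(test_X, means, stds)

# Approach 1: classify on all (standardised) features.
acc_plain = knn_accuracy(train_s, train_y, test_s, test_y)

# Approach 2: reduce to 3 principal components first, then classify.
comps = top_components(covariance(train_s), k=3)
train_p, test_p = project(train_s, comps), project(test_s, comps)
acc_pca = knn_accuracy(train_p, train_y, test_p, test_y)

print(f"KNN accuracy, all features: {acc_plain:.2f}")
print(f"KNN accuracy, PCA + KNN:    {acc_pca:.2f}")
```

The scaler and the principal directions are fitted on the training split only and then applied to the held-out split, so the comparison between the two approaches is not inflated by information from the test rows.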
Conclusion: This article presents a comparative analysis of the results obtained with and without
PCA and LDA for the same set of classification methods, with respect to performance assessment.
It concludes that both PCA and LDA are useful for removing insignificant features,
decreasing cost and computation time while improving ROC and accuracy. The methodology
may similarly be applied to other diseases.
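The two performance measures named in the conclusion can be illustrated with a short sketch. It assumes that "ROC" refers to the area under the ROC curve, computed here as the fraction of positive/negative pairs that the classifier ranks correctly (ties count half). The labels and scores are made-up values, and the function names are hypothetical.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the label."""
    hits = sum(int((s >= threshold) == bool(t)) for t, s in zip(labels, scores))
    return hits / len(labels)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 1 = diabetic, 0 = non-diabetic, scores = predicted probability.
labels = [0, 0, 1, 1, 1]
scores = [0.1, 0.6, 0.35, 0.8, 0.9]

print(accuracy(labels, scores))  # 0.6
print(roc_auc(labels, scores))   # 5 of 6 pairs ranked correctly
```

Accuracy depends on the chosen decision threshold, whereas the ROC area summarises ranking quality across all thresholds, which is why the two measures can differ for the same classifier.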