
Recent Advances in Computer Science and Communications


ISSN (Print): 2666-2558
ISSN (Online): 2666-2566

General Research Article

The Kernel Rough K-Means Algorithm

Author(s): Wang Meng*, Dui Hongyan, Zhou Shiyuan, Dong Zhankui and Wu Zige

Volume 13, Issue 2, 2020

Page: [234 - 239] Pages: 6

DOI: 10.2174/2213275912666190716121431



Background: Clustering is one of the most important data mining methods. K-means (also called c-means) and its derivatives have been a focus of clustering research in recent years. Clustering methods can be divided into two categories according to how they handle uncertainty: hard clustering and soft clustering. Within k-means research, Hard C-Means clustering (HCM) is a hard clustering method, while Fuzzy C-Means clustering (FCM) is a soft clustering method. Linearly inseparable data remain a major challenge for clustering and classification algorithms, and further improvement is required in the big data era.

Objective: The Rough K-Means (RKM) algorithm, which is based on fuzzy roughness, is also a hot topic in current research. Rough set theory and fuzzy theory are both powerful tools for describing uncertainty and are the same in essence. Therefore, RKM can be kernelized following the approach used in Kernel Fuzzy C-Means (KFCM). In this paper, we put forward a Kernel Rough K-Means algorithm (KRKM) that enables RKM to handle nonlinear problems. KRKM extends the ability of RKM to process complex data and addresses the uncertainty problem of soft clustering.
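The idea of kernelizing a distance-based clustering method can be illustrated with the kernel trick. The sketch below is only an illustration under stated assumptions: the abstract does not say which kernel the authors use, so a Gaussian kernel is assumed here, and the names `gaussian_kernel`, `kernel_sq_distance`, and `sigma` are hypothetical. It computes the squared distance between two points in the implicit feature space without ever constructing the feature map.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel: K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).

    The choice of kernel and the bandwidth `sigma` are assumptions made for
    this sketch; they are not taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_sq_distance(x, v, kernel=gaussian_kernel):
    """Squared feature-space distance ||phi(x) - phi(v)||^2 via the kernel trick.

    Expanding the squared norm gives K(x, x) - 2 K(x, v) + K(v, v), so only
    kernel evaluations are needed, never phi itself.
    """
    return kernel(x, x) - 2.0 * kernel(x, v) + kernel(v, v)

# Example: distance between a point and a candidate cluster center.
d = kernel_sq_distance((0.0, 0.0), (3.0, 4.0))
```

For a Gaussian kernel, K(x, x) = 1, so this distance simplifies to 2(1 - K(x, v)) and is bounded by 2 regardless of how far apart the points are in the input space.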

Methods: This paper presents the procedure of the Kernel Rough K-Means algorithm (KRKM). Clustering accuracy was then compared using data sets from the UCI repository. The experimental results showed that KRKM achieved higher clustering accuracy than the RKM algorithm.
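The abstract does not spell out the KRKM procedure, so the following is not the authors' algorithm. It is a minimal sketch of one iteration of a rough k-means style method in which the assignment step uses a kernel-induced distance. The assumptions are: a Gaussian kernel, an absolute distance threshold for deciding when a point is ambiguous, centers kept in the input space, and a plain weighted mean of lower-approximation and boundary points for the update. All function and parameter names (`rough_assign`, `update_centers`, `threshold`, `w_lower`, `sigma`) are hypothetical.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel; the kernel choice is an assumption of this sketch."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_sq_distance(x, v, sigma=1.0):
    # For a Gaussian kernel K(x, x) = 1, so ||phi(x) - phi(v)||^2 = 2 * (1 - K(x, v)).
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))

def rough_assign(points, centers, threshold=0.1, sigma=1.0):
    """Split points into per-cluster lower approximations and boundary regions.

    A point whose distance to another center is within `threshold` of its
    nearest center is treated as ambiguous and placed in the boundary of every
    such cluster; otherwise it goes to the lower approximation of the nearest.
    """
    k = len(centers)
    lower = [[] for _ in range(k)]
    boundary = [[] for _ in range(k)]
    for p in points:
        dists = [kernel_sq_distance(p, c, sigma) for c in centers]
        nearest = min(range(k), key=dists.__getitem__)
        close = [j for j in range(k)
                 if j != nearest and dists[j] - dists[nearest] <= threshold]
        if close:
            for j in [nearest] + close:
                boundary[j].append(p)
        else:
            lower[nearest].append(p)
    return lower, boundary

def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def update_centers(lower, boundary, old_centers, w_lower=0.7):
    """Weighted center update (input-space simplification, not kernelized)."""
    new_centers = []
    for low, bnd, old in zip(lower, boundary, old_centers):
        if low and bnd:
            ml, mb = mean(low), mean(bnd)
            new_centers.append(tuple(w_lower * a + (1.0 - w_lower) * b
                                     for a, b in zip(ml, mb)))
        elif low:
            new_centers.append(mean(low))
        elif bnd:
            new_centers.append(mean(bnd))
        else:
            new_centers.append(old)  # empty cluster: keep the previous center
    return new_centers

# Toy data: two tight groups and one point midway between the centers.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (2.5, 2.5)]
centers = [(0.0, 0.0), (5.0, 5.0)]
lower, boundary = rough_assign(points, centers, threshold=0.1, sigma=3.0)
centers = update_centers(lower, boundary, centers)
```

In this toy run the midway point is equidistant from both centers, so it lands in the boundary of both clusters rather than being forced into one, which is the behavior that distinguishes rough clustering from hard clustering. In practice the two steps would be repeated until the centers stop changing.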

Results: The classification precision of both KFCM and KRKM was improved. The classification precision of KRKM was slightly higher than that of KFCM, indicating that KRKM is an attractive alternative clustering algorithm that performs well on nonlinear clustering problems.

Conclusion: Comparison with the precision of the KFCM algorithm showed that KRKM had a slight advantage in clustering accuracy. KRKM is an effective clustering algorithm that can be selected for nonlinear clustering.

Keywords: K-Means, kernel function, rough set, clustering, big data, KRKM.

