Amrita Bhattacherjee1* and Sugata Sanyal2
1Department of Statistics, St. Xavier’s College, Kolkata, India
2Professor (Retired), School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai, India
*Corresponding Author: Amrita Bhattacherjee, Department of Statistics, St. Xavier’s College, Kolkata, India.
Received: November 30, 2021; Published: January 18, 2022
The Fuzzy C-Means clustering technique is one of the most popular soft clustering algorithms in the field of data segmentation. However, its high time complexity makes it computationally expensive, when implemented on very large datasets. Kolen and Hutcheson (2002) [1] proposed a modification of the FCM Algorithm, which dramatically reduces the runtime of their algorithm, making it linear with respect to the number of clusters, as opposed to the original algorithm which was quadratic with respect to the number of clusters. This paper proposes further modification of the algorithm by Kolen., et al. by suggesting effective seed initialisation (by Fuzzy C-Means++, proposed by Stetco., et al. [2]) before feeding the initial cluster centers to the algorithm. The resultant model converges even faster. Empirical findings are illustrated using two synthetic and two real-world datasets.
Keywords: Clustering; Fuzzy Partitions; Time Complexity; Fuzzy C-Means Algorithm; Unsupervised Machine Learning
Citation: Amrita Bhattacherjee and Sugata Sanyal. “Reducing Time Complexity of the Fuzzy C-Means Algorithm: Case Studies". Acta Scientific Computer Sciences 3.2 (2022): 23-33.
Copyright: © 2022 Amrita Bhattacherjee and Sugata Sanyal. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.