Research Article

COIN: Correlation Index-Based Similarity Measure for Clustering Categorical Data

Table 1

Some recently published literature on clustering algorithms.

| Year | Contribution | Author(s) |
| --- | --- | --- |
| 2015 | Strengths and weaknesses of clustering algorithms were discussed. | Xu and Tian [21] |
| 2016 | An unsupervised clustering methodology for analyzing large datasets was proposed. The general clustering problem was converted into a community detection problem on a graph by using the Jensen-Shannon distance. This dissimilarity measure is a graph-theoretic concept for the generation and analysis of proximity graphs. | Naeni et al. [22] |
| 2016 | Various clustering algorithms applicable to gene expression data were reviewed. Clustering algorithms are a good method for understanding the natural structure inherent in gene expression data, gene functions, cellular processes, subtypes of cells, etc. | Oyelade et al. [23] |
| 2017 | A new clustering algorithm for processing data streams with correlated components was proposed. Covariance matrices were estimated via an optimal double shrinkage method, which provided positive definite estimates even in the presence of few data points or small variance. | Aletti and Micheletti [24] |
| 2017 | A novel niche genetic algorithm (NGA) with density and noise was used for K-means clustering and named NoiseClust. | Zhou et al. [25] |
| 2019 | Assuming normally distributed data, nine well-known clustering methods available in the R language were compared and reviewed. | Rodriguez et al. [26] |
| 2019 | A routing protocol was proposed that uses the k-means clustering algorithm to make next-hop selection decisions. | Sharma et al. [27] |
| 2019 | A multifeature trajectory similarity measure was used in trajectory clustering to maximize the similarity of trajectories in the same cluster. | Yu et al. [28] |
| 2020 | An unsupervised k-means (U-k-means) clustering algorithm was proposed that requires no initialization or parameter selection and can automatically find an optimal number of clusters. | Sinaga and Yang [29] |
| 2021 | The K-means clustering algorithm was applied to partition scholarship recipient data into clusters. | Nur Khomarudin et al. [30] |
| 2021 | The K-means algorithm was applied to five-taxi GPS data to identify hot spots and alleviate urban congestion. | Ran et al. [31] |
| 2021 | A density-based approach was developed to detect moving object clusters. | Shia et al. [32] |
| 2021 | K-means clustering was reported to effectively avoid the subjective negative impact caused by manually set division thresholds. It can optimize financial risk prediction and redistribute the target dataset to each cluster center to obtain an optimized solution. | Zhu and Liu [33] |
| 2021 | A k-nearest neighbor-based approach to the Internet of Vehicles was developed that can optimize traffic. | Wang et al. [34] |