Research Article
Clustering Categorical Data Using Community Detection Techniques
| Input: dataset with data points, each data point has attributes. The number of clusters . |
| Output: clusters of data points . |
| (1) compute pairwise Hamming distances |
| (2) compute CDF for pairwise Hamming distances |
| (3) // estimate distance threshold |
| (4) for do |
| (5) if and then |
| (6) break |
| (7) // clustering |
| (8) |
| (9) for do |
| (10) if then |
| (11) |
| (12) run Louvain method [12] on |
| (13) keep top- communities by size |
| (14) for each cluster do |
| (15) compute the mode of |
| (16) for each remaining data point do |
| (17) assign to the nearest mode , |
| (18) return |
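The steps of the algorithm can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation, and it rests on several assumptions. The distance threshold is passed in as a parameter (`threshold`, a hypothetical name), because the estimation rule in the threshold loop is not recoverable from the listing. Two points are linked when their Hamming distance is at most the threshold, which is also an assumption. Connected components are used as a simple stand-in for the Louvain method [12]. All function names and the toy data are illustrative.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of attribute positions in which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_cdf(points):
    """Empirical CDF of pairwise Hamming distances: d -> share of pairs with distance <= d."""
    counts = Counter(hamming(a, b) for a, b in combinations(points, 2))
    total = sum(counts.values())
    cdf, seen = {}, 0
    for d in sorted(counts):
        seen += counts[d]
        cdf[d] = seen / total
    return cdf


def threshold_graph(points, threshold):
    """Adjacency sets linking points whose distance is <= threshold (assumed rule)."""
    adj = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if hamming(points[i], points[j]) <= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj):
    """Stand-in for Louvain community detection: plain connected components."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        comps.append(sorted(comp))
    return comps


def mode_of(points, members):
    """Attribute-wise most frequent value among the member records."""
    columns = zip(*(points[i] for i in members))
    return tuple(Counter(col).most_common(1)[0][0] for col in columns)


def cluster(points, k, threshold):
    """Keep the k largest communities, then assign leftovers to the nearest mode."""
    comps = sorted(connected_components(threshold_graph(points, threshold)),
                   key=len, reverse=True)
    clusters = [list(c) for c in comps[:k]]
    modes = [mode_of(points, c) for c in clusters]  # modes fixed before assignment
    for comp in comps[k:]:
        for i in comp:
            nearest = min(range(len(modes)),
                          key=lambda c: hamming(points[i], modes[c]))
            clusters[nearest].append(i)
    return [sorted(c) for c in clusters]


# Toy data: three attributes per record (illustrative only).
points = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "medium", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
    ("blue", "huge", "square"),
    ("green", "tiny", "square"),
]
cdf = distance_cdf(points)
clusters = cluster(points, k=2, threshold=1)
print(cdf)
print(clusters)
```

In this toy run the last record has no neighbour within the threshold, so it falls outside the two retained communities and is attached to the cluster whose mode it is closest to. Replacing `connected_components` with a real Louvain implementation would change only the community detection step; the threshold graph, the mode computation and the final assignment stay the same.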