Research Article

Adaptive Mixed-Attribute Data Clustering Method Based on Density Peaks

Algorithm 2: AMDPC_clustering.
   Input: DS (the mixed-attribute dataset)
   Output: CL (cluster label vector)
//Step 1. Load the dataset DS and separate it into the numerical subset Dr, the categorical subset Dc, and the true label subset.
[Dr,Dc,truelabel] = loadseparate(DS);
//Step 2. Calculate the pairwise distances and construct the distance matrix of the mixed-attribute dataset DS according to equation (8).
distmatrix = distamdpc(Dr,Dc);
//Step 3. Calculate the local KNN density ρi of each data point according to equation (4) and the relative distance δi according to equation (2).
rho = kNNrho(distmatrix);
delta = calcdelta(distmatrix);
//Step 4. Run Algorithm 1 to obtain the set Sc of cluster center points, each of which is given a distinct label.
Sc = findClusterCenter(rho,delta);
//Step 5. Assign class labels to the center and non-center points using the original DPC method, according to Sc.
//Step 5.1. Initialize the class label vector CL.
NCLUST = 0;
for i = 1 to number of datapoints
  CL(i) = −1;
end
//Step 5.2. Assign class labels to the center points.
for j = 1 to sizeof(Sc)
  NCLUST = NCLUST + 1;
  CL(Sc(j)) = NCLUST;
end
//Step 5.3. Assign class labels to the non-center points. As in the original DPC method, ordrho lists the point indices in descending order of rho, and nneigh(i) is the nearest neighbor of point i that has a higher density.
for k = 1 to number of datapoints
  if (CL(ordrho(k)) == −1)
    CL(ordrho(k)) = CL(nneigh(ordrho(k))); //assign each non-center point to the cluster of its nearest neighbor with higher local density.
  end
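
The label-assignment part of the listing (Step 5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name dpc_assign_labels is hypothetical, indices are 0-based rather than 1-based, and the densities rho are passed in directly because equation (4) is not reproduced here. The names ordrho and nneigh follow the listing, and the relative distance is computed under the standard DPC definition (distance to the nearest point of higher density), which is assumed to match equation (2).

```python
def dpc_assign_labels(dist, rho, centers):
    """Sketch of DPC label assignment (hypothetical helper, 0-based indices).

    dist    -- n x n symmetric distance matrix (list of lists)
    rho     -- local density of each point (assumed precomputed)
    centers -- indices of the cluster centers (Sc in the listing)
    """
    n = len(rho)
    # ordrho: point indices in descending order of density
    # (sorted() is stable, so ties keep their input order).
    ordrho = sorted(range(n), key=lambda i: -rho[i])

    # nneigh[i]: nearest point among those that precede i in ordrho,
    # i.e. the nearest point of higher density; delta[i] is that distance.
    nneigh = [-1] * n
    delta = [0.0] * n
    for pos in range(1, n):
        i = ordrho[pos]
        j = min(ordrho[:pos], key=lambda h: dist[i][h])
        nneigh[i] = j
        delta[i] = dist[i][j]
    # Assumption: the densest point has no denser neighbor, so its delta
    # is set to its largest distance, as is conventional in DPC.
    top = ordrho[0]
    delta[top] = max(dist[top])

    # Step 5.1: initialize all labels to -1 (unassigned).
    labels = [-1] * n
    # Step 5.2: give each center its own cluster label 1..NCLUST.
    for label, c in enumerate(centers, start=1):
        labels[c] = label
    # Step 5.3: visit points from high to low density so that each
    # point's denser neighbor is already labeled when it is reached.
    for i in ordrho:
        if labels[i] == -1 and nneigh[i] != -1:
            labels[i] = labels[nneigh[i]]
    return labels, delta, nneigh


# Toy example: two well-separated 1-D groups with made-up densities.
x = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in x] for a in x]
rho = [1.0, 3.0, 2.0, 1.5, 2.5, 0.5]
labels, delta, nneigh = dpc_assign_labels(dist, rho, centers=[1, 4])
print(labels)  # prints [1, 1, 1, 2, 2, 2]
```

Visiting the points in descending density order is what makes a single pass sufficient: a point's nearest denser neighbor always appears earlier in ordrho, so it is already labeled. The extra guard on nneigh only matters if the densest point is not among the chosen centers, in which case that point stays unlabeled in this sketch.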