Input: DS (the mixed-attribute dataset) |
Output: CL (cluster label vector) |
(1) | //Step 1. Load the dataset DS and separate it into the numerical subset Dr, the categorical subset Dc, and the true label vector. |
(2) | [Dr,Dc,truelabel] = loadseparate(DS); |
(3) | //Step 2. Compute the pairwise distances between the data points of the mixed-attribute dataset DS according to equation (8) and store them in the distance matrix. |
| distmatrix = distamdpc(Dr,Dc); |
(4) | //Step 3. Calculate the local KNN density ρi of each data point according to equation (4) and calculate the relative distance δi according to equation (2). |
(5) | rho = kNNrho(distmatrix); |
(6) | delta = calcdelta(distmatrix); |
(7) | //Step 4. Run Algorithm 1 to obtain the cluster center points Sc, each of which receives a distinct label. |
(8) | Sc = findClusterCenter(rho,delta); |
(9) | //Step 5. Assign class labels to the center and non-center points using the original DPC method, based on Sc. |
(10) | //Step 5.1. Initialize the class label vector CL. |
(11) | NCLUST = 0; |
(12) | for i = 1 to number of datapoints |
(13) | CL(i) = -1; |
(14) | end |
(15) | //Step 5.2. Assign the class label for center points |
(16) | for j = 1 to sizeof(Sc) |
(17) | NCLUST = NCLUST + 1; |
(18) | CL(Sc(j)) = NCLUST; |
(19) | end |
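(20) | //Step 5.3. Assign the class label for non-center points, visiting the points in order of decreasing density. As in the original DPC method, ordrho holds the point indices sorted by descending rho, and nneigh(i) is the nearest point to i with a higher density. |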
(21) | for k = 1 to number of datapoints |
(22) | if (CL(ordrho(k)) == -1) |
(23) | CL(ordrho(k)) = CL(nneigh(ordrho(k))); //assign each non-center point to the cluster of its nearest neighbor with higher local density. |
(24) | end |
(25) | end |
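
The label-propagation part of the procedure (Steps 3 and 5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `delta_and_nneigh` and `assign_labels` are hypothetical, the densities `rho` and the center indices are assumed to be supplied by the caller (the KNN density of equation (4) and Algorithm 1 are not reproduced here), indices are 0-based, and the relative distance follows the rule of the original DPC method (distance to the nearest point of higher density).

```python
import numpy as np

def delta_and_nneigh(dist, rho):
    """Relative distance and nearest higher-density neighbor (original DPC rule).

    dist: (n, n) symmetric distance matrix; rho: (n,) local densities.
    Returns delta, nneigh, and ordrho (indices sorted by descending rho).
    """
    n = len(rho)
    ordrho = np.argsort(-rho, kind="stable")   # densest point first
    delta = np.zeros(n)
    nneigh = np.full(n, -1)                    # -1 marks "no denser neighbor"
    for pos in range(1, n):
        i = ordrho[pos]
        higher = ordrho[:pos]                  # points ranked denser than i
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nneigh[i] = j
    # The densest point has no denser neighbor; use its largest distance.
    delta[ordrho[0]] = dist[ordrho[0]].max()
    return delta, nneigh, ordrho

def assign_labels(centers, nneigh, ordrho):
    """Steps 5.1 to 5.3: label the centers, then propagate by descending density."""
    cl = np.full(len(ordrho), -1)              # Step 5.1: all points unlabeled
    for label, c in enumerate(centers, start=1):
        cl[c] = label                          # Step 5.2: one label per center
    for i in ordrho:                           # Step 5.3: densest points first
        if cl[i] == -1 and nneigh[i] >= 0:
            cl[i] = cl[nneigh[i]]              # inherit from denser neighbor
    return cl
```

Visiting the points in order of decreasing density guarantees that the denser neighbor of each point already carries a label when the point is reached, which is why a single pass suffices. The guard on `nneigh[i]` is an addition of this sketch; it only matters if the densest point is not selected as a center.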