| Input: DS (the mixed-attribute dataset) |
| Output: CL (cluster label vector) |
| (1) | //Step 1. Load the dataset DS and separate it into the numerical subset Dr, the categorical subset Dc, and the true label subset. |
| (2) | [Dr,Dc,truelabel] = loadseparate(DS); |
| (3) | //Step 2. Calculate the distance and construct the distance matrix of the mixed-attribute dataset DS according to equation (8). |
| | distmatrix = distamdpc(Dr,Dc); |
| (4) | //Step 3. Calculate the local KNN density ρi of each data point according to equation (4) and calculate the relative distance δi according to equation (2). |
| (5) | rho = kNNrho(distmatrix); |
| (6) | delta = calcdelta(distmatrix); |
| (7) | //Step 4. Run Algorithm 1 to obtain the cluster center points Sc, each of which receives a distinct label. |
| (8) | Sc = findClusterCenter(rho,delta); |
| (9) | //Step 5. Assign class labels to the center and non-center points using the original DPC method, based on Sc. |
| (10) | //Step 5.1. Initialize the class label vector CL. |
| (11) | NCLUST = 0; |
| (12) | for i = 1 to number of datapoints |
| (13) | CL(i) = −1; |
| (14) | End |
| (15) | //Step 5.2. Assign the class label for center points |
| (16) | for j = 1 to sizeof(Sc) |
| (17) | NCLUST = NCLUST + 1; |
| (18) | CL(Sc(j)) = NCLUST; |
| (19) | End |
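| (20) | //Step 5.3. Assign class labels to the non-center points. As in the original DPC method, ordrho holds the point indices sorted by descending ρ, and nneigh(i) is the nearest point to i with higher density. |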
| (21) | for k = 1 to number of datapoints |
| (22) | if (CL(ordrho(k)) == −1) |
| (23) | CL(ordrho(k)) = CL(nneigh(ordrho(k))); //assign each non-center point to the cluster of its nearest neighbor with higher local density. |
| (24) | end |
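
The label assignment in Step 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `assign_labels` and the toy data are hypothetical, indices are 0-based, and `ordrho` and `nneigh` are derived here from `rho` and the distance matrix in the way the original DPC method usually does (descending density order, nearest point among those ranked higher). The densities are made-up values rather than outputs of equation (4).

```python
def assign_labels(dist, rho, centers):
    """Assign DPC-style cluster labels given distances, densities and center indices."""
    n = len(rho)
    # Point indices in descending density order (the role of ordrho in the pseudocode).
    ordrho = sorted(range(n), key=lambda i: -rho[i])

    # nneigh[i]: nearest point among those ranked before i in ordrho.
    # Using rank order instead of a strict density comparison keeps ties well defined.
    nneigh = [-1] * n
    for pos in range(1, n):
        i = ordrho[pos]
        nneigh[i] = min(ordrho[:pos], key=lambda j: dist[i][j])

    # Step 5.1: every point starts unlabeled.
    cl = [-1] * n
    # Step 5.2: each center receives its own label 1, 2, ...
    for label, c in enumerate(centers, start=1):
        cl[c] = label
    # Step 5.3: visit points from dense to sparse so the neighbor is always labeled first.
    for i in ordrho:
        if cl[i] == -1:
            if nneigh[i] == -1:
                raise ValueError("the highest-density point must be a cluster center")
            cl[i] = cl[nneigh[i]]
    return cl


# Toy 1-D example: two well-separated groups, hypothetical densities.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(a - b) for b in points] for a in points]
rho = [2, 3, 2, 2, 3, 2]
labels = assign_labels(dist, rho, centers=[1, 4])
print(labels)  # [1, 1, 1, 2, 2, 2]
```

Processing points in descending density order is what makes the single pass sufficient: by the time a point is visited, its higher-density neighbor already carries a label.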