Research on Segmenting E-Commerce Customer through an Improved K-Medoids Clustering Algorithm
Algorithm 1
Implementation procedure of the improved K-medoids algorithm.
Input: dataset Y = {y1, y2, …, yn}, where n is the number of data points, and the number of clusters K.
Step 1: Randomly select one sample from the dataset as the initial clustering center C1.
Step 2: First, calculate the shortest distance D(x) between each sample x and the existing clustering centers. Second, calculate the probability P(x) that a sample is selected as the next clustering center:
$$P(x) = \frac{D(x)^2}{\sum_{y \in Y} D(y)^2}$$
Third, generate a random number Ri in the interval (0, 1) and successively subtract each sample's P(x) from Ri. Finally, the sample at which the difference first becomes less than or equal to 0 is the next clustering center.
Step 3: Repeat Step 2 until K clustering centers are selected.
Step 4: Assign samples. Calculate the Euclidean distance between each remaining data point and every cluster center Ci, then assign each sample to the cluster whose center is nearest.
Step 5: Update the cluster centers. For each cluster, randomly select a non-central point Crandom and replace Ci with Crandom only if the swap reduces the value of the squared-error function.
Step 6: Repeat Step 4 and Step 5 until the cluster centers no longer change or the maximum number of iterations is reached. The cycle then ends and the final clustering result is obtained.
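The procedure in Algorithm 1 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`init_centers`, `assign`, `improved_k_medoids`) and the parameters `max_iter` and `seed` are hypothetical, P(x) is assumed to take the squared-distance form used by k-means++, the "squared difference function" is assumed to be the sum of squared Euclidean distances from each sample to its center, and the samples are assumed not to be all identical.

```python
import numpy as np

def init_centers(Y, k, rng):
    """Steps 1-3: choose k initial centers, returned as row indices into Y."""
    n = len(Y)
    centers = [int(rng.integers(n))]            # Step 1: one random sample
    while len(centers) < k:                     # Step 3: repeat until k centers
        # D(x): distance from each sample to its nearest existing center
        diff = Y[:, None, :] - Y[centers][None, :, :]
        d = np.linalg.norm(diff, axis=2).min(axis=1)
        p = d**2 / np.sum(d**2)                 # assumed form of P(x)
        r = rng.random()                        # random number in [0, 1)
        chosen = int(np.argmax(d))              # fallback against rounding error
        for i in range(n):                      # roulette-wheel selection
            r -= p[i]
            if p[i] > 0 and r <= 0:             # difference first reaches <= 0
                chosen = i
                break
        centers.append(chosen)
    return centers

def assign(Y, centers):
    """Step 4: nearest-center labels and the summed squared distance."""
    diff = Y[:, None, :] - Y[centers][None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return dist.argmin(axis=1), float((dist.min(axis=1) ** 2).sum())

def improved_k_medoids(Y, k, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    centers = init_centers(Y, k, rng)
    labels, cost = assign(Y, centers)
    for _ in range(max_iter):                   # Step 6: iterate
        changed = False
        for j in range(k):                      # Step 5: try one random swap
            members = np.flatnonzero(labels == j)
            candidates = members[members != centers[j]]
            if candidates.size == 0:
                continue
            trial = list(centers)
            trial[j] = int(rng.choice(candidates))
            new_labels, new_cost = assign(Y, trial)
            if new_cost < cost:                 # keep the swap only if cost drops
                centers, labels, cost = trial, new_labels, new_cost
                changed = True
        if not changed:                         # centers no longer change
            break
    return centers, labels, cost
```

Calling `improved_k_medoids(Y, k=2)` on an array of samples returns the indices of the chosen centers, a cluster label per sample, and the final cost. Because each pass tries only one random candidate per cluster, a pass without an accepted swap ends the loop even if a better center exists, so the result is a local optimum that depends on the seed.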