Abstract

Determining the appropriate number of clusters without any prior information is a core problem in cluster analysis. In this paper, we propose a nonparametric clustering method based on different weighted spatial rank (WSR) functions. The main idea behind WSR is to define a dissimilarity measure locally based on a localized version of multivariate ranks. We consider a nonparametric Gaussian kernel weight function. We compare the performance of the method with other standard techniques and assess its misclassification rate. The method is completely data-driven, robust against distributional assumptions, and accurate for the purpose of intuitive visualization, and it can be used both to determine the number of clusters and to assign each observation to its cluster.

1. Introduction

In recent years, there has been a significant advancement in clustering methods in the field of data science and machine learning. Density-based spatial clustering of applications with noise (DBSCAN) has become one of the popular nonparametric clustering algorithms that group together points that are close to each other and are surrounded by low-density areas [1]. Also, hierarchical clustering [2] has been widely used to build a hierarchy of clusters by either merging smaller clusters into larger ones or splitting larger clusters into smaller ones. The K-means clustering [3] remains one of the most popular partitioning methods, where the data are partitioned into K distinct, nonoverlapping clusters.

In addition, spectral clustering [4] has gained popularity due to its ability to use the eigenvectors of the similarity matrix to partition the data into clusters. Affinity propagation [5] assigns each data point to an exemplar, which is a representative point of a cluster, and iteratively updates the exemplars until convergence. Fuzzy clustering [6] assigns each data point a probability of belonging to each cluster, rather than assigning it to a single cluster. The K-medoids algorithm introduced by [7] uses actual data points as representatives or medoids for each cluster, and the medoids selected from the dataset are typically the most centrally located points within a cluster. This makes K-medoids more robust to outliers and noise compared to K-means. Another recent method is distance density clustering (DDC), introduced by [8], a medoid-based clustering approach that takes the density of time series data into account and provides clustering results in a hierarchical fashion.

The clustering by the fast search and find of density peaks (densityClust) method [9] is a density-based clustering method that aims to identify clusters based on the local density of data points and the distances between them, which makes it suitable for datasets with irregularly shaped clusters and varying densities. Furthermore, neighborhood grid clustering (NGC) introduced by [10] is a density-based clustering algorithm that aims to identify clusters based on the local density of data points and their spatial relationships. It utilizes a grid-based approach to partition the data space into cells and then assigns data points to their corresponding cells. The algorithm then determines the cluster assignments by considering the density and spatial proximity of the points within each cell.

These methods have their own strengths and weaknesses and are suitable for different types of data and applications. The choice of the appropriate clustering method depends on various factors such as the size and nature of the data, the desired number of clusters, and the research question being addressed.

Further advancements have been made in the use of spatial ranks to analyse multivariate data, with the development of nonparametric methods being particularly notable [11–13]. They have a number of attractive features, including being distribution-free and easy to compute. Furthermore, the traditional spatial rank function gives information about how central each observation is and its direction in relation to the centre. However, it does not capture the distances between each pair of observations, which are important for cluster analysis.

Recently, Baragilly and Chakraborty [11] utilized spatial ranks as a clustering tool, using a forward search methodology based on nonparametric multivariate spatial rank functions to determine the number of clusters in the data. Their method does not depend on the choice of the initial subsample and has been shown to perform well in different mixture distributions. This work has been extended to clustering functional datasets in medical applications [14].

This paper proposes a novel approach to utilizing spatial ranks for clustering, using a nonparametric weighted spatial ranks function that takes into account the distances between each pair of observations as weights and defines a dissimilarity measure based on spatial ranks. By measuring the distances between each pair of observations instead of their central tendency, it becomes easier to segment a given set of data into a specific number of clusters.

The main idea behind the weighted spatial ranks (WSRs) is to define a dissimilarity measure based on a localized version of spatial ranks, such that the weighted ranks can be used as a classifier and a confirmatory tool to determine the number of clusters and assign each observation to its cluster. Proper selection of a weight function can lead to better identification of clusters, and kernel weights are a popular choice in pattern analysis, classification, cluster analysis, machine learning, and support vector machines.

The paper also demonstrates how weighted spatial ranks may be used for the purpose of visualizing the clusters, so that the number of clusters may be determined using weighted rank contours for a low-dimensional input space after dimension reduction.

Section 2 of the paper introduces the weighted spatial rank function and evaluates its use for different parametric and nonparametric weight functions. Section 3 demonstrates the weighted rank-based clustering algorithm and proposes a confirmatory classifier based on weighted ranks that can be used to assign observations to the most appropriate cluster for two-dimensional data. Section 4 extends the weighted rank-based clustering algorithm to higher-dimensional data, and Section 5 provides numerical examples based on both simulated and real datasets to examine the performance of the proposed algorithm. The algorithm is compared with other clustering methods in Section 6, and concluding remarks are presented in Section 7.

2. Weighted Spatial Rank Functions

In this section, we propose two different weighted spatial rank functions. Suppose that $X$ has a $d$-dimensional distribution $F$, which we assume to be continuous hereon; then the unweighted spatial rank function of the point $\mathbf{x} \in \mathbb{R}^d$ with respect to $F$ can be defined as

$$R_F(\mathbf{x}) = E_F\left[u(\mathbf{x} - X)\right], \qquad (1)$$

where $u(\mathbf{v}) = \mathbf{v}/\lVert\mathbf{v}\rVert$ for $\mathbf{v} \neq \mathbf{0}$, $u(\mathbf{0}) = \mathbf{0}$, and $\lVert\cdot\rVert$ is the Euclidean norm. For a random sample $\mathbf{x}_1, \ldots, \mathbf{x}_n$ from $F$ with empirical distribution function $F_n$, the corresponding empirical spatial rank is $R_{F_n}(\mathbf{x}) = n^{-1}\sum_{i=1}^{n} u(\mathbf{x} - \mathbf{x}_i)$.

The first weighted spatial rank function of the point $\mathbf{x}$ with respect to $F_n$ is a vector function and can be defined as

$$R_{W_1}(\mathbf{x}; F_n) = \frac{\sum_{i=1}^{n} w(\mathbf{x}, \mathbf{x}_i)\, u(\mathbf{x} - \mathbf{x}_i)}{\sum_{i=1}^{n} w(\mathbf{x}, \mathbf{x}_i)}, \qquad (2)$$

where $w(\cdot, \cdot) \ge 0$ is a weight function.

The second weighted spatial rank function can be defined as

$$R_{W_2}(\mathbf{x}; F_n) = \frac{1}{n}\sum_{i=1}^{n} w(\mathbf{x}, \mathbf{x}_i)\, u(\mathbf{x} - \mathbf{x}_i), \qquad (3)$$

and their $L_2$ norms are

$$\lVert R_{W_1}(\mathbf{x}; F_n)\rVert \qquad (4)$$

and

$$\lVert R_{W_2}(\mathbf{x}; F_n)\rVert. \qquad (5)$$

Note that the difference between (2) and (3) lies in the scale (denominator), which in (2) is the sum of the weights $\sum_{i=1}^{n} w(\mathbf{x}, \mathbf{x}_i)$ and is therefore data dependent, whereas in (3) it is $n$, which is data independent.

Kernel weight functions are often used in nonparametric estimation and, as already indicated, in a range of classification and pattern recognition problems. Here, we consider the Gaussian kernel weight, which is one of the most commonly used nonparametric kernel weight functions; it is defined as follows:

$$w(\mathbf{x}, \mathbf{x}_i) = \exp\left(-\frac{\lVert\mathbf{x} - \mathbf{x}_i\rVert^2}{2\sigma^2}\right), \qquad (6)$$

where $\sigma > 0$ is a bandwidth parameter, $\lVert\cdot\rVert$ is the Euclidean norm, and $u(\mathbf{v}) = \mathbf{v}/\lVert\mathbf{v}\rVert$ is the direction of the $d$-dimensional vector $\mathbf{v}$ (for comprehensive details on kernel weights, see Souza [15]).
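The following is a minimal sketch, not the authors' reference implementation, of the empirical weighted spatial rank functions in (2)–(5) with the Gaussian kernel weight in (6). The helper names (`unit`, `wsr1`, `wsr2`) and the bandwidth `sigma` are illustrative assumptions.

```python
import numpy as np

def unit(v, eps=1e-12):
    """Direction u(v) = v / ||v||, with u(0) taken to be the zero vector."""
    norm = np.linalg.norm(v)
    return v / norm if norm > eps else np.zeros_like(v)

def gaussian_weight(x, xi, sigma=1.0):
    """Gaussian kernel weight based on the Euclidean distance between x and xi, as in (6)."""
    return np.exp(-np.sum((x - xi) ** 2) / (2.0 * sigma ** 2))

def wsr1(x, data, sigma=1.0):
    """First weighted spatial rank (2): weighted directions scaled by the sum of the weights."""
    w = np.array([gaussian_weight(x, xi, sigma) for xi in data])
    dirs = np.array([unit(x - xi) for xi in data])
    return (w[:, None] * dirs).sum(axis=0) / w.sum()

def wsr2(x, data, sigma=1.0):
    """Second weighted spatial rank (3): weighted directions scaled by the sample size n."""
    w = np.array([gaussian_weight(x, xi, sigma) for xi in data])
    dirs = np.array([unit(x - xi) for xi in data])
    return (w[:, None] * dirs).sum(axis=0) / len(data)

def wsr1_norm(x, data, sigma=1.0):
    """L2 norm (4), used for the contour plots."""
    return np.linalg.norm(wsr1(x, data, sigma))

def wsr2_norm(x, data, sigma=1.0):
    """L2 norm (5), used for the contour plots and the classifier."""
    return np.linalg.norm(wsr2(x, data, sigma))
```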

3. Weighted Spatial Ranks Clustering Algorithm

We now introduce the weighted spatial ranks clustering algorithm, starting with the bivariate case (d = 2) before considering the higher dimensional case (d > 2).

3.1. Weighted Spatial Ranks Clustering Algorithm for a Bivariate Case (d = 2)
(1) Let $\mathbf{x}_1, \ldots, \mathbf{x}_n$ be a random sample with two variables $X_1$ and $X_2$, and let $G = G_1 \times G_2$ be the Cartesian product of two equally spaced sets $G_1$ and $G_2$, so that each $\mathbf{g} \in G$ is a two-dimensional vector.
(2) For each $\mathbf{g} \in G$, we calculate the weighted spatial rank norm with respect to the sample as

$$\lVert R_{W}(\mathbf{g}; F_n)\rVert = \left\lVert \frac{1}{n}\sum_{i=1}^{n} w(\mathbf{g}, \mathbf{x}_i)\, u(\mathbf{g} - \mathbf{x}_i) \right\rVert, \qquad (7)$$

where $w(\mathbf{g}, \mathbf{x}_i)$ is the Gaussian kernel weight in (6) and $u(\mathbf{g} - \mathbf{x}_i) = (\mathbf{g} - \mathbf{x}_i)/\lVert\mathbf{g} - \mathbf{x}_i\rVert$.
(3) Plot the weighted spatial ranks contour based on $G$ and $\lVert R_{W}(\mathbf{g}; F_n)\rVert$, and then determine the number of clusters $K$ from the contour lines (a code sketch of steps (1)–(3) is given after this list).
(4) Based on the contour lines, specify the assigned observations in each cluster. A lower contour level can be used for better visualization.
(5) Use the weighted spatial rank classifier's rule defined in Section 3.2 to confirm the assignment of each observation and allocate the unassigned observations to the proper cluster.
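As a rough illustration of steps (1)–(3), the sketch below evaluates the weighted spatial rank norm over a rectangular grid and draws the contour from which the number of clusters is read off; the grid resolution, bandwidth, and contour levels are arbitrary illustrative choices, and `wsr2_norm` refers to the helper sketched in Section 2.

```python
import numpy as np
import matplotlib.pyplot as plt

def wsr_contour(data, wsr_norm, sigma=1.0, grid_size=60, levels=15):
    """Evaluate ||WSR|| over a Cartesian grid spanning the data and plot its contour."""
    g1 = np.linspace(data[:, 0].min(), data[:, 0].max(), grid_size)
    g2 = np.linspace(data[:, 1].min(), data[:, 1].max(), grid_size)
    G1, G2 = np.meshgrid(g1, g2)                 # equally spaced Cartesian product grid
    Z = np.empty_like(G1)
    for i in range(grid_size):
        for j in range(grid_size):
            g = np.array([G1[i, j], G2[i, j]])   # two-dimensional grid point
            Z[i, j] = wsr_norm(g, data, sigma)   # weighted spatial rank norm at g
    plt.contour(G1, G2, Z, levels=levels)
    plt.scatter(data[:, 0], data[:, 1], s=8, c="k")
    plt.show()
    return G1, G2, Z
```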
3.2. Weighted Spatial Ranks Classifier

Suppose that we have $k$ groups with distributions $F_1, F_2, \ldots, F_k$; then, based on the second WSR function in (3), we can assign a $d$-dimensional observation vector $\mathbf{x}$ to the $j$-th group if

$$\lVert R_{W_2}(\mathbf{x}; F_j)\rVert = \max_{1 \le i \le k} \lVert R_{W_2}(\mathbf{x}; F_i)\rVert, \qquad (8)$$

where $j \in \{1, 2, \ldots, k\}$. Note that, had we used $R_{W_1}$ instead, then since $\lVert R_{W_1}(\mathbf{x}; F_i)\rVert$ increases outward from the spatial median, we would have the rule

$$\lVert R_{W_1}(\mathbf{x}; F_j)\rVert = \min_{1 \le i \le k} \lVert R_{W_1}(\mathbf{x}; F_i)\rVert. \qquad (9)$$

So, after determining the number of clusters using the WSR contour, (8) may be used to assign each observation to the most suitable cluster.
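A hedged sketch of the rule in (8): each observation is placed in the group whose fitted $\lVert R_{W_2}\rVert$ is largest. The helper `wsr2_norm` is the one sketched in Section 2, and the groups are assumed to be given as arrays of the observations already assigned to each cluster from the contour step.

```python
import numpy as np

def wsr_classify(x, groups, wsr2_norm, sigma=1.0):
    """Assign x to the cluster with the largest second weighted spatial rank norm (rule (8))."""
    scores = [wsr2_norm(x, g, sigma) for g in groups]   # one ||WSR2|| value per candidate cluster
    return int(np.argmax(scores))                       # index of the most suitable cluster
```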

3.3. Confirmatory Analysis Based on the Weighted Spatial Ranks Clustering Algorithm

In order to assess the weighted spatial rank functions' performance, we compare them with other standard methods such as Euclidean distances, Mahalanobis distances, spatial ranks, and spatial depth. A simulation study is used to assess the performance of the proposed weighted ranks-based clustering in defining the group structure in multivariate data. The simulated data are sampled from a bivariate normal mixture distribution that is assumed to cluster into two groups, with mixture proportion p = 0.3 and sample size n = 1000, such that $\mathbf{x}_1, \ldots, \mathbf{x}_n$ is a random sample from

$$p\, N_2(\boldsymbol{\mu}_1, \Sigma_1) + (1 - p)\, N_2(\boldsymbol{\mu}_2, \Sigma_2), \qquad (10)$$

where $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ are the component mean vectors and $\Sigma_1$ and $\Sigma_2$ are the corresponding covariance matrices.
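A minimal sketch of this simulation design is given below; the particular mean vectors and covariance matrices are illustrative placeholders, not the values used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 0.3
mu1, mu2 = np.array([0.0, 0.0]), np.array([5.0, 5.0])   # assumed means, for illustration only
cov = np.eye(2)                                          # assumed common covariance matrix
in_first = rng.random(n) < p                             # True -> component 1 (proportion p)
sample1 = rng.multivariate_normal(mu1, cov, n)
sample2 = rng.multivariate_normal(mu2, cov, n)
data = np.where(in_first[:, None], sample1, sample2)     # bivariate normal mixture sample
```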

All the contour plots that follow are derived from a random sample of 1000 observations simulated from the bivariate normal mixture distribution in (10). Figure 1 shows the contour plots of the Euclidean distances, Mahalanobis distances, spatial ranks, and the spatial depth. The figure clearly shows that the contours produced fail to map the shape of the two-cluster structure in the bivariate mixture distribution.

In Figure 2, $\lVert R_{W_1}\rVert$ (defined in (4)) and $\lVert R_{W_2}\rVert$ (defined in (5)) are used to derive the contour plots based on the nonparametric Gaussian kernel weight function defined in (6). In general, Figure 2 reveals that, compared with Figure 1, the contours produced from both $\lVert R_{W_1}\rVert$ and $\lVert R_{W_2}\rVert$ based on the Gaussian kernel function capture more of the structure of the simulated data. However, the contour based on $\lVert R_{W_1}\rVert$ failed to detect some of the observations that are not close to either of the clusters. These undetected observations are indicated by some lines between the two clusters, suggesting the potential presence of a third cluster. In contrast, it is clear that the cluster structure of the data is better defined by using $\lVert R_{W_2}\rVert$ than by the $\lVert R_{W_1}\rVert$ contour in Figure 2(a).

This is because the scale of $R_{W_1}$ in (2) is the sum of the weights, so that $\lVert R_{W_1}\rVert$ behaves like the norm of the ordinary spatial rank and its values increase outward from the centre, or spatial median, of the cluster. Thus, unassigned points may be assigned to the cluster which results in the lowest weighted rank for the point. In contrast, the contour derived from the second normed weighted spatial rank function, $\lVert R_{W_2}\rVert$, defined in (5), is based on values that decrease outward from the spatial median. Thus, the larger the weighted spatial rank value for an individual point, the closer it is to the centre of the cluster.

In summary, $\lVert R_{W_2}\rVert$ is more successful than $\lVert R_{W_1}\rVert$ at capturing the cluster structure of the simulated data. Overall, the most accurate contour plot is obtained when $\lVert R_{W_2}\rVert$ is used with Gaussian kernel weights.

4. Weighted Spatial Ranks Based Clustering Algorithm for Higher Dimensions (d > 2)

For real-world datasets, we often have to analyse complex multidimensional data, and this makes data visualization and computation more complicated. In such cases, a dimension reduction strategy may be employed, and here we use principal component analysis (PCA) to reduce the dimensionality of the data to two dimensions in order to derive a contour plot (see [16]).

The main idea of using PCA is to find a lower-dimensional subspace that captures most of the variance in the data. Specifically, it involves finding the orthogonal rotation of the axes that maximises the variance. For a $d$-dimensional random variable $X = (X_1, X_2, \ldots, X_d)^{\top}$ with covariance matrix $\Sigma$, let $C_j = \mathbf{a}_j^{\top} X$, where $\mathbf{a}_j$ is a $d$-dimensional vector of constants. Since the variance of $C_j$ is $\mathbf{a}_j^{\top}\Sigma\,\mathbf{a}_j$, finding the first principal component requires finding

$$\max_{\mathbf{a}}\ \mathbf{a}^{\top}\Sigma\,\mathbf{a}.$$

It is necessary to constrain $\mathbf{a}$ to have unit length ($\mathbf{a}^{\top}\mathbf{a} = 1$) to ensure finite values. This may be solved by using the method of Lagrange multipliers, so that, for a Lagrange multiplier $\lambda$, the problem reduces to solving the eigen equation

$$\Sigma\mathbf{a} = \lambda\mathbf{a}.$$

It also means that, for the component which yields the largest eigenvalue $\lambda_1$, $C_1 = \mathbf{a}_1^{\top} X$ has the largest variance and $\mathbf{a}_1$ is the corresponding eigenvector. It is also straightforward to show that the $\mathbf{a}_j$ for $j = 1, \ldots, d$ are orthogonal. Thus, for $A$ the matrix of eigenvectors, the matrix of principal component scores $C$ is given by $C = XA$. This represents a rigid rotation of the $X$ axes to an orientation of maximum variance. Thus, the first principal component $C_1$ has the largest variance for $X$, the second component $C_2$ has the second largest variance, and so on (for more details on PCA, see [17]).
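As a small numerical check of this construction (not part of the original exposition), the eigendecomposition of the sample covariance matrix gives the rotation $A$ and the scores $C = XA$ once the data have been mean-centred; the simulated input below is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.multivariate_normal(np.zeros(4), np.diag([4.0, 3.0, 1.0, 0.5]), size=200)
Xc = X - X.mean(axis=0)                         # centre the data
cov = np.cov(Xc, rowvar=False)                  # sample covariance matrix
eigvals, A = np.linalg.eigh(cov)                # solves  cov @ a = lambda * a
order = np.argsort(eigvals)[::-1]               # reorder by decreasing eigenvalue (variance)
eigvals, A = eigvals[order], A[:, order]
C = Xc @ A                                      # principal component scores
explained = eigvals / eigvals.sum()             # proportion of total variance per component
```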

4.1. Weighted Spatial Ranks Clustering Algorithm When d > 2
(1) Let $\mathbf{x}_1, \ldots, \mathbf{x}_n$ be a $d$-dimensional random sample; then use PCA to obtain the first two components $C_1$ and $C_2$ of that sample and construct the $n \times 2$ matrix $[C_1, C_2]$ consisting of the two components $C_1$ and $C_2$.
(2) Consider $C_1$ and $C_2$ as the new variables and perform the steps of the weighted spatial ranks clustering algorithm for $d = 2$ (see the sketch after this list).
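A hedged sketch of this two-step procedure, reusing the `wsr_contour` and `wsr2_norm` helpers sketched earlier; the PCA step is implemented directly with an eigendecomposition rather than a library routine.

```python
import numpy as np

def wsr_cluster_highdim(X, wsr_contour, wsr2_norm, sigma=1.0):
    """Reduce X to its first two principal components, then run the d = 2 WSR contour step."""
    Xc = X - X.mean(axis=0)
    eigvals, A = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    C = Xc @ A[:, order[:2]]                     # n x 2 matrix of the first two component scores
    return wsr_contour(C, wsr2_norm, sigma=sigma)
```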

5. Numerical Examples

In this section, we apply the weighted spatial ranks-based clustering algorithm to two simulated datasets and three real datasets.

5.1. Simulated Data Examples

In the first simulated data example, we consider a mixture of three quadvariate (4-dimensional) normal distributions, with mixing proportions $p_1$, $p_2$, and $p_3 = 1 - p_1 - p_2$ and sample size $n$, such that $\mathbf{x}_1, \ldots, \mathbf{x}_n$ is a random sample from the 4-dimensional normal mixture distribution

$$p_1\, N_4(\boldsymbol{\mu}_1, \Sigma_1) + p_2\, N_4(\boldsymbol{\mu}_2, \Sigma_2) + p_3\, N_4(\boldsymbol{\mu}_3, \Sigma_3),$$

where $\boldsymbol{\mu}_1$, $\boldsymbol{\mu}_2$, and $\boldsymbol{\mu}_3$ are the component mean vectors and $\Sigma_1$, $\Sigma_2$, and $\Sigma_3$ are the corresponding covariance matrices.

Figure 3(a) shows the scatterplot matrix of the principal components and reveals a mixed picture. It is clear from the component 1 versus component 2 panel that there are 3 clusters. In contrast, the component 2 versus component 3 and the component 2 versus component 4 panels suggest that there are only 2 clusters. Figure 3(b) gives the proportion of the total variance explained by each component; 97% of the total variance is explained by the first two components. Figures 3(c) and 3(d) demonstrate that the weighted spatial ranks contour accurately fits the shape of the three clusters without any misclassification. Finally, in Figures 3(e) and 3(f), the confirmatory plots based on the weighted ranks classifier for the first two components show that the observations have been correctly assigned to the three simulated clusters.

In the second example, we simulate a random sample of size n = 100 from a mixture of four 6-dimensional normal distributions with equal mixing proportions p = 0.25, i.e.,

$$0.25\, N_6(\boldsymbol{\mu}_1, \Sigma_1) + 0.25\, N_6(\boldsymbol{\mu}_2, \Sigma_2) + 0.25\, N_6(\boldsymbol{\mu}_3, \Sigma_3) + 0.25\, N_6(\boldsymbol{\mu}_4, \Sigma_4),$$

with mean vectors $\boldsymbol{\mu}_1, \ldots, \boldsymbol{\mu}_4$ and covariance matrices $\Sigma_1, \ldots, \Sigma_4$.

From the scatter plot matrix in Figure 4(a), we can see that, whilst there are four clear clusters in the panels plotting component 1 against the other components, the number of clusters is less clear in the remaining panels. However, it is clear from Figure 4(b) that the first two components explain the majority (98%) of the variance.

The weighted spatial ranks contour plots shown in Figures 4(c) and 4(d) clearly reveal the shape of the four clusters, where, in the latter, a lower contour level of 0.001 has been used. Finally, the confirmatory plots based on the weighted spatial ranks classifier for the first two components and the original data, shown in Figures 4(e) and 4(f), demonstrate the correct assignment of the simulated observations to the right clusters.

5.2. Real Datasets’ Examples

In this subsection, the algorithm is applied to three real datasets: the iris data [18], financial data [19], and Old Faithful geyser data [20, 21]. The iris dataset consists of three different types of irises (Setosa, Versicolour, and Virginica). However, most clustering techniques consider there to be two groups, since iris Virginica and iris Versicolour are not separable without the species information that Fisher used. As we can see from Figure 5(c), the weighted ranks contour based on the first two components, which explain 97.8% of the total variance, indicates two clusters. The confirmatory plots in Figures 5(e) and 5(f) assign all of the observations to two groups.

The second real dataset is the financial data [19], which contain measurements of three variables monitoring the performance of 103 investment funds operating in Italy since April 1996 (Table A.16 of Atkinson et al. [19]). These data include two different kinds of funds (stock funds and balanced funds). From Figure 6(c), the weighted ranks contour of components 1 and 3, which explain 96.4% of the total variance, suggests there are two clusters. Moreover, the confirmatory plots provide a valid assignment of the observations, which is consistent with the two types of funds.

The third dataset, the Old Faithful geyser data, is taken from Azzalini and Bowman [20] and the MASS library of Venables and Ripley [21]. It includes 272 observations on two variables, the waiting time between eruptions and the duration of the eruption, both measured in minutes, for the Old Faithful geyser in Yellowstone National Park, Wyoming, USA. This dataset consists of two apparent clusters, the short and the long eruptions. From Figures 7(b) and 7(c), it can be seen that the weighted ranks contour of the data indicates two clusters with an unassigned observation (number 174). Using the confirmatory classifier shown in Figure 7(d), observation 174 is correctly assigned to the second cluster.

6. Comparison with Other Clustering Methods

The WSRN method determines the number of clusters in a dataset and classifies the data into each of the clusters. In this section, we compare the WSRN method with other clustering and classification methods.

The first method is the model-based clustering “mclust” [22]. This is based on a Gaussian mixture model GMM [23] where the number of clusters corresponds to the model which returns the largest Bayesian information criterion (BIC). The second method is the K-means algorithm combined with the Calinski–Harabasz (CH) index [24]. The number of clusters that returns the highest CH index is selected before applying the K-means [25] algorithm to classify the data.

The third method used as a comparator is the high-dimensional data clustering (HDDC) [26] which is again a clustering method based on the Gaussian mixture model where the BIC is used to select the number of clusters. The fourth method used is the mixture of probabilistic principal component analyses “MixtPPCA” [27] where the number of clusters corresponds to the largest BIC. The fifth method for comparison is the partitioning around medoids “PAM” clustering [28] method, where the number of clusters is estimated based on the optimum average silhouette width [29]. The sixth method used in the comparison is the density-based spatial clustering of applications with noise (DBSCAN), where the number of clusters is estimated using a density-based approach to identify regions of high density in the data and these are considered as clusters [1]. Other methods that have been used in the comparison are KMD: clustering with K-medoids [7], FCM: fuzzy C-means clustering [30], GG: Gath–Geva clustering algorithm [31], DDC: distance density clustering [8], SNN: clustering with shared nearest neighbor clustering [32], and densityClust: clustering by fast search and find of density peaks [9].

Each method was applied to the three real datasets in Section 5. As the external classes are known, the different clustering methods were compared using the purity, entropy, and the misclassification rate. Although the purity and entropy are external validation methods commonly used in classification, they measure the homogeneity of the data in clusters and do not penalize algorithms that identify the incorrect number of clusters. Indeed if each cluster is homogeneous for a particular class, both the purity and entropy will assign a perfect score (1 for purity, 0 for entropy) even if the number of clusters is incorrect. The following misclassification rate does penalize algorithms which identify the incorrect number of clusters.

Let there be $n$ data points with $r$ true classes, $T = \{T_1, T_2, \ldots, T_r\}$, for which the algorithm identifies $k$ clusters, $C = \{C_1, C_2, \ldots, C_k\}$. Let $A = \{1, 2, \ldots, k\}$ and $B = \{1, 2, \ldots, r\}$, and let $n_{ij}$ denote the number of points in cluster $C_i$ that belong to true class $T_j$. The misclassification rate $H$ is defined as

$$H = 1 - \frac{1}{n}\max\left(\sum_{i \in A,\, j \in B} n_{ij}\right),$$

subject to the constraint that if two terms $n_{ij}$ and $n_{tu}$ appear in the sum, then $i = t$ if and only if $j = u$ [14, 33].

Thus, each row and each column of the $k \times r$ matrix $[n_{ij}]$ contribute at most one element to the sum. A consequence of this is that an entry $n_{ij}$ contributes nothing if another entry from the same row or column is one of the terms that maximises the sum in parentheses. Also, when there is only one cluster, the sum contains one term only.
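A hedged sketch of one way to compute $H$ as defined above: build the contingency matrix of clusters against true classes and select at most one entry per row and per column so as to maximise the matched count, here via the Hungarian algorithm in SciPy. The function name is illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def misclassification_rate(true_labels, cluster_labels):
    """H = 1 - (matched count under the best one-per-row, one-per-column selection) / n."""
    true_labels = np.asarray(true_labels)
    cluster_labels = np.asarray(cluster_labels)
    classes = np.unique(true_labels)
    clusters = np.unique(cluster_labels)
    n_ij = np.array([[np.sum((cluster_labels == c) & (true_labels == t))
                      for t in classes] for c in clusters])
    rows, cols = linear_sum_assignment(n_ij, maximize=True)   # optimal cluster-to-class matching
    return 1.0 - n_ij[rows, cols].sum() / len(true_labels)
```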

The adjusted Rand index (ARI) is another commonly used metric for evaluating the performance of clustering algorithms [34]. While H compares clusters based on set matching, ARI assesses clusters by counting point pairs where there is agreement or disagreement. Moreover, ARI takes into account the expected value of the unadjusted Rand index, which is determined by randomly selecting entries in the contingency table with fixed column and row totals.
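For reference, the ARI is available off the shelf, for example in scikit-learn; the toy labels below simply illustrate that it is invariant to relabelling of the clusters.

```python
from sklearn.metrics import adjusted_rand_score

true_labels    = [0, 0, 1, 1, 2, 2]
cluster_labels = [1, 1, 0, 0, 2, 2]      # same partition, different label names
print(adjusted_rand_score(true_labels, cluster_labels))   # 1.0: perfect agreement
```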

Other cluster validity indices can be used to evaluate the goodness of the different clustering structures such as the connectivity index [35], CS index [36], and Sym index [37] (for more extensive details, see [38]).

Table 1 shows the results of the different algorithms applied to the iris dataset, where nine of the twelve methods recorded perfect scores for the purity and entropy, despite only 4 algorithms identifying the correct number of clusters. For these data, the WSR, mclust, and densityClust have the best (lowest) misclassification rate H, and each has perfect entropy, purity, and ARI scores.

For the financial dataset, as shown in Table 2, the WSR and seven other algorithms share the joint lowest misclassification rate. The HDDC algorithm records the best scores for the purity and entropy, but it identifies an incorrect number of clusters. Based on the ARI, the WSR, K-means, MixtPPCA, PAM, KMD, FCM, and GG record the best scores. Finally, for the Old Faithful dataset, Table 3 shows that the WSR algorithm has the joint third lowest H, but the HDDC and GG algorithms are the best across all four metrics for this dataset.

7. Concluding Remarks

In this paper, we have introduced a new clustering method based on weighted spatial ranks. The WSRN algorithm is completely data-driven and it both determines the number of clusters and classifies the data. As a nonparametric method, it does not require any assumptions to be made on the underlying distribution(s) of the data. The synthesis of weighted rank contours, based on principal components analysis when the data have more than two dimensions, allows the intuitive visualization of the cluster structure in relation to the distribution of the data points.

We considered nonparametric kernel weights and we introduced WSRN functions based on Gaussian kernel weights. Compared to other standard approaches, the WSRN function based on the Gaussian kernel weights provided the best results in terms of cluster detection and visualization. The weighted rank contours based on Gaussian weights were more accurate and provided the best fit to the shape of the clusters’ structure. They captured each observation carefully and assigned it to the proper group with a minimal probability of misclassification. It also performed competitively with other methods when clustering and classifying the data from three real datasets.

Although the WSRN method is invariant under orthogonal transformations, it is not affine invariant. Using affine invariant ranks has the potential to improve the results if the scales of different clusters are not similar [39]. A further possible extension to the method would be to consider generalizations of the Euclidean norm for estimating the WSRN. Thus, different $L_p$ norms for $p \ge 1$ could be investigated to establish the optimal value of $p$.

Data Availability

The data used to support the findings of the study are available in the public domain and are appropriately referenced in this article.

Disclosure

This paper is a modified version of part of the published work in MB’s Thesis (Baragilly [40]).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank Biman Chakraborty for his helpful discussions and suggestions relating to this work, and BHW and MB were supported by a Clinician Scientist award with the Medical Research Council, UK (MR/N007999/1).