Abstract
The diagnosis and treatment of epilepsy are a significant direction for both machine learning and brain science. This paper proposes a fast enhanced exemplar-based clustering (FEEC) method for incomplete EEG signals. By processing the mostly complete data in a first stage, FEEC compresses the potential exemplar list and reduces the pairwise similarity matrix; the few incomplete data are then added to the exemplar list. A new compressed similarity matrix is constructed, and the scale of this matrix is greatly reduced. Finally, FEEC optimizes the new target function by the enhanced α-expansion move method. In addition, because it is built on pairwise relationships, FEEC also improves the generalization of the algorithm. The performance of the proposed clustering algorithm is comprehensively verified, in comparison with other exemplar-based models, by experiments on two datasets.
1. Introduction
Epilepsy is a common disease of the nervous system, characterized by sudden brain dysfunction. Nowadays, most diagnoses of epilepsy are based on clinical experience and the analysis of electroencephalogram (EEG) signals. Although there are many other neuroimaging modalities for recognizing brain activity, EEG signals have a high temporal resolution, down to the millisecond level, and the acquisition equipment is inexpensive, portable, and noninvasive. Compared with manual diagnosis, machine learning methods are less time-consuming and more consistent [1–6]. Specifically, many machine learning methods, such as support vector learning [7, 8], the Takagi–Sugeno–Kang (TSK) fuzzy system [9, 10], and naïve Bayes [11], have been applied.
Since brain activity is a nonlinear, unstable, and networked complex system, the EEG signals we acquire are usually complicated. In particular, some EEG signals are complete while others may miss some features, that is, they are incomplete. Therefore, recognition of epilepsy based on machine learning models is more promising than clinical diagnosis that depends on experience alone. Moreover, EEG signals are high-dimensional and stochastic, which limits the performance of most existing clustering models, such as k-means [11] and fuzzy c-means (FCM) [12]. Both k-means and FCM need the number of clusters to be preset. More specifically, the performance of k-means relies on the initialization of the data, while the interpretability of FCM is limited. Thus, in this paper we focus on the exemplar-based clustering model [13] proposed by Frey, which has the advantages of automatically determining the number of clusters, high efficiency, and not relying on data initialization.
In this paper, we consider the scenario in which EEG signals consist mostly of complete data together with a few incomplete data, as shown in Figure 1. Based on previous work on the recognition of epileptic signals, we propose a novel fast enhanced exemplar-based clustering (FEEC) model for incomplete EEG signals. As shown in Figure 1, different from existing exemplar-based clustering models, FEEC compresses the exemplar list and reduces the pairwise similarity matrix, and then optimizes the target model with the enhanced α-expansion move framework. The contributions of this paper can be highlighted as follows:
(1) We extend the existing exemplar-based clustering algorithm into a fast version by compressing the potential exemplar list. FEEC compresses the number of potential exemplars by processing the mostly complete data in the first stage and then extends the exemplar list with the few incomplete data, so the complexity of FEEC is reduced as well.
(2) Like most existing exemplar-based clustering models, FEEC is built on the pairwise similarity matrix of the data. After compression, FEEC constructs a new, reduced similarity matrix, which improves the generalization of the algorithm.
(3) Moreover, this paper considers the fact that graph cuts based optimization [14] performs better than loopy belief propagation (LBP) based structures [15]. Therefore, the proposed FEEC algorithm optimizes the target model with the enhanced α-expansion move framework [16, 17].
(4) Experimental results on both synthetic and real-world datasets indicate the promising efficiency of the proposed FEEC algorithm.

The rest of this paper is organized as follows. Section 2 introduces some static exemplar-based clustering models. Section 3 describes the proposed FEEC algorithm step by step. Section 4 analyzes the experimental results and compares FEEC with other existing methods. Section 5 concludes the paper.
2. Background
Since EEG signal feature extraction methods and exemplar-based clustering models are two important supporting theories for the FEEC model in this study, we will briefly introduce several feature extraction methods and exemplar-based clustering models in this section.
2.1. Feature Extraction Methods
Original EEG signals have the characteristics of high dimensionality, stochasticity, and nonlinearity. Processing raw EEG signals directly would be computationally very expensive, so many feature extraction methods have been proposed to handle this problem. Broadly, these methods fall into three categories, i.e., time-domain features, frequency-domain features, and time-frequency features.
More specifically, in time-domain analysis, statistical features of the raw EEG signals are analyzed [18]. In frequency-domain analysis, power spectrum analysis and the short-time Fourier transform (STFT) [19, 20] are commonly used. In time-frequency analysis, time- and frequency-domain information is extracted simultaneously from nonstationary EEG signals; wavelet transforms and their improved versions [21, 22] are widely used in EEG signal processing. We utilize kernel principal component analysis (KPCA) to extract features in this paper.
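For illustration, the following is a minimal sketch of this kind of KPCA feature extraction using scikit-learn; the RBF kernel, the number of components, and the random placeholder data are assumptions rather than the settings used in the paper.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

# Placeholder EEG epochs: (n_segments, n_samples); real data would come from the Bonn recordings.
segments = np.random.randn(500, 4097)
# Kernel choice and hyperparameters below are illustrative assumptions.
kpca = KernelPCA(n_components=6, kernel="rbf", gamma=1e-4)
features = kpca.fit_transform(segments)   # reduced (n_segments, 6) feature matrix
```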
2.2. Exemplar-Based Clustering Models
Exemplar-based clustering models select cluster centers, namely, exemplars, from the existing actual data. We focus on exemplar-based clustering models in this paper and briefly introduce affinity propagation (AP) [13] and the enhanced α-expansion move (EEM) [17] in this section; several extended versions for different scenarios are shown in Table 1. The target function defined by an exemplar-based clustering model is equivalent to the minimization of a Markov random field (MRF) energy function. Two optimization strategies have been utilized and evolved into the AP and EEM frameworks accordingly: loopy belief propagation (LBP) [23] is used in AP, while the graph cuts technique [15] is used in EEM.
2.2.1. Affinity Propagation
AP is based on message passing among data points, and its target function is defined as follows:

E(\mathbf{c}) = \sum_{i=1}^{N} s(x_i, x_{c_i}) + \sum_{k=1}^{N} \delta_k(\mathbf{c}),  (1)

where

\delta_k(\mathbf{c}) = \begin{cases} -\infty, & \text{if } c_k \neq k \text{ but } \exists i\colon c_i = k, \\ 0, & \text{otherwise}, \end{cases}  (2)

where X = {x_1, x_2, ..., x_N} is an input dataset and N is the total number of D-dimensional data points. The assignment vector c = (c_1, ..., c_N) is the output of this framework, and the element c_i refers to the exemplar for each x_i; AP seeks the c that maximizes E(c).
According to AP, each point receives availability messages and sends responsibility messages simultaneously, which are defined as follows:

r(i, k) \leftarrow s(i, k) - \max_{k' \neq k} \{ a(i, k') + s(i, k') \},  (3)

a(i, k) \leftarrow \begin{cases} \min\left\{0,\; r(k, k) + \sum_{i' \notin \{i, k\}} \max(0, r(i', k))\right\}, & i \neq k, \\ \sum_{i' \neq k} \max(0, r(i', k)), & i = k, \end{cases}  (4)

where S is the similarity matrix of the data points and is defined as s(i, k) = -\|x_i - x_k\|^2 for i ≠ k. Meanwhile, s(k, k) = p, where p is named the preference in this framework. Moreover, its value should be independent of the data points and can be set to a constant.
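As a concrete illustration of the updates in equations (3) and (4), the following numpy sketch implements damped responsibility/availability passing; the damping value and iteration count are assumptions, and this is a sketch rather than the authors' implementation.

```python
import numpy as np

def ap_messages(S, max_iter=200, damping=0.9):
    """Damped AP message passing on a precomputed similarity matrix S (N x N),
    whose diagonal holds the preferences."""
    N = S.shape[0]
    R = np.zeros((N, N))   # responsibilities r(i, k)
    A = np.zeros((N, N))   # availabilities a(i, k)
    for _ in range(max_iter):
        # r(i, k) <- s(i, k) - max_{k' != k} [a(i, k') + s(i, k')]   (equation (3))
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[np.arange(N), idx]
        AS[np.arange(N), idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[np.arange(N), idx] = S[np.arange(N), idx] - second
        R = damping * R + (1 - damping) * R_new
        # a(i, k) <- min{0, r(k, k) + sum_{i' not in {i, k}} max(0, r(i', k))}   (equation (4))
        Rp = np.maximum(R, 0)
        np.fill_diagonal(Rp, R.diagonal())          # keep r(k, k) in the column sums
        A_new = Rp.sum(axis=0)[None, :] - Rp        # remove point i's own contribution
        diag = A_new.diagonal().copy()              # a(k, k) is not clipped
        A_new = np.minimum(A_new, 0)
        np.fill_diagonal(A_new, diag)
        A = damping * A + (1 - damping) * A_new
    return np.argmax(A + R, axis=1)                 # exemplar index c_i for each point
```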
AP does not require presetting the number of clusters, and its performance is stable. Considering these advantages, many extended versions of AP have been proposed [24, 25]. Specifically, AP defines a fading factor to adjust the iteration speed, and adAP [24] determines this fading factor adaptively. Moreover, several extended AP methods have been proposed to deal with large data and link constraints. For instance, IAPKM, IAPNA, and IAPC [26, 27] employ an incremental strategy, while semisupervised AP (SSAP) [28] concentrates on instance-level constraints. A two-stage fast version of AP (FAP) [29] has also been proposed to improve efficiency. However, although AP has achieved success in various applications, its performance is unsatisfactory when applied directly to incomplete EEG signals.
2.2.2. Enhanced α-Expansion Move
In 2014, Zheng and Chen [17] utilized the enhanced α-expansion move framework to optimize the objective function of exemplar-based clustering models and accordingly proposed the EEM clustering model. In line with the mathematical symbols above, the target function of EEM is defined as follows:

E(\mathbf{c}) = \sum_{i=1}^{N} d(x_i, x_{c_i}) + \sum_{k=1}^{N} \theta_k(\mathbf{c}),  (5)

where

\theta_k(\mathbf{c}) = \begin{cases} \infty, & \text{if } c_k \neq k \text{ but } \exists i\colon c_i = k, \\ 0, & \text{otherwise}, \end{cases}  (6)

and d(x_i, x_{c_i}) denotes the distance between x_i and its exemplar x_{c_i}.
In terms of [17], the α-expansion move algorithm has been proved effective in optimizing the target function in equation (5); specifically, the condition required for its validity, equation (7), can be verified. Furthermore, according to graph theory, in the fast α-expansion move algorithm the expansion range is limited to a single exemplar. To break this limit, the EEM model enlarges the range to the whole exemplar set during optimization and defines a second exemplar for each point as follows:

s_i = \arg\min_{x_{k'} \in E \setminus \{x_k\}} d(x_i, x_{k'}),  (8)

where X_k is the dataset whose exemplar is x_k (with x_i ∈ X_k) and E ∖ {x_k} represents the other exemplars in E except for x_k.
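The second exemplar of equation (8) can be computed directly from a distance matrix; the following small numpy helper (with hypothetical argument names) illustrates the idea.

```python
import numpy as np

def second_exemplars(D, labels, exemplars):
    """For each point i, the best exemplar other than its current one labels[i],
    where D is an N x N distance matrix and `exemplars` lists the exemplar indices."""
    E = np.asarray(exemplars)
    second = np.empty(len(labels), dtype=int)
    for i, k in enumerate(labels):
        others = E[E != k]                      # E \ {current exemplar}
        second[i] = others[np.argmin(D[i, others])]
    return second
```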
The EEM clustering model is a state-of-the-art exemplar-based clustering model and has been proved efficient and effective in numerous scenarios [16, 17, 30]. IEEM [30] handles link constraints by embedding a bound term in the target function. For dynamic data streams, Bi and Wang [16] proposed an incremental EEM version, DSC, which processes data chunk by chunk. However, these methods do not recognize epilepsy well from incomplete EEG signals.
3. Fast Enhanced Exemplar-Based Clustering Model
In this section, the proposed FEEC model will be stated and theoretically analyzed in detail. We first compress the exemplar list and reduce the pairwise similarity matrix, and then the target model is optimized by the enhanced α-expansion move framework.
3.1. Framework
As mentioned in the introduction, we focus on incomplete EEG signals that consist mostly of complete data together with a few incomplete data. To improve the efficiency of the EEM clustering model for these signals, the proposed FEEC framework includes two stages, namely, a compression stage and an optimization stage. As shown in Figure 2, the compression stage compresses the potential exemplar list, and the optimization stage determines the optimal exemplars from this list. Accordingly, the target function can be defined as follows:

E(\mathbf{c}) = \sum_{i=1}^{N} d(x_i, x_{c_i}) + \sum_{k=1}^{N} \theta_k(\mathbf{c}),  (9)

where X = X^{co} ∪ X^{in} is the input dataset consisting of the mostly complete data X^{co} and the few incomplete data X^{in}. The total number of data points is N = N_{co} + N_{in}, where N_{co} and N_{in} are the numbers of complete and incomplete data, respectively. Remember that we only consider the scenario N_{co} ≫ N_{in} in this study. The second term in equation (9) guarantees the validity of the exemplar list; its definition is similar to that of δ_k(c) in equation (2). In the end, E represents the exemplar set in question.

In the compression stage, the number of potential exemplars is reduced by an exemplar-based selection algorithm, namely, the EEM method in this study. To be specific, we apply the EEM model to the mostly complete data to obtain the potential exemplars for these data. FEEC also pulls the few incomplete data into this potential exemplar list and then constructs the compressed similarity matrix. Therefore, after compression, only the pairwise similarities between data points and potential exemplars are preserved. Considering that the FEEC method is built on the pairwise similarity matrix, the following clustering procedure is applied to this compressed similarity matrix. Furthermore, the scale of the similarity matrix is reduced from N × N to N × M, where N and M are the numbers of data points and potential exemplars, respectively.
In the optimization stage, only the similarity relationship between data points and potential exemplars is considered. The new target function after compression is similar to that of other exemplar-based clustering models, such as equations (1) and (5), so both graph cuts and LBP could be applied. Nevertheless, the graph cuts based optimization framework outperforms the LBP structure [31], so the proposed FEEC utilizes the α-expansion move method to optimize the new target function. Moreover, along with EEM, FEEC also expands the expansion move space from a single data point to the second optimal exemplar.
3.2. Compression Stage
In the compression stage, the target function for the complete data can be defined as follows:

E(\mathbf{c}^{co}) = \sum_{i=1}^{N_{co}} d(x_i^{co}, x_{c_i}^{co}) + \sum_{k=1}^{N_{co}} \theta_k(\mathbf{c}^{co}),  (10)

where x_i^{co} is a complete D-dimensional data point and N_{co} is the number of these data. E_{co} is the potential exemplar list for the complete data, and the element c_i refers to the potential exemplar for each x_i^{co}. The optimization framework of other exemplar-based clustering models, such as EEM, can be utilized to solve equation (10). In this paper, we select the graph cuts algorithm instead of a message-passing algorithm to compress the potential exemplar list. Thus, the potential exemplar list E_{co} for the complete data can be determined, and the number of these potential exemplars is denoted M_{co}.
The potential exemplar list after the compression stage would be

E = E_{co} \cup E_{in},  (11)

where E_{in} is the exemplar set for the few incomplete data, which is actually the incomplete data itself. That is to say, E_{in} = X^{in}.
In this stage, we reduce the number of potential exemplars from N to M = M_{co} + N_{in}. In terms of the analysis in [13, 17, 30], the time complexity of this stage is O(N_{co}^2). Compared with the time complexity O(N^2) of applying an exemplar-based clustering model directly, and considering the fact that N_{co} < N, the time complexity of this compression algorithm is acceptable.
Therefore, on the basis of the new exemplar list E after compression, we can construct the new similarity matrix S'; the element S'(i, k) relates to the Euclidean distance between x_i and the potential exemplar e_k, namely, S'(i, k) = \|x_i - e_k\|. The scale of the similarity matrix is reduced from N × N to N × M, where M represents the number of potential exemplars.
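A minimal sketch of this compression stage is given below. It assumes the incomplete points have already been imputed so that distances can be computed, and it uses scikit-learn's AffinityPropagation merely as a stand-in for the EEM exemplar selector applied to the complete data; the helper name and settings are illustrative.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation   # stand-in for the EEM exemplar selector

def compress(X_complete, X_incomplete_filled):
    """Select potential exemplars from the complete data, append the (imputed)
    incomplete points to the exemplar list (equation (11)), and build the reduced
    N x M matrix S'(i, k) = ||x_i - e_k||."""
    ap = AffinityPropagation().fit(X_complete)
    E_co = X_complete[ap.cluster_centers_indices_]      # exemplars from complete data
    E = np.vstack([E_co, X_incomplete_filled])          # E = E_co U E_in
    X = np.vstack([X_complete, X_incomplete_filled])    # all N data points
    S_prime = np.linalg.norm(X[:, None, :] - E[None, :, :], axis=2)
    return X, E, S_prime
```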
3.3. Optimization Stage
After compression, we define the new target function as follows:

E(\mathbf{c}) = \sum_{i=1}^{N} S'(x_i, e_{c_i}) + \sum_{k=1}^{M} \theta_k(\mathbf{c}),  (12)

where S' is the new similarity matrix constructed after compression and c_i ∈ {1, ..., M} indexes the potential exemplars.
In this section, we construct an optimization framework for equation (12). The second term of equation (12) is set to guarantee the validity of the exemplar list; in order to utilize the graph cuts based method, this term should be pairwise [17]. So, θ_k(c) is modified as \sum_{i=1}^{N} \hat{\theta}_{ik}(c_i, c_k). Furthermore, similar to equation (5), we define \hat{\theta}_{ik} as follows:

\hat{\theta}_{ik}(c_i, c_k) = \begin{cases} \infty, & \text{if } c_i = k \text{ and } c_k \neq k, \\ 0, & \text{otherwise}. \end{cases}  (13)
It has been proved that, with the definition of \hat{\theta}_{ik}, equation (12) can be optimized by the enhanced α-expansion method [30]. To improve the efficiency of the framework, this method enlarges the expansion move to the second optimal exemplar.
Before optimization, we explain several symbols involved. First, we define X_k as those data points whose exemplar is e_k and e_α as the current potential exemplar. Then, the enhanced α-expansion move method considers the second optimal exemplar of each x_i ∈ X_k, which is defined as

s_i = \arg\min_{e_{k'} \in E \setminus \{e_k\}} S'(x_i, e_{k'}),  (14)

where E ∖ {e_k} is the potential exemplar list except for e_k.
Apparently, this optimization method should consider two cases, namely, whether e_α is among the current exemplar list or not, as shown in Figures 3 and 4. To be specific, Figure 3 illustrates the case in which e_α is an exemplar, while Figure 4 shows the case in which e_α is not an exemplar. Remember that the expansion move works only when e_α is a potential exemplar. We utilize the concept of "energy reduction" because this method was first used to optimize the Markov random field (MRF) energy function.


In the situation shown in Figure 3, each data point x_i either changes its exemplar to e_α or nothing is changed. Therefore, the energy reduction would be defined as

\Delta E = \sum_{i=1}^{N} \max\left(0, \Delta E_i\right),  (15)

where ΔE_i is the energy reduction when x_i changes its exemplar to e_α and is defined as

\Delta E_i = S'(x_i, e_{c_i}) - S'(x_i, e_\alpha).  (16)
On the other hand, as shown in Figure 4, a new exemplar e_α should be considered. Whether to accept the new exemplar is decided by the energy reduction ΔE, which is discussed next. First, we assume the new exemplar e_α is accepted. In fact, the following procedure is similar to that shown in Figure 3. Specifically, the remaining data points would change their exemplar to either e_α or their second optimal exemplar. For the data in cluster X_k, theoretical analysis shows that only when the exemplar e_k itself changes its exemplar would x_i change its exemplar to s_i. In this case, the energy reduction is defined as follows:

\Delta E_1(X_k) = \sum_{x_i \in X_k} \left[ S'(x_i, e_k) - S'(x_i, s_i) \right].  (17)
Otherwise, some data points in cluster X_k may change their exemplar to e_α; we define these data as X_k^α, and the corresponding energy reduction is defined in the following equation:

\Delta E_2(X_k) = \sum_{x_i \in X_k^{\alpha}} \left[ S'(x_i, e_k) - S'(x_i, e_\alpha) \right].  (18)
Then, the energy reduction for cluster X_k is defined as follows:

\Delta E(X_k) = \max\left\{ \Delta E_1(X_k), \Delta E_2(X_k) \right\}.  (19)
In sum, the new target function equation (12) is optimized, and the optimal exemplar list for the EEG signals is generated.
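To make the optimization stage concrete, the following is a simplified greedy sketch that works on the compressed matrix S': it activates one candidate exemplar (one column) at a time and keeps the move only if the total energy decreases. This is a plain greedy variant under an assumed constant preference, not the exact enhanced α-expansion procedure described above.

```python
import numpy as np

def optimize(S_prime, preference):
    """Greedy exemplar activation on the compressed N x M matrix S_prime.
    Energy = sum_i min over active exemplars of S'(i, k) + preference * |exemplars|."""
    N, M = S_prime.shape

    def energy(active):
        return S_prime[:, active].min(axis=1).sum() + preference * len(active)

    active = [int(np.argmin(S_prime.sum(axis=0)))]        # start from the best single exemplar
    improved = True
    while improved:
        improved = False
        for k in range(M):
            if k not in active and energy(active + [k]) < energy(active):
                active.append(k)                          # accept the energy-reducing move
                improved = True
    labels = np.asarray(active)[np.argmin(S_prime[:, active], axis=1)]
    return active, labels
```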
3.4. Time Complexity and Description
The similarity relationship is measured by the Euclidean distance between data points, defined as d(x_i, x_j) = \|x_i - x_j\| in this study. The proposed FEEC algorithm consists of two stages, namely, the compression stage and the optimization stage. After compression, the scale of the similarity matrix is reduced from N × N to N × M, so the optimization stage has a time complexity of O(NM). Therefore, the complexity of FEEC is promising.
Based on the theoretical analysis above, the proposed FEEC for incomplete data can be summarized as Algorithm 1.
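Putting the two stages together, a hypothetical end-to-end usage (reusing the `compress` and `optimize` sketches above, which are illustrations rather than the authors' Algorithm 1) could look as follows.

```python
import numpy as np

# Placeholder feature matrices standing in for data prepared as in Section 4.1.
X_complete = np.random.randn(160, 6)              # the 80% complete data
X_incomplete_filled = np.random.randn(40, 6)      # the 20% incomplete data after imputation

X_all, E, S_prime = compress(X_complete, X_incomplete_filled)
preference = np.median(S_prime)                   # preference set to the median similarity
active, labels = optimize(S_prime, preference)
print(f"{len(active)} exemplars selected for {X_all.shape[0]} points")
```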
4. Experimental Study
To comprehensively evaluate the proposed FEEC algorithm, we have conducted several experiments on both synthetic and real datasets. We compare our new model with the basic exemplar-based clustering models, namely, AP and EEM, and report four performance indices in this section. In our experiments, all the algorithms were implemented in MATLAB 2010a on a PC with 64-bit Microsoft Windows 10, an Intel(R) Core(TM) i7-4712MQ CPU, and 8 GB of memory.
4.1. Data Preparation
We use the Aggregation dataset [32], shown in Figure 5, and the Bonn EEG signal dataset in this section. The Bonn dataset [9, 10] is from the University of Bonn, Germany (http://epileptologie-bonn.de/cms/upload/workgroup/lehnertz/eegdata.html). The EEG dataset contains five groups (A to E), and each group contains 100 single-channel EEG segments of 23.6 s duration. The sampling rate of all the recordings was 173.6 Hz. Figure 6 shows healthy and epileptic EEG signals from the five groups, and Table 2 lists detailed descriptions of these signals. Table 3 gives a brief description of the datasets. To construct the incomplete data scenario, we randomly choose 80% of the data as complete data and the remaining 20% as incomplete data. We utilize KPCA to extract features from the EEG signals in this section.
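How the 20% incomplete points lose features is not specified above, so the following sketch simply masks a random fraction of each selected point's features with NaN; the 30% missing fraction and the masking scheme are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_incomplete(X, incomplete_ratio=0.2, missing_frac=0.3):
    """Randomly mark `incomplete_ratio` of the points as incomplete and blank out
    `missing_frac` of each such point's features (assumed scheme)."""
    n, d = X.shape
    idx = rng.permutation(n)
    n_in = int(round(incomplete_ratio * n))
    incomplete_idx, complete_idx = idx[:n_in], idx[n_in:]
    X_in = X[incomplete_idx].copy()
    for row in X_in:
        miss = rng.choice(d, size=int(missing_frac * d), replace=False)
        row[miss] = np.nan
    return X[complete_idx], X_in
```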


4.2. Performance Indices
Here, we give the definitions of the three adopted performance indices: ENERGY, NMI, and accuracy. Following the descriptions in [12, 16, 30, 33, 34], we refer to the result output by the involved models as clusters and to the true labels as classes.
4.2.1. ENERGY
Since all the mentioned clustering algorithms are optimized, respectively, by energy functions of the same type, we can compare them in terms of their energy values, defined as follows:

\mathrm{ENERGY} = \sum_{k=1}^{K} \sum_{i=1}^{N_k} d(x_i^k, e_k),  (20)

where e_k denotes the kth exemplar, x_i^k is the ith data point in the kth cluster, N_k is the number of data points in the kth cluster, and d(x_i^k, e_k) is the Euclidean distance between x_i^k and e_k, which can be seen as a measurement of energy.
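A direct computation of equation (20) is shown below (the argument names are hypothetical; `exemplar_idx[k]` is the row index of the kth exemplar).

```python
import numpy as np

def energy_index(X, labels, exemplar_idx):
    """ENERGY: sum of Euclidean distances between each point and its cluster exemplar."""
    return float(sum(np.linalg.norm(X[i] - X[exemplar_idx[labels[i]]])
                     for i in range(len(X))))
```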
4.2.2. NMI
NMI has been widely used to evaluate clustering quality as well, and its value can be calculated by the following equation:

\mathrm{NMI} = \frac{\sum_{i=1}^{K} \sum_{j=1}^{C} n_{ij} \log \frac{N n_{ij}}{n_i n_j}}{\sqrt{\left(\sum_{i=1}^{K} n_i \log \frac{n_i}{N}\right) \left(\sum_{j=1}^{C} n_j \log \frac{n_j}{N}\right)}},  (21)

where n_{ij} is the number of data points shared by the ith cluster and the jth class (i.e., how well clusters fit the classes), n_i is the number of data points in the ith cluster, n_j is the number of data points in the jth class, and N is the total number of data points.
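In practice, NMI can be computed with scikit-learn; note that sklearn's default normalization (the arithmetic mean) may differ slightly from the square-root form in equation (21).

```python
import numpy as np
from sklearn.metrics import normalized_mutual_info_score

true_labels = np.array([0, 0, 1, 1, 2, 2])       # classes (ground truth), toy example
cluster_labels = np.array([1, 1, 0, 0, 2, 2])    # clusters (model output)
nmi = normalized_mutual_info_score(true_labels, cluster_labels)   # 1.0 for this toy case
```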
4.2.3. Accuracy
Accuracy is a more direct measure of the effectiveness of clustering algorithms, which is defined as

\mathrm{Acc} = \frac{\sum_{i=1}^{N} \delta\left(y_i, \mathrm{map}(c_i)\right)}{N},  (22)

where y_i is the real label of the ith data point and c_i is the obtained clustering label. δ(y, c) = 1 if y = c; δ(y, c) = 0 otherwise. The function map(·) maps each obtained cluster to a real class, and the optimal mapping can be found by the Hungarian algorithm.
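Equation (22) can be evaluated with the Hungarian algorithm from SciPy, as in the sketch below; clusters left without a matched class (if the cluster count exceeds the class count) are simply counted as errors.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y_true, y_pred):
    """Acc: best one-to-one mapping between clusters and classes, then count matches."""
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    # cost[i, j] = number of points in cluster i that do NOT belong to class j
    cost = np.array([[np.sum((y_pred == c) & (y_true != k)) for k in classes]
                     for c in clusters])
    row, col = linear_sum_assignment(cost)
    mapping = {clusters[r]: classes[c] for r, c in zip(row, col)}
    mapped = np.array([mapping.get(c, -1) for c in y_pred])   # -1 marks unmapped clusters
    return float(np.mean(mapped == y_true))
```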
The values of NMI and Acc range from 0 to 1, and the closer a value is to 1, the more effective the clustering algorithm. It is worth mentioning that we report percentages in the following tables for better precision. As for the performance index ENERGY, the smaller the value, the better the clustering algorithm.
4.3. Experimental Results and Discussion
The parameters involved in FAP, AP, and EEM are set in line with [13, 17, 29]. The preference is set to the median value of the similarities between data points. We run each algorithm 10 times under the same parameters, and the average results are shown in Table 4. Moreover, the detailed comparisons in terms of the three indices above, NMI, accuracy, and ENERGY, are shown in Figures 7–12 and Table 4, respectively.






By analyzing Figures 7–11 and Table 4 in detail, we can conclude the following:
(1) The proposed FEEC algorithm can cluster data with 80% complete data and 20% incomplete data, and in most cases the performance is very convincing. Specifically, for both Aggregation and the epileptic EEG signals, FEEC performs best in terms of NMI, accuracy, and ENERGY.
(2) As for the computational time, FEEC takes less time than EEM. Thus, with the assistance of the compression stage, the time complexity of FEEC is reduced and the efficiency is improved as well. FEEC has a computational time comparable to that of FAP, and considering the other criteria of FEEC and FAP, the extra time is worthwhile.
(3) The proposed FEEC needs no parameters other than the preference, while the performance of FAP relies heavily on k, which determines the number of nearest exemplars. Accordingly, on the datasets involved in this section, FEEC achieves satisfactory clustering results.
5. Conclusions
The diagnosis and treatment of epilepsy remain a significant direction for both machine learning and brain science. This paper has proposed a fast enhanced exemplar-based clustering (FEEC) method for incomplete EEG signals. The FEEC method includes two stages, namely, compression and optimization. The performance of the proposed clustering algorithm is comprehensively verified by experiments on two datasets.
Although most recognition methods for epilepsy are currently based on EEG signals, researchers are also studying other neuroimaging modalities, such as electrocorticography (ECoG), functional near-infrared imaging (fNIR), functional magnetic resonance imaging (fMRI), positron emission tomography (PET), and magnetoencephalography (MEG). Considering the fact that brain activity is a nonlinear, networked, and unstable complex system, we will focus on multimodal clustering models for these neuroimaging signals in future work.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was supported in part by the 2018 Natural Science Foundation of Jiangsu Higher Education Institutions under grant no. 18KJB5200001, the Natural Science Foundation of Jiangsu Province under grant no. BK20161268, and the Humanities and Social Sciences Foundation of the Ministry of Education under grant no. 18YJCZH229.