Abstract

Atrial fibrillation (AF), one of the most common arrhythmias in clinical practice, is a serious threat to human health. However, AF is difficult to monitor in real time because of its intermittent nature. In recent years, wearable electrocardiogram (ECG) monitoring equipment has flourished in the context of telemedicine thanks to its real-time monitoring and simple operation, providing new ideas and methods for the detection of AF. In this paper, we propose a classification model with low computational cost for the robust detection of AF episodes in ECG signals. The model extracts the RR intervals of the ECG signal and feeds them into an artificial neural network (ANN) for classification, addressing the computational-complexity limitations of traditional wearable ECG monitoring devices. In addition, we compare the proposed classifier with other popular classifiers. The model was trained and tested on the AF Termination Challenge Database and the MIT-BIH Arrhythmia Database. The experiments achieve a highest sensitivity of 99.3%, specificity of 97.4%, and accuracy of 98.3%, outperforming most other methods in the recent literature. Accordingly, we conclude that an ANN using RR intervals as the input feature is a suitable candidate for the automatic classification of AF.

1. Introduction

Atrial fibrillation (AF), a major cardiac arrhythmia in clinical practice, is associated with substantial complications that threaten people’s health [1, 2], such as hypertension, diabetes, heart failure, and cardiovascular disease [3–5]. Real-time AF monitoring has gained much attention not only because AF carries a high mortality rate but also because AF episodes are often too short to capture easily. Hence, an automated real-time AF detection mechanism that can analyze massive amounts of electrocardiogram (ECG) signals is urgently needed to lighten the burden on physicians [6].

The diagnosis of AF is based on the subject’s ECG recording, medical history, and clinical evaluation [7, 8]. The ECG, which carries heart rhythm and physiological information, is therefore an important basis for the automatic, real-time detection and diagnosis of AF [9–11]. Because AF occurs unpredictably and requires timely treatment, more and more wearable devices that can analyze ECG signals are used for the real-time diagnosis of these subjects [12]. However, traditional ECG monitoring equipment has many inadequacies: it is complicated to operate, and it gives insufficient consideration to multiscene applications. Many difficult problems therefore remain in real-time diagnosis. A wearable ECG monitoring device, as a real-time detection device, can collect ECG signals and provide a preliminary diagnosis by using human sensor networks. It can also promote the development of telemedicine. Hence, a real-time AF classification algorithm that can run on wearable ECG monitoring devices is of great significance.

Existing wearable ECG monitoring devices transmit the individual’s ECG parameters to a remote server over a wireless link and work together with various mobile terminals to complete the ECG monitoring task [13]. The flowchart of arrhythmia detection in the wearable device is shown in Figure 1. Feature extraction is an essential part of the analysis of arrhythmia diseases. In terms of feature selection, there are two main methods of AF detection. The first is based on the absence of P waves and the presence of f waves [14]. However, this method requires high-quality signals, which makes it difficult to apply in practice. The second method is based on the RR intervals. The distribution of RR intervals during AF differs from the distribution during normal sinus rhythm [15], and the R wave is the most prominent feature of the ECG and is easy to locate. In recent years, many scholars have studied AF detection algorithms based on RR interval features and have made great progress [16]. Support vector machines (SVM) and linear discriminants (LD) are widely used classification methods in ECG arrhythmia detection. Lin et al. used the RR interval as the input feature and an LD to classify ECG heartbeats [17]. Huang et al. proposed the use of random projection with an SVM and the RR interval to classify ECG heartbeats [18]. In addition to SVM and LD, various other machine learning techniques are used as classifiers. Swapna et al. took a series of samples centered on the R peak of the heartbeat and applied a convolutional neural network (CNN), recurrent structures such as the recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU), and hybrids of CNN and recurrent structures to detect abnormalities automatically [19]. Closest to this article, Afdala et al. proposed the Shannon entropy of the RR intervals as a new feature and used an ANN classifier to detect AF [20].

In terms of diagnostic methods, machine learning is a new and advanced approach to arrhythmia detection developed in recent years. As a traditional model, the ANN is of great significance for accelerating AF detection [17].

The studies above have achieved some success in detecting AF by combining RR intervals with different classifiers, but they have shortcomings: they are computationally complex, and their accuracy does not satisfy current needs. Hence, wearable ECG monitoring devices require a detection method with low computational complexity, high precision, and good stability.

In this paper, our motivation is to build a system with low computational cost and excellent classification performance that detects AF in real time. The choice of features and classifier is the most important part of this system. The feature selected for this experiment is the sequence of RR intervals of the ECG, because the R peak is the most prominent and most robust characteristic of the ECG. We feed the selected feature into a lightweight ANN for classification, choose an appropriate number of hidden neurons and network parameters, and verify the model on the AF Termination Challenge Database and the MIT-BIH Arrhythmia Database. We also compare the results of the ANN model with those of other commonly used models. The results show that the proposed ANN model performs better than the other models for the detection of AF. The contributions of this method include the following:
(i) We propose an ANN model with lower computational complexity that achieves more reliable and higher classification accuracy than some of the models previously proposed for AF detection in wearable ECG monitoring devices.
(ii) We use a rhythm characteristic, the RR intervals, as the input to the network instead of complex traditional data, in order to meet the requirement of lower computational complexity. We achieve the desired accuracy using this feature alone.

The remainder of this paper is organized as follows. Section 2 describes in detail the ANN model based on RR intervals for AF detection. Section 3 presents the experimental methods on the databases and the results. Finally, Section 4 summarizes this study and outlines future work.

2. Material and Methods

2.1. Artificial Neural Network (ANN)

The ANN described by Hecht-Nielsen [21] is a pattern recognition machine learning method, which we use here for the detection of AF. It mimics brain mechanisms through a series of interconnected nodes, just as the brain has many neuronal connections [22]. The model in this study is a feed-forward neural network. Figure 2 shows a block diagram of a three-layer feed-forward neural network consisting of an input layer, a hidden layer, and an output layer.

In this model, the input value x is the RR interval extracted from the ECG, and the number of input nodes equals the number of input parameters. y is the output vector of the hidden layer, and O is the detection result produced by the output layer, either AF or N. The proportionality coefficient in the weight updates reflects the learning rate during training. The weights are the most important element through which an ANN simulates the connections between neurons: one set of weights connects the input layer to the hidden layer (its adjustment formula is given in equation (1)), and W denotes the weights connecting the hidden layer to the output (its adjustment formula is given in equation (2)). The principle behind adjusting the weights is to reduce the error continuously. The offset (bias) value b is a factor associated with the stored information. Using cross-entropy as the performance function facilitates this process. So far, there is no established algorithm for choosing the number of hidden nodes; in most cases it can only be determined through sufficient experience or extensive experiments [23, 24]. We configure our ANN classifier by adjusting its parameters to optimize the classification outcomes.

The activation function used in the hidden layer nodes is the sigmoid function. Equation (3) gives the activation function f, and equation (4) gives the computation of the output layer. The output of the network is compared with a probability threshold (0.5 by default) to decide the class.
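The forward pass and weight adjustment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `V`, `W`, `b_h`, and `b_o`, the window of four RR intervals, the three hidden neurons, the learning rate `eta`, and the plain gradient-descent update on the cross-entropy loss are all choices made for the example.

```python
import math
import random


def sigmoid(z):
    """Logistic activation, assumed here for both hidden and output nodes."""
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, rng):
    """Small random weights; the key names are illustrative, not from the paper."""
    return {
        "V": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b_h": [0.0] * n_hidden,
        "W": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_o": 0.0,
    }


def forward(x, p):
    """Return hidden activations y and output probability o for one input x."""
    y = [sigmoid(sum(v * xi for v, xi in zip(row, x)) + b)
         for row, b in zip(p["V"], p["b_h"])]
    o = sigmoid(sum(w * yj for w, yj in zip(p["W"], y)) + p["b_o"])
    return y, o


def cross_entropy(o, target):
    """Binary cross-entropy for one sample (target is 1 for AF, 0 for N)."""
    return -(target * math.log(o) + (1 - target) * math.log(1.0 - o))


def train_step(x, target, p, eta=0.1):
    """One gradient-descent update of all weights and biases, in place."""
    y, o = forward(x, p)
    delta_o = o - target  # gradient of the loss w.r.t. the output pre-activation
    for j, yj in enumerate(y):
        # Hidden-layer error, computed with the old output weight W[j].
        delta_h = delta_o * p["W"][j] * yj * (1.0 - yj)
        p["W"][j] -= eta * delta_o * yj
        p["b_h"][j] -= eta * delta_h
        for i, xi in enumerate(x):
            p["V"][j][i] -= eta * delta_h * xi
    p["b_o"] -= eta * delta_o


def classify(o, threshold=0.5):
    """Compare the network output with the probability threshold."""
    return "AF" if o >= threshold else "N"


rng = random.Random(0)
params = init_params(n_in=4, n_hidden=3, rng=rng)
x = [0.80, 0.62, 1.05, 0.71]  # hypothetical RR intervals in seconds
loss_before = cross_entropy(forward(x, params)[1], 1)
for _ in range(50):
    train_step(x, 1, params)
loss_after = cross_entropy(forward(x, params)[1], 1)
label = classify(forward(x, params)[1])
```

Repeating `train_step` on a sample labeled AF lowers its cross-entropy loss, which is the "reduce the error continuously" principle in miniature; a real training run would loop over the whole training set and monitor the validation set.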

2.2. Data Collection and Processing

The experimental data used in this paper come from the PhysioNet ECG databases provided by the Massachusetts Institute of Technology, which are internationally recognized standard ECG databases. The data were collected from two different datasets. The heartbeat waveforms of AF patients, recorded for 80 minutes, were taken from the AF Termination Challenge Database, and the normal ECG records were obtained from the MIT-BIH Arrhythmia Database. Each record has an associated annotation file, and normal heartbeats dominate the database. To avoid bias in the experimental results, we choose the same number of RR intervals for the AF and N records. We divide the data into training, validation, and test sets. The validation set is used to determine whether the model has reached sufficient accuracy for a given training set. Without a validation procedure, the model may overfit.
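A minimal sketch of the class balancing and three-way split described above is given below. The function name, the fixed random seed, and the 80/10/10 ratio are assumptions made for illustration; the text does not state the split ratio, although the sample counts reported later (32,000 training and 4,000 test samples) are consistent with roughly that proportion. Note also that a plain shuffle does not by itself keep continuous segments apart, which a real pipeline would need to handle.

```python
import random


def balance_and_split(af, normal, ratios=(0.8, 0.1, 0.1), seed=0):
    """Draw equal numbers of AF and N samples, then split into three sets.

    `af` and `normal` are lists of samples; label 1 marks AF, label 0 marks N.
    The ratios are (train, validation, test) and are an assumption.
    """
    rng = random.Random(seed)
    n = min(len(af), len(normal))  # same number of samples per class
    samples = [(s, 1) for s in rng.sample(af, n)]
    samples += [(s, 0) for s in rng.sample(normal, n)]
    rng.shuffle(samples)

    n_train = round(ratios[0] * len(samples))
    n_val = round(ratios[1] * len(samples))
    train = samples[:n_train]
    val = samples[n_train:n_train + n_val]
    test = samples[n_train + n_val:]
    return train, val, test


# Hypothetical toy data: 10 AF samples and 30 normal samples.
af_samples = [0.40 + 0.03 * i for i in range(10)]
n_samples = [0.80 + 0.01 * i for i in range(30)]
train_set, val_set, test_set = balance_and_split(af_samples, n_samples)
```

The validation portion plays the role described in the text: it is held out from weight updates and used only to judge when training has reached sufficient accuracy.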

Deep learning (DL) models learn directly from data rather than from prior knowledge, so data processing and analysis are crucial, and the functional implementation relies heavily on the data. Using the raw ECG signal directly as the input increases the computational complexity, and the raw signal is affected by power-frequency interference and low-frequency noise, which also has a great impact on the accuracy. We therefore convert the original ECG records in the database to RR intervals instead of using the raw ECG records directly, because RR intervals highlight AF behavior while reducing the computational complexity [12]. Most public ECG datasets have corresponding annotations, and RR intervals are easy to obtain from the R-peak positions [17]. We use the RR intervals as the input, obtained by the following simple processing:

$$RR(n) = R(n+1) - R(n), \quad n = 1, 2, \ldots, N$$

where R(n) denotes the time of the nth R peak, RR(n) is the time difference between the nth and (n + 1)th R peaks, and N is the total number of RR intervals.
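The conversion from annotated R-peak positions to RR intervals can be sketched as below. The function name and the example peak positions are hypothetical; the sketch assumes the annotation file gives R-peak locations as sample indices and that the sampling frequency `fs` of the record is known.

```python
def rr_intervals(r_peak_samples, fs):
    """Return RR intervals in seconds from R-peak sample indices.

    r_peak_samples: increasing sample indices of successive R peaks.
    fs: sampling frequency of the record in Hz.
    """
    return [(later - earlier) / fs
            for earlier, later in zip(r_peak_samples, r_peak_samples[1:])]


# Hypothetical R-peak positions in a record sampled at 360 Hz.
peaks = [0, 360, 648, 1008]
rr = rr_intervals(peaks, fs=360)
```

Four R peaks yield three RR intervals, so a record with N + 1 annotated peaks produces the N intervals used as network input.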

2.3. The Performance Measures

To evaluate the performance of the heart rhythm classification, three performance indicators are used: sensitivity (Se), specificity (Sp), and accuracy (Acc) [25]. They are defined as follows:

$$\mathrm{Se} = \frac{TP}{TP + FN}$$

$$\mathrm{Sp} = \frac{TN}{TN + FP}$$

$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is true positive, FN is false negative, TN is true negative, and FP is false positive. In general, the best test result is the one that maximizes all three indicators [14].

3. Experiment and Results

3.1. Experiment

This section describes the details of the experiments and the results of the network model. The number of hidden layer neurons in an ANN is an important parameter that strongly affects both the classification results and the performance of the algorithm [25]. To find the optimal number of hidden neurons, we fed the prepared dataset into the model and evaluated the sensitivity, specificity, and overall accuracy for each candidate number of hidden layer neurons. Through these experiments, we determined the number of hidden layer neurons and set the other parameters of the ANN, such as the learning rate and the number of hidden layers, to their optimal values, as shown in Table 1. Finally, a separate test set was used to verify the classification results of the model. The test set and the training set did not share any continuous segments of data. The results of this study were analyzed using the confusion matrix, the error distribution histogram, and the ROC curve.

3.2. Results

In this paper, our objective is to use the RR intervals of the ECG signal as the input data and to test the classification accuracy of the ANN model. Table 2 shows the accuracy of the model output when the input dataset is the RR intervals and the number of hidden layer neurons is set to 10, 15, 20, 25, 30, and 35, respectively, with the other ANN parameters at their optimal values. According to this table, when the number of hidden layer neurons is set to 25, the ANN model achieves the highest accuracy of 98.3%, with a sensitivity of 99.3% and a specificity of 97.4%.

The performance of the ANN can also be evaluated with a confusion matrix, an intuitive table that summarizes the predictions of the classification model. The confusion matrices for the training set and the test set are shown in Figure 3. The dark cells of the confusion matrix give the number of correctly classified samples, while the light cells give the number of misclassified samples. In the confusion matrix of the training results, the number of TP is 15956, the number of FP is 409, the number of FN is 114, and the number of TN is 15521. In the test data, the number of TP is 1934, the number of FP is 56, the number of FN is 18, and the number of TN is 1992.
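As a worked example, the sensitivity, specificity, and accuracy can be computed directly from the confusion-matrix counts reported above. The sketch assumes that AF is the positive class; the function and variable names are illustrative.

```python
def rhythm_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Counts as reported for the training and test confusion matrices.
train_metrics = rhythm_metrics(tp=15956, fp=409, fn=114, tn=15521)
test_metrics = rhythm_metrics(tp=1934, fp=56, fn=18, tn=1992)

for name, metrics in (("training", train_metrics), ("test", test_metrics)):
    summary = ", ".join(f"{key} = {value:.4f}" for key, value in metrics.items())
    print(f"{name}: {summary}")
```

With the training counts this gives a sensitivity of about 99.3%, a specificity of about 97.4%, and an accuracy of about 98.4%; with the test counts, about 99.1%, 97.3%, and 98.15%, respectively.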

The ROC curve is often used as an indicator of how well a model fits. A series of true-positive rates and false-positive rates is computed by sweeping the decision threshold over successive values; the true-positive rate is plotted on the ordinate and the false-positive rate on the abscissa. The larger the area under the curve, the higher the diagnostic accuracy. In other words, the closer the curve is to the ideal point in the upper left corner, the better the model fits. The ROC curve for this experiment is shown in Figure 4. According to Figure 5, the classification error becomes stable at 178 epochs.
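The threshold sweep behind an ROC curve can be sketched as follows. The scores and labels are hypothetical toy values, not outputs of the model in this paper, and the area is estimated with the trapezoidal rule; both classes must be present for the rates to be defined.

```python
def roc_points(scores, labels):
    """Return (false-positive rate, true-positive rate) pairs for each threshold.

    scores: network outputs in [0, 1]; labels: 1 for AF, 0 for N.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged as AF
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Hypothetical outputs for three AF samples and three normal samples.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
auc = area_under_curve(curve)
```

In this toy case one normal sample scores higher than one AF sample, so the curve falls slightly short of the upper left corner and the area is 8/9 rather than 1.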

Figure 6 shows the error distribution histogram, in which the horizontal coordinate is the difference between the target value and the output value. Most of the errors on the training data are concentrated around zero, the optimal error, which indicates a good classification result.

Based on the above discussion, we consider that the AF model using the RR intervals as the input dataset gives more accurate results than other methods for AF classification, and that it is an appropriate method for detecting AF. Table 3 summarizes the results of other ECG classification methods in comparison with the ANN model. It shows that the accuracy of 97.8% for the proposed RR-based AF classification model is higher than that of the previously proposed classification methods.

4. Conclusion

This article has studied the ECG classification algorithm in depth. In view of the characteristics of the ECG, our objective was to use the RR intervals of the ECG signal as the input data and to test the classification accuracy of the ANN model. Table 2 shows the accuracy of the model output for each tested number of hidden layer neurons when the input dataset is the RR intervals and the other ANN parameters are at their optimal values. According to this table, when the number of hidden layer neurons is set to 25, the ANN model achieves the highest accuracy of 98.3%, with a sensitivity of 99.3% and a specificity of 97.4%.

This computer-aided diagnosis reduces the risk of intervention and helps us make sense of the data. Some issues in this research need further consideration and discussion. For instance, all the experiments in this paper are based on simulation software. To improve the practical value of the proposed algorithm, it should be implemented on an actual ECG monitoring platform. In future work, the selection of the network structure and parameters and the construction of features from medical knowledge are worth further research. Monitoring needs to consider more comprehensive features to reach a higher classification accuracy. The ANN has great advantages in dealing with fuzzy and nonlinear data, but its feature extraction needs to improve. Deep learning and big-data methods may be good alternatives in future work. We plan to increase the amount of data covering various types of arrhythmia and to consider other faster and more accurate feature extraction methods in order to classify different types of arrhythmia diseases.

Data Availability

The data used to support the findings of this study have not been made available because the data also form part of an ongoing study. The original data of the study can be obtained at https://physionet.org/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.