Abstract
Long-term monitoring of resting tremor is key to assessing the status of patients with Parkinson’s disease (PD), which is of vital importance for appropriate medication. The detection and quantification of resting tremor in reported works rely heavily on prescribed movements and are not suited to long-term monitoring under real-life conditions. The purpose of this study is to develop a detection model for long-term monitoring of resting tremor and to explore an effective indicator for tremor quantification. This study included long-term acceleration data from PD patients and proposed a resting tremor detection model based on machine learning classifiers and the Synthetic Minority Oversampling Technique (SMOTE). Four machine learning classifiers, K-Nearest Neighbor (KNN), Random Forest (RF), Adaptive Boosting (AdaBoost), and Support Vector Machine (SVM), were compared. Furthermore, an indicator called tremor timing ratio (TTR) was defined and calculated for tremor quantification. The detection model with the RF classifier achieved the highest overall accuracy of 94.81%. An analysis of feature importance showed that the sample entropy of the acceleration signal was the most influential feature in the classification. According to the Kruskal-Wallis test and the Mann-Whitney U test, the TTR had a strong correlation with the resting tremor subscore of the Unified Parkinson Disease Rating Scale (UPDRS). This two-step evaluation process can detect resting tremor effectively and is expected to be applied in long-term monitoring of PD patients in daily life to realize a more comprehensive assessment of PD.
1. Introduction
Parkinson’s disease (PD) is a progressive, adult-onset neurological disease that becomes more common with age, and the number of sufferers has more than doubled over the past generation [1]. Therapy for Parkinson’s disease primarily focuses on ameliorating the symptoms with medication [2]. Detection and quantification of these symptoms are of vital importance for appropriate medication. Resting tremor is one of the most typical presenting signs of PD [3], and about 75% of patients have symptoms of resting tremor [4]. It is an involuntary shakiness that is most noticeable in the hands, is present at rest, and disappears with intentional movement. The most popular clinical tool for assessing the severity of resting tremor is the rating scale, and the Unified Parkinson Disease Rating Scale (UPDRS) is widely accepted [5]. However, the motor examination in the UPDRS is based on the subjective observation of clinicians, who capture only a snapshot of the outpatient. It is acknowledged that judgements vary among clinicians because of the short observation time and differences in experience. Thus, reliable, objective, and accessible diagnostic methods for resting tremor are urgently needed.
Related works have reported various computer-assisted techniques for the diagnosis and assessment of PD. Because mature voice acquisition equipment facilitates the construction of voice-based datasets, most studies on PD focus on speech processing [6]. Approaches based on handwritten spiral images drawn by PD patients are also a focus of recent research [7, 8]. Both voice-based and drawing-based methods require that the patients have no cognitive disorder, and they fail when patient compliance is low. Local field potentials (LFP) were used to detect resting tremor in recent works [9, 10]. LFP recording requires electrodes to be implanted in the patient during deep brain stimulation (DBS) surgery and deals directly with brain signals. However, LFP recordings are difficult to popularize because surgical DBS treatment is invasive and costly. Noninvasive signals recorded on the body surface, such as electromyography (EMG) [11] and electroencephalography (EEG) [12], have also been applied to study resting tremor. Both EMG and EEG need fixed electrodes, which are cumbersome for patients and not appropriate for long-term monitoring in daily life. In recent years, the Leap Motion Controller (LMC), an interactive and noncontact device mainly used for detecting hand gestures and finger positions, has been introduced for the quantification of hand tremor [13]. Nevertheless, all the methods mentioned above are unsuited to long-term monitoring.
Smartphones and smart watches with motion sensors have appeared in recent studies on monitoring the symptoms of PD [14]. A few researchers have employed inertial sensors [15, 16] to assess the symptoms of resting tremor. Lacy et al. [17] used noninvasive electromagnetic sensors to conduct the finger tapping test in PD patients. However, most of the reported studies designed experiments in which the patients were required to perform assigned tasks to induce tremor, and signals were recorded for just a few seconds or a few minutes. Although these tasks make the resting tremor more distinctive and the assessment more efficient, they differ from the activities of daily living. In the patients’ real life, tremor does not occur all the time and its amplitude changes over time. It is challenging to collect a dataset under real-life conditions and to distinguish resting tremor from the activities of daily living.
In addition, multiple methods such as joint time-frequency analysis, statistical analysis, and machine learning have been employed to study the tremor so far. Spectral analysis and time-frequency analysis were well utilized in several previous studies [18, 19]. Salarian et al. [20] used frequency and amplitude of the signals to detect and quantify resting tremor through fixed thresholding. Manzanera et al. [21] compared several parametric and nonparametric spectral estimation methods for tremor detection but failed to detect the resting tremor accurately. Moreover, wavelet analysis such as discrete wavelet transform [22] and wavelet coherence analysis [23] provided an approach to analyze tremor signals and a framework called WAKE was proposed by using wavelet decomposition coupled with adaptive Kalman filtering to extract hand tremor [24]. Besides, statistical analyses such as correlation analysis [25] and coherence analysis [26] were also helpful methods for exploring the features corresponding to tremor signals. Deep learning techniques were well applied in the diagnosis of diseases. Zeng et al. proposed a SDPSO-SVM model for diagnosis of Alzheimer’s disease [27] and promoted the diagnosis algorithm with deep belief network-based multitask learning [28]. With advances of deep learning technique, a few researchers attempted to employ neural networks to diagnose or quantify tremor [29, 30]. Yohanandan et al. [31] compared the performance of linear regression models with machine learning classifiers and achieved tremor severity ratings on the Bain-Findley tremor rating scale. Most of the reported works so far focused on detecting and quantifying resting tremor based on fragments of signals. However, a long-term continuous automatic monitoring system for PD in the home-based environment is of great significance, which can help the assessment of the tremor severity.
Wearable devices are suitable for tracking the resting tremor in daily life and enable the automatic detection and quantification of resting tremor, making drug adjustment for patients more reasonable and timelier. In this work, a two-step evaluation process for resting tremor of PD patients based on long-term acceleration data is proposed. Consecutive long-term data of PD patients was collected in home settings and provides statistical information such as tremor prevalence. The tremor detection model with good generalization performance combines Synthetic Minority Oversampling Technique (SMOTE) with machine learning classifiers, providing a judgement on the status of the hand tremor. A new indicator called tremor timing ratio (TTR) is extracted for the tremor quantification based on the consecutive tremor detection and proved to be highly correlated with tremor severity.
2. Materials and Methods
2.1. Data Acquisition
To acquire the long-term acceleration data, a total of 20 patients suffering from PD were invited to participate in the experiment. Before data acquisition, explicit informed consent was obtained. The experiment was carried out in accordance with The Code of Ethics of the World Medical Association with prior approval of the Ethics Committee of Shenzhen Second People’s Hospital. The patients were asked to wear devices with a triaxial accelerometer on both wrists (see Figure 1). The wearable devices were worn like watches and did not interfere with the activities of the subjects. The patients wore the wrist accelerometers continuously and were allowed to move freely, as in their daily activities. The acceleration data was sampled at 100 Hz and transmitted wirelessly via Bluetooth. The data receiver was fixed in the room where the patients stayed. Meanwhile, the whole process was recorded on video. Several neurologists with rich clinical experience were invited to evaluate the tremor of the patients according to the UPDRS (see Table 1). Because the symptom of resting tremor might change over time, the patients’ scores were assessed repeatedly in different periods of the experiment and used as the gold standard. The hand status of the patients was further annotated in detail with the assistance of the synchronous video, and the acceleration data was then labelled as tremor or nontremor accordingly. Since not all the patients suffered from resting tremor, and the resting tremor of some tremor-dominant patients disappeared after they took their medicine, the mean tremor prevalence was 16.9%. Detailed information on the patients and the experiment duration is shown in Table 2.

2.2. Framework of Tremor Detection
The aim of tremor detection was to track the hand status of the patients and to detect whether resting tremor occurred, based on the long-term acceleration data. All the collected acceleration data was used to develop the resting tremor detection model. The framework of the detection algorithm is shown in Figure 2 and includes three key parts: preprocessing, feature extraction, and classification. To meet the requirements of real-time operation and detection resolution, a 5-second time window was adopted, sliding over the acceleration signals with a step of 1 second. Detection was performed on the acceleration data in each time window. Because the distance between the patients and the data receiver changed as the patients moved and sometimes became too large, a small amount of data was missing. The sliding window first checked, according to the timestamps, whether any acceleration data was missing. If no data was missing in the current time window, tremor detection proceeded. Otherwise, the data in the current window was skipped and detection moved on to the next window.
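The windowing scheme described above can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `complete_windows`, the index-based stepping, and the gap tolerance `tol` are assumptions made for illustration, while the 100 Hz rate, the 5-second window, and the 1-second step come from the text.

```python
from typing import Iterator, List, Sequence, Tuple

FS_HZ = 100    # sampling rate stated in the text
WINDOW_S = 5   # window length stated in the text
STEP_S = 1     # sliding step stated in the text

Sample = Tuple[float, float, float]


def complete_windows(
    timestamps: Sequence[float],
    samples: Sequence[Sample],
    fs: int = FS_HZ,
    window_s: int = WINDOW_S,
    step_s: int = STEP_S,
    tol: float = 0.5,
) -> Iterator[Tuple[float, List[Sample]]]:
    """Yield (start_time, window) for every window without missing samples.

    Assumption: a window is treated as complete when no gap between
    consecutive timestamps exceeds (1 + tol) sampling periods.
    """
    n_win = fs * window_s
    n_step = fs * step_s
    max_gap = (1.0 + tol) / fs
    for start in range(0, len(samples) - n_win + 1, n_step):
        ts = timestamps[start:start + n_win]
        # Skip the window if the timestamps reveal dropped samples.
        if all(b - a <= max_gap for a, b in zip(ts, ts[1:])):
            yield ts[0], list(samples[start:start + n_win])
```

Each yielded window would then be passed on to preprocessing and feature extraction, and windows that span a transmission gap are silently skipped.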

2.3. Preprocessing
Before feature extraction, the raw acceleration data had to be preprocessed. Each axis of the acceleration signal in the given time window was first bandpass filtered with a zero-phase digital filter designed using a Kaiser window. The offset of the triaxial acceleration signals depended on the position of the accelerometer and changed with the movements of the hands. Thus, components at extremely low frequencies had to be eliminated. In addition, both intentional movement and involuntary resting tremor produce acceleration signals of relatively low frequency, which needed to be preserved after filtering. The passband of the filter was therefore limited to the range from 0.65 to 12.5 Hz. Since the frequency of resting tremor mainly ranges from 4 to 7 Hz [32], the waveform of the resting tremor signal was well preserved after filtering. A comparison of the triaxial acceleration signal before and after filtering is shown in Figure 3.
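As an illustration of this preprocessing step, the following Python sketch builds a Kaiser-windowed FIR bandpass filter and applies it as a centred convolution, which adds no phase shift for a symmetric kernel. The passband (0.65 to 12.5 Hz) and sampling rate come from the text; the filter length `numtaps`, the Kaiser parameter `beta`, the zero padding at the edges, and all function names are assumptions, and the authors' filter design may differ.

```python
import math
from typing import List, Sequence


def bessel_i0(x: float) -> float:
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total, term, k = 1.0, 1.0, 1
    while term > 1e-12 * total:
        term *= (x / (2.0 * k)) ** 2
        total += term
        k += 1
    return total


def kaiser_lowpass(cutoff_hz: float, fs: float, numtaps: int, beta: float) -> List[float]:
    """Windowed-sinc low-pass taps, normalised to unit gain at DC."""
    mid = (numtaps - 1) / 2.0
    fc = cutoff_hz / fs  # cutoff in cycles per sample
    taps = []
    for n in range(numtaps):
        t = n - mid
        ideal = 2.0 * fc if t == 0 else math.sin(2.0 * math.pi * fc * t) / (math.pi * t)
        r = t / mid
        window = bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)
        taps.append(ideal * window)
    gain = sum(taps)
    return [h / gain for h in taps]


def bandpass_zero_phase(
    x: Sequence[float],
    fs: float = 100.0,
    low_hz: float = 0.65,
    high_hz: float = 12.5,
    numtaps: int = 401,   # assumed filter length (must be odd)
    beta: float = 5.0,    # assumed Kaiser shape parameter
) -> List[float]:
    """Band-pass x with a symmetric FIR kernel centred on each sample.

    Samples beyond the ends of x are treated as zero, so the first and
    last numtaps // 2 output samples are affected by edge effects.
    """
    if numtaps % 2 == 0:
        raise ValueError("numtaps must be odd for a centred kernel")
    lo = kaiser_lowpass(low_hz, fs, numtaps, beta)
    hi = kaiser_lowpass(high_hz, fs, numtaps, beta)
    taps = [h - l for h, l in zip(hi, lo)]  # band-pass = difference of low-passes
    half = numtaps // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(x):
                acc += h * x[j]
        out.append(acc)
    return out
```

With these assumed settings, a constant offset is removed and a 5 Hz oscillation passes almost unchanged away from the edges. A kernel this long leaves few unaffected samples in a 5-second window, so in practice the filter length or the edge handling would need to be tuned.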

2.4. Feature Extraction
After preprocessing, features that characterize the resting tremor were extracted. Related works on the characteristics of tremor have reported that amplitude and frequency parameters differ significantly between PD patients and healthy controls [33]. The characteristics used in previous works offer multiple choices for feature extraction. Because excess features contain redundant information, which may degrade performance and slow down detection, five features in the time domain and the frequency domain were finally selected; they are presented in Table 3. The selected features are defined in detail below.
In the time domain, a feature characterizing the amplitude of the signal was extracted. The root-mean-square values of the triaxial acceleration signals were first calculated as the total acceleration. The mean amplitude (mAmp) of the acceleration signal was then computed from the envelope of the total acceleration. The mean amplitude of the signal in a time window is defined as the average of the difference between the upper envelope and the lower envelope, given by

$$\mathrm{mAmp} = \frac{1}{N}\sum_{n=0}^{N-1}\left[\mathrm{env}_{\mathrm{upper}}(n) - \mathrm{env}_{\mathrm{lower}}(n)\right],$$

where $\mathrm{env}_{\mathrm{upper}}(n)$ is the upper envelope, $\mathrm{env}_{\mathrm{lower}}(n)$ is the lower envelope, and $N$ is the number of data points in the window. An instance of the envelope extraction from the acceleration signal is illustrated in Figure 4.

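Sample entropy (SampEn) was first introduced by Richman and Moorman [34]. It can be used to characterize the complexity of a time series by measuring the probability of a new pattern being generated in the signal. The sample entropy of hand tremor has been shown to differ significantly between PD patients and healthy controls [35]. In this study, the sample entropy of the total acceleration signal was computed to distinguish the resting tremor status. For the total acceleration signal $x(n)$ in a window of $N$ data points, the SampEn is calculated by the following steps:
(1) The acceleration signal is first reconstructed into delay vectors: $X_m(i) = [x(i), x(i+1), \ldots, x(i+m-1)]$, where $0 \le i \le N-m-1$ and $m$ is the embedding dimension.
(2) The distance between $X_m(i)$ and $X_m(j)$ is defined as $d[X_m(i), X_m(j)] = \max_{0 \le k \le m-1} |x(i+k) - x(j+k)|$.
(3) $A_i$ is the number of vectors $X_m(j)$ with $j \ne i$ such that $d[X_m(i), X_m(j)] \le r$, where $r$ is the tolerance.
(4) The SampEn is finally obtained by

$$\mathrm{SampEn}(m, r, N) = -\ln\frac{A^{m+1}(r)}{A^{m}(r)},$$

where $A^{m}(r) = \sum_i A_i$ is the total number of matches for vectors of length $m$ and $A^{m+1}(r)$ is the corresponding total for vectors of length $m+1$.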
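A direct, unoptimised Python sketch of the standard sample entropy definition of Richman and Moorman is given below for illustration. The embedding dimension `m = 2` and the tolerance of 0.2 times the standard deviation are common defaults and are assumptions here, since the values used in the study are not stated in this section; the function name is likewise hypothetical.

```python
import math
from typing import Sequence


def sample_entropy(x: Sequence[float], m: int = 2, r: float = 0.2) -> float:
    """SampEn with Chebyshev distance and self-matches excluded.

    Assumptions: m = 2 and a tolerance of r times the standard deviation
    of x are common defaults, not values taken from the text.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length: int) -> int:
        # Use the first n - m templates for both lengths so that the
        # two counts are directly comparable.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= tol for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches for templates of length m
    a = count_matches(m + 1)  # matches for templates of length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)
```

A regular oscillation yields a low value and an irregular signal a high one. The quadratic cost of this direct form is acceptable for short windows, but a faster implementation would be preferable for real-time use.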
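Fast Fourier Transform (FFT) was applied to each of the triaxial acceleration signals to obtain the features in the frequency domain. The spectra of the triaxial acceleration signals are obtained by

$$X(k) = \sum_{n=0}^{N-1} acc_x(n)\, e^{-j 2\pi k n / N}, \quad Y(k) = \sum_{n=0}^{N-1} acc_y(n)\, e^{-j 2\pi k n / N}, \quad Z(k) = \sum_{n=0}^{N-1} acc_z(n)\, e^{-j 2\pi k n / N},$$

where $acc_x(n)$, $acc_y(n)$, and $acc_z(n)$ are the triaxial acceleration signals.
The dominant frequency (DF) was estimated by searching for the maximum peak in the frequency spectra. The amplitudes of the peaks in the three spectra were compared, and the frequency with the highest amplitude was defined as the DF. The DF of the triaxial acceleration signals is estimated by

$$\mathrm{DF} = \frac{f_s}{N}\,\arg\max_{k}\,\max\{|X(k)|, |Y(k)|, |Z(k)|\},$$

where $X(k)$, $Y(k)$, and $Z(k)$ are the spectra of the triaxial acceleration signals, $N$ is the number of data points in the window, and $f_s$ is the sampling rate. The DF estimation from a window of acceleration signal labelled as resting tremor is depicted in Figure 5.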
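The DF search can be sketched in Python as follows. For self-containment the sketch uses a direct discrete Fourier transform instead of an FFT library. Skipping the DC bin and taking the largest bin (instead of running an explicit peak detector) are simplifying assumptions, and the function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(x: Sequence[float]) -> List[float]:
    """|DFT| for bins 0..N//2, computed from the direct O(N^2) definition."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(
    acc_x: Sequence[float],
    acc_y: Sequence[float],
    acc_z: Sequence[float],
    fs: float = 100.0,
) -> float:
    """Frequency in Hz of the largest spectral bin across the three axes."""
    n = len(acc_x)
    best_k, best_mag = 0, -1.0
    for axis in (acc_x, acc_y, acc_z):
        spectrum = magnitude_spectrum(axis)
        for k in range(1, len(spectrum)):  # assumption: ignore the DC bin
            if spectrum[k] > best_mag:
                best_k, best_mag = k, spectrum[k]
    return best_k * fs / n
```

The frequency resolution is `fs / n`, which is 0.2 Hz for a 5-second window sampled at 100 Hz.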

The power spectral density (PSD) of the triaxial acceleration signals was estimated. The energy (E) of the signal was calculated as the sum of the triaxial spectra over frequency, written as

$$E = \sum_{f}\left[P_x(f) + P_y(f) + P_z(f)\right],$$

where $P_x(f)$, $P_y(f)$, and $P_z(f)$ are the PSD of the triaxial acceleration signals.
The spectrum distribution characterizes the frequency components of the signal. The ratio of the power around the DF to the total power describes how strongly the spectrum is concentrated around the DF. Since resting tremor is most obvious when the patients are at rest, the spectrum was expected to be highly concentrated around the DF when resting tremor was present. In the frequency spectrum, the amplitude of the highest peak, corresponding to the DF of resting tremor, was much higher than that of the other peaks. The ratio of the power in the range of 0.4 Hz around the DF to the total power was named spectrum concentration (SC) and can be calculated as follows:

$$\mathrm{SC} = \frac{P_{\mathrm{DF}}}{P_{\mathrm{total}}},$$

where $P_{\mathrm{DF}}$ is the power in the range of 0.4 Hz around the DF and $P_{\mathrm{total}}$ is the total power of the spectrum.
In summary, each 5 s window of the acceleration signal was characterized by five features. The dataset consisted of 225,066 samples of five features each. Since not all the patients suffered from resting tremor, and the tremor of some tremor-dominant patients disappeared after they took their medicine, the samples of resting tremor accounted for only 16.9%.
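The two PSD-based features can be illustrated with a minimal Python sketch. It assumes that the PSD is already available on a uniform frequency grid and interprets "the range of 0.4 Hz around the DF" as DF ± 0.4 Hz; this interpretation, the rectangle-rule integration, and the function names are assumptions and are not confirmed by the text.

```python
from typing import Sequence


def band_energy(
    freqs: Sequence[float],
    psd_x: Sequence[float],
    psd_y: Sequence[float],
    psd_z: Sequence[float],
) -> float:
    """Energy as the rectangle-rule integral of the three PSDs over frequency."""
    df = freqs[1] - freqs[0]  # uniform frequency spacing is assumed
    return df * sum(px + py + pz for px, py, pz in zip(psd_x, psd_y, psd_z))


def spectrum_concentration(
    freqs: Sequence[float],
    psd: Sequence[float],
    dominant_hz: float,
    half_width_hz: float = 0.4,  # assumed reading of "0.4 Hz around the DF"
) -> float:
    """Share of the total power that lies within half_width_hz of the DF."""
    total = sum(psd)
    near = sum(p for f, p in zip(freqs, psd) if abs(f - dominant_hz) <= half_width_hz)
    return near / total if total > 0 else 0.0
```

A value of SC close to 1 indicates that nearly all power sits in a narrow band around the DF, as expected for a sustained rhythmic tremor.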
2.5. Classification
Before classification, the SMOTE algorithm was employed to balance the dataset. As the tremor prevalence was only 16.9%, the imbalanced dataset was challenging for classification. Undersampling and oversampling are two resampling strategies for handling an imbalanced dataset. Random undersampling may discard necessary information contained in the dataset, whereas random oversampling simply copies samples to enlarge the minority class, which easily leads to overfitting. Therefore, SMOTE, an improvement on random oversampling, was employed to deal with the imbalanced dataset [36]. The basic idea of SMOTE is to analyze the samples of the minority class, synthesize new samples from them, and add these synthetic samples to the dataset. After oversampling with SMOTE, the proportion of resting tremor samples increased to 50%.
To construct the detection model, four machine learning classifiers with good classification performance [37, 38] that meet the real-time requirements were chosen for tremor detection: K-Nearest Neighbor (KNN), Random Forest (RF), Adaptive Boosting (AdaBoost), and Support Vector Machine (SVM).
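The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in the study: the neighbour count `k`, the random seed, and the function name are assumptions.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def smote(minority: List[Point], n_new: int, k: int = 5, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("at least two minority samples are required")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic
```

To reach a 50% share, `n_new` would be set to the difference between the numbers of majority and minority samples. Each synthetic sample lies on a line segment between two real minority samples, which is what distinguishes SMOTE from plain duplication.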
K-Nearest Neighbor was proposed by Cover and Hart [39], which was commonly used as a supervised learning method. KNN classifier finds the k training samples closest to the test sample in the training set according to the given distance metric and predicts the results based on the information of the K-Nearest Neighbors. The selection of the value of k has a great influence on KNN learning model. Therefore, the grid-search method was utilized to find the optimal parameters. Due to the large quantity of the samples, only 10% of the samples were used to search for the optimal parameters.
The Random Forest classifier combines a number of decision trees with randomly selected features into a forest. Each decision tree determines the class of the sample independently, and the forest takes the majority vote of the decision trees [40]. The RF classifier operates efficiently on large datasets and maintains high accuracy even when a large proportion of the data is missing. Because of the random sampling, the trained model has small variance and strong generalization capability.
The Adaptive Boosting classifier was first proposed by Schapire [41] and, like Random Forest, is an ensemble classifier. The core idea is to train different weak classifiers on the same training set and then combine these weak classifiers into a stronger one. The adaptation lies in the strategy of increasing the weights of the samples misclassified by the previous weak classifiers. The reweighted samples are then used to train the next weak classifier. In this work, the AdaBoost classifier was implemented with the decision tree as the weak learner.
The Support Vector Machine model finds the hyperplane that maximizes the distance to the closest samples of the different classes, hence the name maximum margin classifier [42]. SVM is suited to classification problems with small numbers of samples and with nonlinear, high-dimensional data. As with KNN, the grid-search method was applied to find the optimal parameters for the SVM classifier.
2.6. Method of Tremor Quantification
To further evaluate the severity of tremor status, 8 patients with valid tremor data over a long period of time were selected in this experiment. The recorded resting tremor scores ranged from 0 to 3. A total of 33 cases with assigned resting tremor scores were collected, consisting of 9 cases scored as 1, 18 cases scored as 2, and 6 cases scored as 3. Each case was a 15-minute continuous recording of acceleration signal. The detail is shown in Table 4.
By exploiting the time-duration characteristics of different tremor scores, a new indicator was designed to quantify tremor severity. The tremor timing ratio (TTR) was defined as the ratio of the detected tremor samples (tremor duration) to all the samples (the total duration of tremor samples and nontremor samples) in fifteen consecutive minutes, which can be calculated as follows:

$$\mathrm{TTR} = \frac{N_{\mathrm{tremor}}}{N_{\mathrm{tremor}} + N_{\mathrm{nontremor}}},$$

where $N_{\mathrm{tremor}}$ and $N_{\mathrm{nontremor}}$ are the numbers of samples detected as tremor and nontremor, respectively, within the fifteen minutes.
The TTR was then analyzed statistically against the rating scores for resting tremor. The Mann-Whitney U test and the Kruskal-Wallis test were adopted. The Mann-Whitney U test, also known as the Wilcoxon rank sum test, is a nonparametric test for two populations with independent samples [43]. The Kruskal-Wallis test is a nonparametric version of the classical one-way analysis of variance and an extension of the Wilcoxon rank sum test to more than two groups [44]. The Kruskal-Wallis test is valid for data with two or more groups. It compares the medians of the groups to determine whether the samples come from the same population or from populations with the same distribution.
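As a minimal illustration of this definition, the TTR of one recording can be computed from the per-window detector outputs. The encoding of tremor as 1 and nontremor as 0 and the function name are assumptions.

```python
from typing import Sequence


def tremor_timing_ratio(labels: Sequence[int]) -> float:
    """Fraction of windows flagged as tremor (1) among all windows (0 or 1)."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(1 for v in labels if v == 1) / len(labels)
```

For example, a 15-minute case in which one third of the detected windows are flagged as tremor has a TTR of about 0.33.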
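The rank-based statistics behind both tests can be sketched in plain Python. The sketch computes the Mann-Whitney U statistic and the Kruskal-Wallis H statistic without tie correction, plus the chi-square approximation of the p value for the special case of three groups (two degrees of freedom). The function names are hypothetical, the example values are synthetic and are not the study's data, and a statistics package would normally be used for exact p values.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> float:
    """U statistic of the first sample."""
    ranks = average_ranks(list(a) + list(b))
    return sum(ranks[:len(a)]) - len(a) * (len(a) + 1) / 2.0


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """H statistic without tie correction."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])
        total += r_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def p_value_three_groups(h: float) -> float:
    """Chi-square survival function with 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2.0)


# Synthetic, fully separated groups (not the study's TTR values).
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
p = p_value_three_groups(h)
```

For these three fully separated synthetic groups the statistic is H = 7.2, and the approximation gives a p value of about 0.027.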
3. Results
3.1. Performance Metrics
In this work, multiple machine learning methods were selected for resting tremor detection. To assess the performance of the selected methods, fivefold cross-validation was used. The dataset was randomly divided into five folds. In each iteration, one fold of the dataset was kept as the holdout set for testing while the remaining folds were used as the training set. The average accuracy of the fivefold cross-validation was defined as the overall accuracy. The performance metrics employed to compare the methods include sensitivity, specificity, overall accuracy, and F1 score, which can be expressed as follows:

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \quad \mathrm{Specificity} = \frac{TN}{TN + FP},$$

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \quad F_1 = \frac{2TP}{2TP + FP + FN},$$

where TP, TN, FP, and FN represent the numbers of true positives, true negatives, false positives, and false negatives, respectively.
As a measure of consistency, the Kappa coefficient was also included as a performance metric to measure the effectiveness of classification. For a classification problem, the Kappa coefficient gives a more faithful evaluation than accuracy because it penalizes imbalance in the dataset. In this tremor detection model, the Kappa coefficient can be calculated as follows:

$$\kappa = \frac{p_o - p_e}{1 - p_e}, \quad p_o = \frac{TP + TN}{N_{\mathrm{total}}}, \quad p_e = \frac{(TP + FP)(TP + FN) + (FN + TN)(FP + TN)}{N_{\mathrm{total}}^{2}},$$

where $p_o$ is the observed agreement, $p_e$ is the agreement expected by chance, and $N_{\mathrm{total}} = TP + TN + FP + FN$ is the total number of samples.
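The performance metrics, including the Kappa coefficient for the two-class case, can be computed from the four confusion-matrix counts as in the following Python sketch. The function name and the example counts are illustrative assumptions and do not correspond to the study's results.

```python
from typing import Dict


def detection_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy, F1 score, and Cohen's kappa."""
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": accuracy,
        "f1": 2 * tp / (2 * tp + fp + fn),
        "kappa": (accuracy - p_e) / (1 - p_e),
    }


# Illustrative counts only.
m = detection_metrics(tp=45, tn=40, fp=10, fn=5)
```

With these illustrative counts, the accuracy is 0.85 while the Kappa coefficient is 0.70, which shows how Kappa discounts the agreement that would be expected by chance.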
3.2. Result of Tremor Detection
In the classification of tremor status and nontremor status, a total of 373,894 samples were used in the fivefold cross-validation, a total that includes the synthetic tremor samples added by SMOTE to the 225,066 original samples. The testing results for the four selected classifiers are listed in Table 5. The RF classifier outperformed the other classifiers in all aspects and achieved an overall accuracy of 94.81%. The Kappa coefficient of the RF classifier reached 0.90, which implies that the classification results were almost perfectly consistent with the actual labels. The goal of the tremor detection was to distinguish resting tremor from the other hand statuses of the patients. The high sensitivity of the RF classifier means that the model recognizes the tremor status effectively, which is also conducive to early screening for tremor.
An example of the tremor detection model applied to one of the patients is shown in Figure 6. The model with the RF classifier was tested on a 60 s segment of acceleration data from both hands. Compared with the gold standard, the error of the detection results was within 2 seconds, which demonstrates the capability of the model to detect resting tremor.

3.3. Result of Tremor Quantification
As shown in Table 6, the p value returned by the Kruskal-Wallis test was 0.0034, which means that the test rejected, at the 1% significance level, the null hypothesis that all three data samples came from the same distribution. The boxplot (Figure 7) visually presents the summary statistics for each score. The p values of the Mann-Whitney U test for score 1 versus score 2, score 1 versus score 3, and score 2 versus score 3 were 0.022, 0.0048, and 0.026, respectively, which further verified that the TTR differed significantly between each pair of scores. The test results demonstrate that the TTR shows a strong correlation with the resting tremor subscore in the UPDRS and has the potential to be applied in the quantification of resting tremor.

4. Discussion
4.1. Main Findings
This work aims to establish a two-step evaluation process for resting tremor of PD patients using long-term acceleration data. Previous work on tremor detection and quantification is mainly based on short segments of signal and relies heavily on manual signal segmentation [37]. Such data cannot represent the activities of daily living, and the methods are not suitable for continuous tremor detection. In this work, the long-term monitoring of PD patients involves two aspects. Firstly, the detection model distinguishes the tremor status from other daily movements. Secondly, based on the consecutive acceleration recordings, TTR is extracted as the tremor quantification indicator. Statistical information on acceleration data over a continuous period is used to quantify the patient’s changing status under continuous tracking. As the results show, the detection model identifies the resting tremor effectively with an accuracy of 94.81%. In comparison, Zhang et al. [45] developed both a generic model and a person-specific model using an SVM classifier and achieved best accuracies of 80.7% and 85.9%. The detection model in this work thus compares favorably with previous results. In addition, when the algorithm was tested on a Dell XPS 8930 with an Intel(R) Core™ i7-8700 CPU @ 3.20 GHz and 8.00 GB RAM, the average runtime of the feature extraction was 0.068 seconds per 5-second data segment. The runtime for training the RF classifier and classifying all the data was 20.12 s. In terms of quantification, TTR assesses the tremor burden by the frequency of resting tremor occurrence, and it is shown to differ significantly between tremor scores. Thus, the proposed two-step evaluation process can detect resting tremor in a timely manner and provide automatic long-term observation of patients, offering powerful support to the diagnosis and treatment of PD, since clinicians cannot stay with the patient all the time.
It is feasible to monitor the resting tremor of PD in daily life, to track the transition of the patients’ status, and to realize the automatic quantification of resting tremor.
4.2. Influence of Preprocessing
In data preprocessing, the acceleration signal in the given time window was first bandpass filtered. The goal of the bandpass filtering was to remove low-frequency noise such as gravity artifacts. To verify the influence of preprocessing on the overall performance, the detection models were retrained using signals without preprocessing; the result is shown in Table 7. Without preprocessing, the classification performance decreased markedly compared with that obtained from preprocessed acceleration signals: the accuracy of the RF classifier was 89.38%, a reduction of 5.43 percentage points. Preprocessing is therefore indispensable before feature extraction.
4.3. Predictor Importance for Features
Among all the classifiers, RF classifier outperformed the others and showed balanced performance for different metrics. Therefore, the predictor importance of the features engaged in classification was estimated through RF classifier. The feature with high importance estimated through RF classifier contains the vital information which differentiates the resting tremor from other daily hand movements of PD patients. The estimated predictor importance was measured by permutation. The out-of-bag permuted predictor importance estimates of the five features in RF classifier are shown in Figure 8. The SampEn gains the largest value of predictor importance followed by the energy of the signal, which means the SampEn is most influential in identifying tremor. The result hints that the complexity of acceleration data measured by SampEn is sensitive to the characteristic of resting tremor in PD patients. Other features also contributed to the tremor detection algorithm but were less influential compared to SampEn. The difference in feature importance reflects the ability of different features to discriminate between resting tremor and other daily activities. The information contained in the other features is less able to discriminate between tremor and other activities than SampEn and E.

4.4. Tremor Detection under Interpatient Paradigm
Two popular evaluation paradigms for the analysis of signals in clinical medicine and the biological sciences are the intrapatient paradigm and the interpatient paradigm. In the intrapatient paradigm, the data in the training set and the testing set may come from the same patient. In the interpatient paradigm, data from the same patient never appears in both the training set and the testing set. In practice, the tremor detection model will encounter unseen individuals, which requires excellent generalization performance. To test the generalization performance of the resting tremor detection model, the acceleration data collected from the 20 patients was divided again into a training set and a testing set by patient; the result is listed in Table 8. The detection model with the RF classifier performs well for most of the patients. The average accuracy was calculated as the average over patients weighted by the number of samples of each patient, and the model still obtained a high average accuracy of 89.70%, slightly lower than the result under the intrapatient paradigm, which indicates the good generalization ability of the algorithm.
4.5. Limitation
The main limitation of this work is the data size. Although the number of subjects is limited, the acceleration data was collected over a sufficiently long duration and the samples of each subject are adequate. Collecting long-term acceleration data of PD patients under real-life conditions is a genuine challenge. The dataset needs to be extended to further verify the findings. Secondly, only three levels of resting tremor severity were quantified, and score 4 was absent. This is because very few patients were scored as 4 for resting tremor while on medication, and the patients scored as 4 were too severely affected to participate in the experiment. Therefore, the difference in TTR between the cases scored as 4 and the other scores was not validated. Further research that collects more cases scored as 4 is needed.
5. Conclusions
This work developed a two-step evaluation process for resting tremor of PD patients. The proposed tremor detection model can distinguish resting tremor effectively with good generalization performance. TTR was proposed for tremor quantification and differed significantly between scores, which supports its validity for quantifying the severity of resting tremor. Notably, the tremor detection model was based on long-term acceleration data collected under real-life conditions, which makes the model feasible for practical application. The proposed automatic two-step method is expected to be applied to real-time monitoring of patients and to help guide the medication for PD in daily life.
Data Availability
The processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors’ Contributions
Han Yuan and Sen Liu have contributed equally to this work.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant No. 61071004), Shanghai Municipal Special Project of Industry Transformation and Upgrading (GYQJ-2020-1-31), Shanghai Municipal Science and Technology Major Project (Grant No. 2017SHZDZX01), The Project of Shanghai Engineering Research Center (15DZ2251700), The Basic Research Projects (Subject Arrangement) of the Shenzhen Science and Technology Program (JCYJ20180228162928828), and The Guangdong Basic and Applied Basic Research Foundation (2019A1515111106).