Abstract

To address the problem that traditional photoplethysmography (PPG) biometric recognition based on sparse representation is not robust to noise and intraclass variations when the sample size is small, we propose a PPG biometric recognition method based on multifeature deep cascaded sparse representation (MFDCSR). The method consists of multifeature signal coding and deep cascaded coding. Multifeature signal coding extracts the shape, wavelet, and principal component analysis features of the PPG signal and performs sparse representation on them. Deep cascaded coding is a multilayer feature coding: each layer takes the multifeature signal coding combined with the result of the previous layer as input, and the output of each layer is the input of the next layer. Deep cascaded coding thus learns the features of the PPG signal layer by layer and outputs the category distribution vector of the PPG signal in the last layer. Experiments demonstrate that MFDCSR achieves better recognition performance than current methods for PPG biometric recognition.

1. Introduction

Photoplethysmography (PPG) biometric recognition has attracted the attention of many researchers in the past decade [1–6]. PPG signals not only share the characteristics of traditional biometric traits, such as universality, uniqueness, stability, and ease of collection, but also have the following advantages: (1) Liveness detection. PPG signals can only be captured from living individuals. (2) High security. It is very difficult to imitate PPG signals. (3) Universality. PPG signals can be captured from any individual. (4) Small amount of data. A PPG signal is one-dimensional and is easier to store and process than image signals such as fingerprints and irises.

Although many findings have been published, research on PPG biometric recognition is still at the laboratory stage and remains far from practical application. There is no mature PPG biometric recognition product on the market, and the main reasons are as follows: (1) Noise. PPG signals often contain different kinds of noise due to a variety of factors, including acquisition equipment, body position, and collection environment. During acquisition, the noise level of each individual also varies over time. (2) Intraclass variation. Differences in the acquisition principles and environments of different types of acquisition equipment lead to considerable variance in the height of the main wave, descending midisthmus amplitude, height of the repetition wave, and pulse signal origination point of the acquired PPG signals. The PPG signal is also nonstationary and changes over time. Intraclass variation is therefore a very challenging problem for PPG biometric recognition.

Sparse representation reconstructs the testing signal from a linear combination of a small number of original training signals and has been used in fields such as noise removal, signal compression, feature extraction, and pattern recognition [7–9]. It can concentrate the energy of PPG signals in a small number of samples and reconstruct the signal from them, thus effectively removing noise and redundant information. However, sparse representation is a shallow decision model. When a PPG signal contains a large amount of noise and intraclass variation, the shallow decision model has weak learning ability and poor robustness in PPG biometric recognition. Therefore, how to further improve the robustness of sparse representation in PPG biometric recognition is a problem worthy of study.

Most PPG feature extraction methods extract a single type of feature, such as waveform features, global features, or wavelet features. Because PPG signals are affected by noise and intraclass variations, single features often fail to yield reliable identification results. Complementary information exists between features and provides more adequate information for PPG biometric recognition. In multifeature learning, fusing the heterogeneous features of PPG signals could improve recognition performance. However, the heterogeneity of the features also poses obstacles to fusion learning. Therefore, it is worthwhile to study how to establish the connection between PPG features and thus obtain an effective fused feature representation of PPG signals.

To solve the above problems, we integrate sparse representation learning into deep cascade learning and propose a multifeature deep cascaded sparse representation for PPG signal biometrics. A simple illustration of the proposed methodology is shown in Figure 1. The proposed method has two parts: multifeature signal coding and deep cascade coding. Multifeature signal coding extracts the shape, wavelet, and principal component analysis features of the PPG signal and performs sparse representation. Deep cascade coding learns the discriminative features of the PPG signal, layer by layer, and outputs the category distribution vector of the PPG signal in the last layer. The main innovations of this work are as follows:

(1) The proposed multilayer feature extraction model based on multifeature deep cascaded sparse representation does not require a large training database, and it has good feature representation capabilities.
(2) The application of multifeature learning to PPG signal identity recognition improves the recognition performance by exploiting the complementarity of features.
(3) Because different base classifiers can be substituted, the proposed model has good scalability.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces the proposed methodology, and Section 4 reports the experimental results and analysis. Section 5 gives a brief conclusion and discusses future work.

2. Related Work

Existing PPG biometric recognition methods are either fiducial or nonfiducial.

The fiducial methods use time-domain characteristics as fiducial points, such as the amplitude, time interval, and slope of the PPG signal. Chakraborty and Pal [1] used the first and second derivatives to extract 12 amplitude and time feature points from the PPG signals of 15 healthy individuals and calculated the Euclidean distance between the statistical parameters of the PPG signal features for recognition. Lee and Kim [2] took 708 data records from 10 healthy individuals and used a feedforward neural network for PPG biometric recognition. Nadzri et al. [3] extracted the systolic peaks, diastolic peaks, and dicrotic notches of PPG signals and used a Bayes network, radial basis function, and multilayer perceptron for PPG biometric recognition. Sancho et al. [4] extracted time-domain and Karhunen–Loève transform features and used matching metrics on four different PPG databases.

Nonfiducial approaches take a more holistic view and extract the overall signal morphology. Spachos et al. [5] proposed a PPG biometric recognition method based on linear discriminant analysis (LDA) and K-nearest neighbor (KNN). Karimian et al. [6] used the discrete wavelet transform (DWT) for PPG biometric recognition. Yadav et al. [10] proposed a method combining the continuous wavelet transform (CWT) and direct linear discriminant analysis (DLDA) for PPG biometric recognition. Faragó et al. [11] presented a correlation-based nonfiducial feature extraction technique that computes correlations of PPG signals. Lee et al. [12] used the discrete cosine transform (DCT) for PPG biometric recognition.

Recently, many PPG biometric recognition methods based on deep learning have been proposed. Everson et al. [13] proposed a PPG biometric recognition method based on a four-layer deep neural network, which included two convolutional layers and two long short-term memory (LSTM) layers. Biswas et al. [14] proposed a deep-learning framework that could effectively estimate heart rates and used only wrist-worn single-channel PPG signals collected in a mobile environment for PPG biometric recognition. Hwang and Hatzinakos [15] proposed a PPG recognition method that uses a convolutional neural network with long short-term memory to construct a personalized data-driven network and to model the time-series structure inherent in the PPG signal.

The fiducial and nonfiducial approaches are sensitive to external factors, so their recognition results for PPG signals are not always reliable. Although deep learning achieves good recognition performance, it needs more computing resources, longer training time, and extensive parameter tuning. Moreover, the interpretability and theoretical analysis of deep learning are still not completely clear.

3. Proposed Methodology

The proposed methodology consists of multifeature signal coding and deep cascade coding. Both are described in detail below.

3.1. Multifeature Signal Coding

Multifeature signal coding comprises multifeature extraction and sparse residual coding (SRC). We first extract the shape, wavelet, and principal component analysis (PCA) features of the PPG samples. Then, we obtain the sparse representation coding of each sample.

3.1.1. Multifeature Extraction

(1) Shape Feature. We acquire the fiducial points of PPG signals, from which forty features can be extracted [16]. These include amplitudes and time intervals of the PPG signal, such as the pulse interval, augmentation and alternative augmentation index, systolic peak time, and dicrotic notch time.

(2) Wavelet Feature. The low-frequency component of PPG signals contains the discriminative features, and the high-frequency component contains the detail features. The discrete wavelet transform extracts both the low- and high-frequency components, yielding wavelet coefficients that carry the discriminative and detailed information of PPG signals in time and frequency [17]. We choose the Daubechies wavelet Db8 as the wavelet basis, which can reduce the influence of noise.

(3) PCA Feature. A PCA feature is a linear combination of the original features; PCA reduces the dimension by mapping the high-dimensional data space onto the low-dimensional subspace spanned by the principal components. In this way, we obtain the principal component features of the PPG signal [18]. The PCA feature summarizes the most important features and compresses the scale of the original PPG signals.
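The three feature types above can be illustrated with a minimal, standard-library Python sketch. It is a simplified stand-in rather than the extraction pipeline used in this work: the shape part detects only peak amplitudes and peak-to-peak intervals (not the forty features of [16]), the wavelet part uses the Haar wavelet in place of Db8 to avoid tabulating the longer Db8 filters, and the PCA part computes only the first principal axis by power iteration. All function names and the synthetic signal are hypothetical.

```python
import math

def shape_features(signal, fs, min_distance, threshold):
    """Peak amplitudes followed by peak-to-peak intervals in seconds."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_peak = signal[i - 1] < signal[i] >= signal[i + 1] and signal[i] > threshold
        if is_peak and (not peaks or i - peaks[-1] >= min_distance):
            peaks.append(i)
    amplitudes = [signal[i] for i in peaks]
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return amplitudes + intervals

def haar_features(signal, levels):
    """Multi-level Haar DWT: approximation, then details from coarsest to finest."""
    approx, details = list(signal), []
    for _ in range(levels):
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths by repeating the last value
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + [v for d in reversed(details) for v in d]

def pca_features(rows, n_iter=100):
    """First principal axis and the scores of the mean-centered rows on it."""
    d, n = len(rows[0]), len(rows)
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - mean[k] for k in range(d)] for r in rows]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    axis = [1.0] * d  # assumed start vector; fails only if orthogonal to the leading axis
    for _ in range(n_iter):
        w = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        axis = [x / norm for x in w]
    return axis, [sum(c[k] * axis[k] for k in range(d)) for c in centered]

# Synthetic stand-in for a PPG trace: 1.25 Hz sinusoid sampled at 125 Hz for 4 s
fs = 125
ppg = [math.sin(2 * math.pi * 1.25 * n / fs) for n in range(4 * fs)]
shape = shape_features(ppg, fs, min_distance=fs // 2, threshold=0.5)
wavelet = haar_features(ppg, levels=3)
axis, scores = pca_features([[0.0, 0.0], [1.0, 1.1], [2.0, 1.9], [3.0, 3.0]])
```

In practice, a wavelet library that provides Db8 and a full PCA with several retained components would replace the Haar and single-axis simplifications.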

3.1.2. Sparse Residual Coding

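We assume that $X = [X_1, X_2, \ldots, X_c] \in \mathbb{R}^{d \times N}$ represents the PPG training samples, where class $i$ has $n_i$ samples, $c$ is the class number, $d$ is the dimension value, $X_i \in \mathbb{R}^{d \times n_i}$, and $N = \sum_{i=1}^{c} n_i$. A testing sample $y \in \mathbb{R}^{d}$ can be represented by a linear combination of training samples as

$$y = X\alpha, \tag{1}$$

where $\alpha = [0, \ldots, 0, \alpha_{i,1}, \ldots, \alpha_{i,n_i}, 0, \ldots, 0]^{T}$ is a coefficient array whose nonzero values belong to the $i$-th class. The advantage of representing the test sample as a linear combination of training samples has been explored in [7–9].

As in [9], the sparse representation coefficient of sample $y$ can be obtained as follows:

$$\hat{\alpha} = \arg\min_{\alpha} \|y - X\alpha\|_2^2 + \lambda \|\alpha\|_1, \tag{2}$$

where $\lambda$ is a regularization parameter and $\|\cdot\|_1$ is the $\ell_1$ norm.

We obtain the coefficient $\hat{\alpha}$ by solving equation (2), and the sparse residual of each class can then be obtained as

$$r_i(y) = \|y - X\delta_i(\hat{\alpha})\|_2, \tag{3}$$

where $\delta_i(\cdot)$ is a function that keeps only the coefficients of the $i$-th class and sets all others to zero, $i = 1, 2, \ldots, c$.

At last, we obtain the residual representation $p = [p_1, p_2, \ldots, p_c]^{T}$ from the class residuals $r_i(y)$ (equation (4)). If a testing sample has label $i$, then $p_i$ is larger than $p_j$ for every other label $j \neq i$, which gives more discrimination information.

To simplify the description, we use the following definition to denote the residual coding of testing sample $y$ by training samples $X$:

$$p = \mathrm{SRC}(y, X). \tag{5}$$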
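The sparse residual coding described above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the $\ell_1$-regularized problem is solved with the iterative shrinkage-thresholding algorithm (ISTA), which is one common choice of solver, and the mapping from class residuals to the coding is assumed here to be a softmax of the negative residuals, chosen only because it makes the entry of the true class the largest. The function names and the toy dictionary are hypothetical.

```python
import math

def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the l1 norm)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def sparse_code(columns, y, lam, n_iter=200):
    """ISTA for min_a ||y - X a||_2^2 + lam * ||a||_1, with X given as a list of columns."""
    n = len(columns)
    # ||X||_F^2 bounds the largest eigenvalue of X^T X, so this step size is safe
    step = 1.0 / (2.0 * sum(v * v for col in columns for v in col))
    a = [0.0] * n
    for _ in range(n_iter):
        recon = [sum(columns[j][k] * a[j] for j in range(n)) for k in range(len(y))]
        resid = [r - t for r, t in zip(recon, y)]
        grad = [2.0 * sum(c * r for c, r in zip(columns[j], resid)) for j in range(n)]
        a = [soft_threshold(a[j] - step * grad[j], lam * step) for j in range(n)]
    return a

def class_residuals(columns, labels, y, a):
    """Residual per class, reconstructing y from that class's coefficients only."""
    out = {}
    for c in sorted(set(labels)):
        recon = [sum(columns[j][k] * a[j] for j in range(len(a)) if labels[j] == c)
                 for k in range(len(y))]
        out[c] = math.sqrt(sum((t - r) ** 2 for t, r in zip(y, recon)))
    return out

def residual_coding(columns, labels, y, lam=0.01):
    """Assumed mapping: softmax of negative residuals, so the best-fitting class scores highest."""
    r = class_residuals(columns, labels, y, sparse_code(columns, y, lam))
    z = {c: math.exp(-v) for c, v in r.items()}
    total = sum(z.values())
    return {c: v / total for c, v in z.items()}

# Toy dictionary: two training samples for each of two classes (hypothetical data)
columns = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
labels = [0, 0, 1, 1]
coding = residual_coding(columns, labels, [0.95, 0.05, 0.0])
```

The query lies between the two class-0 training samples, so class 0 yields the smaller residual and therefore the larger entry in `coding`.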

3.2. Deep Cascade Coding

Deep cascade coding contains multiple levels, and each level is an SRC unit as described in Section 3.1.2. Each level receives as input the output of the previous level together with the multifeature coding, and the output of each level is the input of the next level [19], as shown in Figure 2.

After the multifeature coding, we obtain three codings of a query sample $y$ (one for each of the shape, wavelet, and PCA features) and three corresponding coding matrices of the training samples $X$. The query sample coding and the training sample codings are then taken as the input of the first-level coding.

First, one feature coding of the query sample is input into the first level to obtain a coding representation, which is concatenated with a second feature coding to form the input query sample of the second level. Similarly, each column of the corresponding training coding matrix is input into the SRC unit to obtain a set of coding vectors, which is concatenated with the second training coding matrix to form the input training samples of the second level.

Second, in a similar way, the subsequent feature codings of the query sample, each augmented by the coding obtained at the previous level, are input as the query samples of the third and fourth levels, respectively. The corresponding training coding matrices, each augmented by the training codings obtained at the previous level, are input as the training samples of the third and fourth levels, respectively. These steps are repeated $L$ times, where $L$ is the level number, and the final query coding and training coding are obtained.

Finally, the query coding and training coding obtained at the last level are input into the sparse representation classification to get the final prediction coding.

3.3. Recognition

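At the recognition stage, we obtain the label of the query sample $y$ from the final prediction coding $p = [p_1, \ldots, p_c]^{T}$ as follows:

$$\mathrm{label}(y) = \arg\max_{i \in \{1, \ldots, c\}} p_i, \tag{6}$$

where $c$ is the class number.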

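The overall process of MFDCSR is given in Algorithm 1.

Algorithm 1: MFDCSR.
Input: the PPG training sample set $X$, query sample $y$, class number $c$, regularization coefficient $\lambda$, level number $L$.
Output: the label of the query sample $y$.
(1) Initialize: i = 1;
(2) Obtain the feature coding vectors of the query sample $y$ and the feature coding matrices of the training samples $X$ by multifeature signal coding;
(3) Set the first-level query input and training input from these codings;
(4) while $i \le L$ do
(5)   Obtain the query coding using equation (5);
(6)   Obtain the training coding using equation (5);
(7)   Update the query input and the training input;
(8)   i = i + 1;
(9) end while
(10) Obtain the final feature coding vector;
(11) Find the index of the maximum value in the final feature coding vector using equation (6), which gives the label of $y$.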
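The control flow of Algorithm 1 can be sketched as below. To keep the example short and self-contained, the base unit is a stand-in (a softmax over negative distances to the class means) rather than the SRC unit; any unit that returns a class distribution could be substituted. The order in which the feature types are cycled through the levels, the reuse of the full training set when coding each training sample, and all names and data are assumptions of this sketch.

```python
import math

def unit(train, labels, query):
    """Stand-in base unit (not SRC): softmax over negative distances to class means."""
    scores = []
    for c in sorted(set(labels)):
        members = [x for x, l in zip(train, labels) if l == c]
        mean = [sum(col) / len(members) for col in zip(*members)]
        scores.append(math.exp(-math.dist(query, mean)))
    total = sum(scores)
    return [s / total for s in scores]

def cascade_predict(train_feats, labels, query_feats, levels):
    """train_feats and query_feats hold one entry per feature type (e.g. shape, wavelet, PCA)."""
    n_feats = len(train_feats)
    train_in, query_in = train_feats[0], query_feats[0]  # first-level inputs
    for i in range(1, levels + 1):
        # Code every training sample and the query with the current level's inputs
        train_code = [unit(train_in, labels, x) for x in train_in]
        query_code = unit(train_in, labels, query_in)
        f = i % n_feats  # assumed ordering: cycle through the feature types
        # Concatenate the level output with the next feature to form the next inputs
        train_in = [code + x for code, x in zip(train_code, train_feats[f])]
        query_in = query_code + query_feats[f]
    final = unit(train_in, labels, query_in)
    classes = sorted(set(labels))
    best = max(range(len(final)), key=final.__getitem__)  # index of the maximum entry
    return classes[best], final

# Hypothetical toy data: four training samples, two classes, three feature types
train_feats = [
    [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]],
    [[0.2], [0.1], [0.8], [0.9]],
    [[0.0, 0.0, 0.1], [0.1, 0.0, 0.0], [1.0, 1.0, 0.9], [0.9, 1.0, 1.0]],
]
labels = ["a", "a", "b", "b"]
query_feats = [[0.95, 0.95], [0.85], [0.95, 1.0, 0.95]]
label, final = cascade_predict(train_feats, labels, query_feats, levels=3)
```

Because each level only calls the base unit and concatenates its output, no gradient backpropagation is involved, which matches the levelwise nature of the cascade.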

4. Experiments

4.1. Databases

To verify the validity of the proposed MFDCSR, we choose three databases: Beth Israel Deaconess Medical Center (BIDMC) [20, 21], Multiparameter Intelligent Monitoring for Intensive Care (MIMIC) [22], and CapnoBase [23]. The BIDMC database contains 53 recordings, each 8 minutes long and sampled at 125 Hz, collected from subjects aged 19 to 90 years. MIMIC was captured from patients in ICUs, and its signal recordings are of different types, including PLETH, ABP, and RESP. PLETH is the PPG signal, sampled at 125 Hz. The CapnoBase database includes PPG, ECG, and other biometric recordings for 42 cases, each an 8-minute recording sampled at 300 Hz. In all experiments, we first take a 1-minute recording per subject as the experiment data and use 60% of the data for training, 30% for validation, and 10% for testing. The testing samples are disjoint from the training samples, and we report the average testing results.
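As a small illustration of the 60/30/10 protocol, the sketch below splits the samples of one subject in order. Whether the split in this work is contiguous or randomized is not stated, so the contiguous split, the function name, and the sample count are assumptions.

```python
def split_samples(samples, train=0.6, val=0.3):
    """Split in order into train/validation/test parts; the remainder is the test part."""
    n_train = round(len(samples) * train)
    n_val = round(len(samples) * val)
    return (samples[:n_train],
            samples[n_train:n_train + n_val],
            samples[n_train + n_val:])

fs = 125                     # PPG sampling frequency of BIDMC in Hz
points_per_minute = 60 * fs  # number of points in a 1-minute recording
samples = list(range(50))    # hypothetical: 50 samples cut from one subject's recording
train_set, val_set, test_set = split_samples(samples)
```

The three parts are disjoint by construction, which keeps the testing samples separate from the training samples.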

4.2. Performance Metrics

To verify the correctness and feasibility of the proposed MFDCSR, we use the subject recognition rate as the evaluation criterion, which is defined as follows:

$$\text{Recognition rate} = \frac{N_c}{N} \times 100\%, \tag{7}$$

where $N$ is the total number of testing samples and $N_c$ is the number of correctly identified testing samples.
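The subject recognition rate is simply the percentage of testing samples whose predicted label matches the true label. A minimal sketch, with a hypothetical function name and toy labels:

```python
def recognition_rate(predicted, actual):
    """Percentage of testing samples whose predicted label equals the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

rate = recognition_rate([1, 2, 3, 3], [1, 2, 3, 4])  # three of four correct
```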

4.3. Parameter Evaluation

First, we examine how the number of cascade levels influences performance. The subject recognition rates with different level numbers on the three databases are shown in Figure 3.

From Figure 3, we can see that the subject recognition rate increases as the level number grows. On BIDMC, the subject recognition rate increases slowly once the level number exceeds seven; on MIMIC and CapnoBase, it increases slowly once the level number exceeds five. Therefore, for all databases, we set the level number to seven in the proposed MFDCSR.

Then, we evaluate the influence of the sparse representation parameters on performance. As suggested in [24], we set the regularization coefficient according to the sample dimension of the PPG signals. The iteration number of the sparse representation also affects the recognition performance, and the subject recognition rate with different iteration numbers is shown in Figure 4.

From Figure 4, we can see that the subject recognition rate is better when the iteration number exceeds 50.

Finally, to evaluate how the number of cycles per sample influences performance, the subject recognition rate under different cycle numbers is shown in Table 1.

From Table 1, we can see that the recognition performance improves as the number of cycles increases on all three databases. When the number of cycles reaches 1.5, the subject recognition rate is 99.88% on CapnoBase. When the number of cycles varies from 0.5 to 1.5, the recognition performance increases quickly on all three databases. In our experiments, we use 1.5 cycles per sample, which takes about 1–3 seconds and is acceptable for practical application.

4.4. Comparison with Single-Feature MFDCSR Method

MFDCSR uses three features (shape, wavelet, and PCA), and we compare it with MFDCSR variants that use only a single feature. The recognition performance on the three databases is shown in Table 2.

As shown in Table 2, the multifeature MFDCSR outperforms every single-feature MFDCSR on the three databases, which shows that multifeature learning can enhance the recognition performance.

4.5. Robustness to Noise

To evaluate the noise robustness of the proposed MFDCSR, Gaussian noise is added to the three databases. We compare the robustness of MFDCSR with that of the shape, wavelet, and PCA methods. The subject recognition rates with different noise levels are shown in Figure 5.

From Figure 5, we can see that MFDCSR achieves better recognition performance than the other methods at the different noise levels on the three databases. By using multifeature deep cascaded sparse representation, MFDCSR extracts more discriminative information and is more robust than the other methods.
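A noise-robustness test of this kind can be set up by corrupting each signal with zero-mean Gaussian noise. The sketch below parameterizes the noise level by a target signal-to-noise ratio in dB; how the noise level is specified in this work is not stated, so the SNR parameterization, the function names, and the synthetic signal are assumptions.

```python
import math
import random

def add_gaussian_noise(signal, snr_db, rng):
    """Add zero-mean Gaussian noise scaled to a target signal-to-noise ratio in dB."""
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR of a noisy signal relative to its clean version."""
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    signal_power = sum(c * c for c in clean) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)

rng = random.Random(0)  # fixed seed so the example is repeatable
clean = [math.sin(2 * math.pi * n / 100) for n in range(5000)]  # synthetic stand-in signal
noisy = add_gaussian_noise(clean, snr_db=10.0, rng=rng)
```

Sweeping `snr_db` over several values and re-running recognition at each one would produce a curve of recognition rate against noise level.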

4.6. Comparison with State-of-the-Art Methods

In this section, we compare MFDCSR with state-of-the-art methods for PPG biometric recognition on the three databases, and the results are shown in Table 3.

From Table 3, we can see that MFDCSR outperforms the other methods on all three databases. For example, the subject recognition rate of our method increases by 0.21% on the MIMIC database, which indicates that multifeature deep cascaded sparse representation can enhance the recognition performance.

It is worth noting that MFDCSR differs from our previous work in [19]. Our previous work extracted only one feature as the feature descriptor, whereas MFDCSR extracts multiple features: shape, wavelet, and PCA. Our previous work used multiscale representation to deal with noise, whereas MFDCSR exploits the complementarity of features to improve the recognition performance. In short, MFDCSR is a multiple-feature learning method, while our previous work is a single-feature learning method.

4.7. Analysis of Computation Time

It is important to analyze the computation time of the proposed method for PPG biometric recognition, and we give the time cost of the recognition process on the CapnoBase database. We could not obtain the source code of the other methods for PPG biometric recognition, so we only give the time cost of our proposed method in Table 4. Our experiments are carried out on an Intel i7-4790 3.60 GHz CPU with 16 GB of RAM, using MATLAB 2016b.

As shown in Table 4, the feature extraction of MFDCSR is fast. The training time of MFDCSR, which does not use gradient backpropagation, is acceptable, whereas deep learning training is known to take more time.

5. Conclusions

PPG biometric recognition has attracted growing interest in recent years. In this paper, we propose a PPG biometric recognition method based on multifeature deep cascaded sparse representation. First, we extract the shape, wavelet, and PCA features of PPG signals. Second, we use SRC to obtain the sparse representation of the different features. Third, to mine more discriminative information, we perform levelwise coding learning without backpropagation. Experimental results demonstrate that the proposed method achieves better recognition performance than the compared methods. In future work, we aim to modify the cascade model to improve the feature extraction technique for PPG biometric recognition.

Data Availability

The simulated data used to support the simulation part of this study are available from the corresponding author upon request, and the real-world PPG data can be obtained from (1) https://www.physionet.org/physiobank/database/bidmc, (2) https://archive.physionet.org/cgi-bin/ATM?database=mimic2db, and (3) https://dataverse.scholarsportal.info/dataverse/capnobase.

Conflicts of Interest

The authors declare that they have no conflicts of interest.