Abstract
Collaborative representation-based classification (CRC) and its many variants have been widely applied to classification tasks in pattern recognition. To further enhance the pattern discrimination of CRC, in this article we propose a novel extension of CRC, entitled discriminative, competitive, and collaborative representation-based classification (DCCRC). In the proposed DCCRC, the class discrimination information is fully utilized to promote the true class of each testing sample so that it dominantly represents the testing sample during collaborative representation. The class discrimination information is incorporated through a newly designed discriminative $\ell_2$-norm regularization that decreases the representation ability of the other classes for each testing sample. Simultaneously, a competitive $\ell_2$-norm regularization is introduced into the DCCRC model to enhance the competitive representation ability of the true class of each testing sample. The effectiveness of the proposed DCCRC is evaluated by extensive experiments on several public face databases and real numerical UCI data sets. The experimental results demonstrate that the proposed DCCRC achieves superior performance over the state-of-the-art representation-based classification methods.
1. Introduction
Linear representation-based classification (RBC), which mainly includes sparse representation-based classification (SRC) [1] and collaborative representation-based classification (CRC) [2], has attracted increasing attention in pattern recognition. In both SRC and CRC, each testing sample is linearly represented by all the training samples and classified according to the class-specific representation residuals. Owing to their excellent classification performance, RBC methods have been widely used in many classification tasks, such as image classification [3–10] and face recognition [11–19].
It is well known that SRC with the $\ell_1$-norm regularization of the representation coefficients is a very promising kind of RBC owing to its sparsity and natural discrimination [1, 20, 21]. However, it has been argued that the representation-based pattern discrimination originates from the $\ell_2$-norm collaborative representation of all the training samples instead of the $\ell_1$-norm sparse representation of a few training samples, and the standard CRC was then proposed as a general extension of SRC [2]. Specifically, using the $\ell_2$-norm regularization of the representation coefficients, the effective discrimination benefits from the collaborative representation of all the class-specific training samples. Because CRC has an efficient closed-form solution and effective classification performance, a great many CRC extensions have been developed in recent years [6, 15, 17–19, 22–35]. Moreover, the possible reasons for the natural discrimination of CRC have been analyzed in detail from the perspectives of the class separability of data [18] and probability [22]. Among the CRC methods, common extensions are the weighted CRC methods, which use the localities of data as weights to constrain the collaborative representation coefficients [17, 27–29, 31]. Since collaborative representation is both efficient and effective for classification, several two-phase collaborative representation-based classification methods have been designed [30–33, 36]. Such two-phase methods also exhibit sparsity, which enhances pattern discrimination [30]. Combining the advantages of sparse representation and collaborative representation, extensions that integrate both have been proposed for classification [34, 35, 37, 38].
In addition, owing to the latent discrimination contained in the representation, sparse representation and collaborative representation have been used to design effective nearest neighbor classifiers [39–41].
In many recent extensions of CRC, the class discrimination information of data is fully employed to strengthen pattern classification [42–47]. From the viewpoint of probability, a probabilistic CRC (ProCRC) was developed by using a discriminative regularization of the representations between all the classes and each class [22]. Using the prior information of data, the extended ProCRC (EProCRC) was proposed in [43], and using a coarse-to-fine representation, the two-phase ProCRC was proposed in [33]. By designing a discriminative regularization on pairs of the representations of any two classes, the discriminative sparse representation method for classification (DSRC) was proposed in [44]. On the basis of DSRC and ProCRC, a novel discriminative CRC method was proposed to extend DSRC [45]. To overcome the issue that the representation and classification phases in most CRC variants are not integrated into a unified model, a collaborative and competitive representation-based classifier (CCRC) was proposed in [46]. CCRC directly includes the classification decision in its model and encourages the training samples of each class to competitively represent each testing sample. With the aim of obtaining similarly competitive representations among all the classes, a discriminative $\ell_2$-norm regularization of the representations of all the classes except any one class was designed in the competitive and collaborative representation classification method (Co-CRC) [47]. As argued in these discriminative CRC extensions, a discriminative representation is favorable for classification.
Based on the fact that the discrimination information of data can be explored to enhance pattern discrimination in collaborative representation, in this article we propose a novel discriminative, competitive, and collaborative representation-based classification method (DCCRC) that uses the discriminative representation among all the classes. The proposed DCCRC assumes that each class can discriminatively and competitively represent the testing samples. The discriminative and competitive collaborative representations among all the classes are realized by two $\ell_2$-norm regularizations in the DCCRC model. One is the newly designed $\ell_2$-norm regularization of the pairs formed by the representation from all the classes and the representations from all the classes excluding any one class. The other is the competitive $\ell_2$-norm regularization of the representations from all the classes excluding any one class [47]. To experimentally verify the classification performance of the proposed DCCRC, we compare it with the state-of-the-art RBC methods on several face databases and real numerical UCI data sets. The experiments show that the proposed method is effective and obtains better classification results than the competing RBC methods. In summary, our main contributions in this article are as follows:
(1) A new discriminative $\ell_2$-norm regularization is designed by using the representations from all the classes excluding any one class.
(2) A novel discriminative, competitive, and collaborative representation is proposed for classification by considering the discrimination information of data.
(3) Experimental analyses are reported to demonstrate the effectiveness of the proposed DCCRC.
The rest of this article is organized as follows. Section 2 briefly describes the related work. Section 3 presents the proposed DCCRC in detail and then analyzes it. Section 4 reports extensive experiments that evaluate the effectiveness of the proposed DCCRC. Finally, the conclusions of this article are given in Section 5.
2. The Related Work
In this section, we briefly review some related RBC models. First, some commonly used notations are introduced. We suppose that the set of all the training samples from C classes is denoted as $X = [X_1, X_2, \ldots, X_C] \in \mathbb{R}^{d \times N}$, where d is the dimensionality of the feature space and N and $N_i$ are the numbers of training samples from all the classes and from class i, respectively. Note that the ith column vector $x_i$ of X represents a training sample and the subset of the training samples from class i is $X_i \in \mathbb{R}^{d \times N_i}$. Besides, we assume that $y \in \mathbb{R}^{d}$ is a given testing sample to be classified. In linear representation-based classification, the testing sample y is approximately represented as $y \approx XS$, where $S = [S_1; S_2; \ldots; S_C] \in \mathbb{R}^{N}$ is the vector of all the representation coefficients corresponding to all the training samples of X and $S_i \in \mathbb{R}^{N_i}$ is the subvector of the representation coefficients from class i.
2.1. CRC
CRC is a typical linear representation-based classifier proposed in [2]. In CRC, a given testing sample y is collaboratively represented by all the training samples for classification. The CRC model is defined as
$$\hat{S} = \arg\min_{S} \|y - XS\|_2^2 + \lambda \|S\|_2^2, \quad (1)$$
where λ is a positive regularization parameter. Clearly, CRC has the closed-form solution
$$\hat{S} = Py, \quad (2)$$
where $P = (X^T X + \lambda I)^{-1} X^T$ with an identity matrix I. Using the learned $\hat{S}$, the class-specific representation residuals are determined as $r_i = \|y - X_i \hat{S}_i\|_2$. Finally, the given testing sample y is classified into the class with the minimum representation residual among all the classes.
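The CRC procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from [2]: the function name `crc_classify`, the toy data, and the use of the unnormalized residual $\|y - X_i S_i\|_2$ are assumptions made here for the example.

```python
import numpy as np

def crc_classify(X, labels, y, lam=1e-2):
    """Classify y with CRC: ridge-regularized coding over all training samples."""
    n = X.shape[1]
    # Closed-form coefficients: S = (X^T X + lam * I)^{-1} X^T y
    S = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)
    classes = np.unique(labels)
    # Class-specific residual ||y - X_i S_i||_2 for every class i
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ S[labels == c]) for c in classes
    ])
    return classes[np.argmin(residuals)], residuals

# Toy data (hypothetical): two classes clustered around e1 and e2 in R^5
rng = np.random.default_rng(0)
centers = np.eye(5)[:, :2]
X = np.hstack([centers[:, [c]] + 0.05 * rng.standard_normal((5, 4)) for c in (0, 1)])
labels = np.repeat([0, 1], 4)          # class label of every column of X
y = centers[:, 1] + 0.05 * rng.standard_normal(5)   # drawn near class 1
pred, res = crc_classify(X, labels, y)
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; when many testing samples share the same X, the projection matrix P can instead be computed once and reused.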
2.2. DSRC
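DSRC [44] is a discriminative sparse representation method with an $\ell_2$-norm regularization of the pairs of any two class-specific representations. It achieves good pattern discrimination among the different classes with sparsity. The DSRC model is defined as
$$\hat{S} = \arg\min_{S} \|y - XS\|_2^2 + \gamma \sum_{i=1}^{C} \sum_{j=1}^{C} \|X_i S_i + X_j S_j\|_2^2, \quad (3)$$
where γ is a positive regularization parameter. Through some algebraic operations, the efficient solution of S can be obtained as
$$\hat{S} = \big( (1 + 2\gamma) X^T X + 2\gamma C M \big)^{-1} X^T y, \quad (4)$$
where $M = \mathrm{diag}(X_1^T X_1, \ldots, X_C^T X_C)$ is a block-diagonal matrix. Using the learned $\hat{S}$, we compute the class-specific representation residuals $r_i = \|y - X_i \hat{S}_i\|_2$ and classify y into the class with the minimum representation residual among all the classes.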
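As a sanity check on the closed form, the following sketch assumes that the DSRC objective is $\|y - XS\|_2^2 + \gamma \sum_i \sum_j \|X_i S_i + X_j S_j\|_2^2$ and that its minimizer is $((1 + 2\gamma) X^T X + 2\gamma C M)^{-1} X^T y$ with $M$ the block-diagonal part of $X^T X$. The function names and random data are hypothetical, and the code is an illustration rather than the implementation of [44].

```python
import numpy as np

def dsrc_objective(X, labels, y, S, gamma):
    """Assumed DSRC objective: fit term plus pairwise class-representation term."""
    parts = [X[:, labels == c] @ S[labels == c] for c in np.unique(labels)]
    pair_term = sum(np.sum((p + q) ** 2) for p in parts for q in parts)
    return np.sum((y - X @ S) ** 2) + gamma * pair_term

def dsrc_coefficients(X, labels, y, gamma=1e-2):
    """Closed form S = ((1 + 2*gamma) X^T X + 2*gamma*C*M)^{-1} X^T y."""
    C = np.unique(labels).size
    XtX = X.T @ X
    # M = diag(X_1^T X_1, ..., X_C^T X_C): keep only the same-class blocks of X^T X
    M = XtX * (labels[:, None] == labels[None, :])
    return np.linalg.solve((1 + 2 * gamma) * XtX + 2 * gamma * C * M, X.T @ y)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 9))       # d = 20 features, N = 9 training samples
labels = np.repeat([0, 1, 2], 3)       # three classes, three samples each
y = rng.standard_normal(20)
S = dsrc_coefficients(X, labels, y, gamma=0.05)
J_opt = dsrc_objective(X, labels, y, S, gamma=0.05)
```

Because the objective is a convex quadratic in S, any small perturbation of the returned coefficients should not decrease its value, which gives a quick numerical check of the formula. Note that without a ridge term the system matrix is only guaranteed to be invertible when X has full column rank, which is why the toy data uses d larger than N.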
2.3. Co-CRC
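Co-CRC [47] is an extension of CRC that induces each training class to discriminatively and competitively represent each testing sample. The Co-CRC model is defined as
$$\hat{S} = \arg\min_{S} \|y - XS\|_2^2 + \beta \sum_{i=1}^{C} \|\bar{X}_i \bar{S}_i\|_2^2, \quad (5)$$
where β is a positive regularization parameter, $\bar{X}_i$ denotes the training samples excluding those from class i, and $\bar{S}_i$ is the corresponding subvector of S. The second term in equation (5) is the competitive representation constraint. Following the way of solving S in [47], the solution is obtained as
$$\hat{S} = \big( (1 + \beta (C - 2)) X^T X + \beta M \big)^{-1} X^T y, \quad (6)$$
where $M = \mathrm{diag}(X_1^T X_1, \ldots, X_C^T X_C)$. Using $\hat{S}$, the class-specific representation residuals are calculated as $r_i = \|y - X_i \hat{S}_i\|_2$, and the testing sample y is classified into the class with the minimum representation residual among all the classes.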
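The closed form of Co-CRC can be re-derived in a few lines. The sketch below assumes the objective $\|y - XS\|_2^2 + \beta \sum_i \|\bar{X}_i \bar{S}_i\|_2^2$, where $\bar{X}_i \bar{S}_i$ is the representation from all classes except class i. The helper symbol $\tilde{X}_i = [0, \ldots, X_i, \ldots, 0]$ (X with the columns of all other classes set to zero) is introduced here for convenience and is not necessarily the notation of [47]; with it, $X_i S_i = \tilde{X}_i S$, $\sum_i \tilde{X}_i = X$, and $\sum_i \tilde{X}_i^T \tilde{X}_i = M = \mathrm{diag}(X_1^T X_1, \ldots, X_C^T X_C)$.

```latex
\begin{aligned}
J(S) &= \|y - XS\|_2^2 + \beta \sum_{i=1}^{C} \|(X - \tilde{X}_i) S\|_2^2, \\
\sum_{i=1}^{C} (X - \tilde{X}_i)^T (X - \tilde{X}_i)
  &= C X^T X - 2 X^T X + M = (C - 2) X^T X + M, \\
\frac{\partial J}{\partial S}
  &= -2 X^T (y - XS) + 2 \beta \big( (C - 2) X^T X + M \big) S = 0, \\
\hat{S} &= \big( (1 + \beta (C - 2)) X^T X + \beta M \big)^{-1} X^T y .
\end{aligned}
```

For C = 1 the competitive term vanishes and the expression reduces to the ordinary least-squares solution, which is a quick consistency check on the coefficient of $X^T X$.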
3. The Proposed DCCRC
In this section, we present the proposed DCCRC method in detail. The basic idea of DCCRC is given first, then the DCCRC model and its solving procedure are described, and finally the essential properties of DCCRC are analyzed.
3.1. Idea of DCCRC
The proposed DCCRC rests on two assumptions that are inspired by the competitive and collaborative representation [47]. For clarity, the collaborative representation of the given testing sample y using all training samples is rewritten as $y \approx XS = X_i S_i + \bar{X}_i \bar{S}_i$, where $\bar{X}_i$ represents the training samples excluding those from class i and $\bar{S}_i$ is the corresponding vector of representation coefficients. The first assumption originates from the expectation that the true class of the given testing sample y dominantly represents y while the other classes contribute little to representing it (i.e., $y \approx X_i S_i$). Unfortunately, the true class of y is not known, and any one of the training classes could be its true class. In fact, we only require the training samples from one class to competitively represent y as well as possible, while the contribution of the other classes to representing y is as small as possible in the ideal case. Accordingly, the testing sample y is well represented as $X_i S_i$ from class i by simultaneously minimizing the representation $\bar{X}_i \bar{S}_i$ from the other classes. Thus, in the proposed method we introduce the competitive constraint that was first designed in [47].
In collaborative representation, all the training samples approximately represent the testing sample y as well as possible, i.e., $y \approx XS$. During the representation, if y belongs to class i with the dominant representation $X_i S_i$, the representation $\bar{X}_i \bar{S}_i$ from the other classes tends to be very small. Specifically, $X_i S_i$ tends to be equivalent to the representation $XS$ to some degree. In such an ideal case, the approximate equalities $y \approx XS \approx X_i S_i$ hold. Borrowing the idea of reducing the correlations among classes by minimizing the discriminative constraint [44], we also assume that the correlation between the representation from class i and the representation from the other classes is as small as possible. That is to say, if class i dominantly represents y with $X_i S_i$ and all the training samples well represent y with $XS$, the correlation between $XS$ and $\bar{X}_i \bar{S}_i$ should be small. Similar to the definition in [44], we design another new discriminative constraint $\|XS + \bar{X}_i \bar{S}_i\|_2^2$. Since $\|XS + \bar{X}_i \bar{S}_i\|_2^2 = \|XS\|_2^2 + \|\bar{X}_i \bar{S}_i\|_2^2 + 2 (XS)^T \bar{X}_i \bar{S}_i$, minimizing this constraint minimizes $\|XS\|_2^2$, $\|\bar{X}_i \bar{S}_i\|_2^2$, and $(XS)^T \bar{X}_i \bar{S}_i$. Minimizing $\|\bar{X}_i \bar{S}_i\|_2^2$ satisfies the first assumption. If $XS \approx X_i S_i$, minimizing $(XS)^T \bar{X}_i \bar{S}_i$ is approximately equal to minimizing $(X_i S_i)^T \bar{X}_i \bar{S}_i$, which reduces the correlation between the representation from one class and the representation from the other classes.
3.2. Model of DCCRC
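In this section, we first introduce the objective function of the proposed DCCRC model and then present the procedure for solving it in detail. The given testing sample y is represented by the collaborative representation of all the training samples, and the DCCRC model is defined as follows:
$$\hat{S} = \arg\min_{S} \|y - XS\|_2^2 + \beta \sum_{i=1}^{C} \|\bar{X}_i \bar{S}_i\|_2^2 + \gamma \sum_{i=1}^{C} \|XS + \bar{X}_i \bar{S}_i\|_2^2 + \lambda \|S\|_2^2, \quad (7)$$
where β, γ, and λ are the positive regularization parameters. In equation (7), the second term $\beta \sum_{i} \|\bar{X}_i \bar{S}_i\|_2^2$, first designed in [47], is the competitive constraint that makes each class competitively and discriminatively represent the testing sample y among all the classes. The third term $\gamma \sum_{i} \|XS + \bar{X}_i \bar{S}_i\|_2^2$ is the discriminative constraint that not only makes each class competitively represent the testing sample y but also reduces the representation correlations between one class and the other classes for more discrimination. Note that when $\beta = \gamma = 0$, DCCRC is the same as CRC, and when $\gamma = \lambda = 0$, DCCRC is the same as Co-CRC.
In order to obtain the solution of the representation coefficient vector S, equation (7) is further reformulated as
$$\min_{S} \|y - XS\|_2^2 + \beta \sum_{i=1}^{C} \|XS - \tilde{X}_i S\|_2^2 + \gamma \sum_{i=1}^{C} \|2XS - \tilde{X}_i S\|_2^2 + \lambda \|S\|_2^2, \quad (8)$$
where $\tilde{X}_i = [0, \ldots, X_i, \ldots, 0] \in \mathbb{R}^{d \times N}$ is X with the columns of all the classes other than class i set to zero, so that $X_i S_i = \tilde{X}_i S$ and $\bar{X}_i \bar{S}_i = XS - \tilde{X}_i S$. To simply solve S, let $f(S) = \sum_{i=1}^{C} \|XS - \tilde{X}_i S\|_2^2$ and $g(S) = \sum_{i=1}^{C} \|2XS - \tilde{X}_i S\|_2^2$. Firstly, the derivative of $f(S)$ with respect to S is calculated as
$$\frac{\partial f}{\partial S} = 2 \sum_{i=1}^{C} (X - \tilde{X}_i)^T (X - \tilde{X}_i) S. \quad (9)$$
Since $\sum_{i=1}^{C} \tilde{X}_i = X$, the sum $\sum_{i=1}^{C} (X - \tilde{X}_i)^T (X - \tilde{X}_i)$ can be rewritten as
$$\sum_{i=1}^{C} (X - \tilde{X}_i)^T (X - \tilde{X}_i) = (C - 2) X^T X + G, \quad (10)$$
where G is defined as
$$G = \sum_{i=1}^{C} \tilde{X}_i^T \tilde{X}_i = \mathrm{diag}(X_1^T X_1, \ldots, X_C^T X_C). \quad (11)$$
Using equations (9) and (10), $\partial f / \partial S$ is reformulated as
$$\frac{\partial f}{\partial S} = 2 \big( (C - 2) X^T X + G \big) S. \quad (12)$$
Then, the derivative of $g(S)$ with respect to S is calculated as
$$\frac{\partial g}{\partial S} = 2 \sum_{i=1}^{C} (2X - \tilde{X}_i)^T (2X - \tilde{X}_i) S. \quad (13)$$
In equation (13), using $2X - \tilde{X}_i = X + (X - \tilde{X}_i)$ and $\sum_{i=1}^{C} \tilde{X}_i = X$, the sum $\sum_{i=1}^{C} (2X - \tilde{X}_i)^T (2X - \tilde{X}_i)$ can be reformulated as
$$\sum_{i=1}^{C} (2X - \tilde{X}_i)^T (2X - \tilde{X}_i) = (3C - 2) X^T X + \sum_{i=1}^{C} (X - \tilde{X}_i)^T (X - \tilde{X}_i). \quad (14)$$
Using equations (10) and (14), equation (13) can be finally rewritten as
$$\frac{\partial g}{\partial S} = 2 \big( 4 (C - 1) X^T X + G \big) S. \quad (15)$$
Clearly, the objective function of DCCRC is $J(S) = \|y - XS\|_2^2 + \beta f(S) + \gamma g(S) + \lambda \|S\|_2^2$. Using equations (12) and (15), the derivative of the objective function with respect to S is
$$\frac{\partial J}{\partial S} = -2 X^T (y - XS) + 2 \beta \big( (C - 2) X^T X + G \big) S + 2 \gamma \big( 4 (C - 1) X^T X + G \big) S + 2 \lambda S. \quad (16)$$
Finally, we set $\partial J / \partial S = 0$, and the solution of the representation coefficient vector S in equation (7) is obtained as
$$\hat{S} = \big( (1 + \beta (C - 2) + 4 \gamma (C - 1)) X^T X + (\beta + \gamma) G + \lambda I \big)^{-1} X^T y. \quad (17)$$
After obtaining the representation coefficient vector $\hat{S}$, we calculate the class-specific representation residuals $r_i = \|y - X_i \hat{S}_i\|_2$ and determine the class label of the testing sample y as
$$\mathrm{label}(y) = \arg\min_{i} r_i. \quad (18)$$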
That is to say, the given testing sample y is classified into the class with the minimal representation residual among all the classes. Following the description of the model above, the proposed DCCRC is summarized in Algorithm 1.
Algorithm 1: The proposed DCCRC.
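The solving and classification steps can be sketched in NumPy as follows. This is a minimal sketch and not the authors' implementation: it assumes the DCCRC objective $\|y - XS\|_2^2 + \beta \sum_i \|\bar{X}_i \bar{S}_i\|_2^2 + \gamma \sum_i \|XS + \bar{X}_i \bar{S}_i\|_2^2 + \lambda \|S\|_2^2$, whose stationary point is $\big( (1 + \beta (C - 2) + 4 \gamma (C - 1)) X^T X + (\beta + \gamma) G + \lambda I \big)^{-1} X^T y$ with $G$ the block-diagonal part of $X^T X$. All function names, default parameter values, and the toy data are hypothetical.

```python
import numpy as np

def dccrc_objective(X, labels, y, S, lam, beta, gamma):
    """Assumed DCCRC objective evaluated at coefficient vector S."""
    full = X @ S                                        # representation from all classes
    J = np.sum((y - full) ** 2) + lam * np.sum(S ** 2)
    for c in np.unique(labels):
        rest = X[:, labels != c] @ S[labels != c]       # all classes except class c
        J += beta * np.sum(rest ** 2) + gamma * np.sum((full + rest) ** 2)
    return J

def dccrc_coefficients(X, labels, y, lam, beta, gamma):
    """Closed-form minimizer of the assumed objective."""
    C = np.unique(labels).size
    XtX = X.T @ X
    G = XtX * (labels[:, None] == labels[None, :])      # diag(X_1^T X_1, ..., X_C^T X_C)
    A = (1 + beta * (C - 2) + 4 * gamma * (C - 1)) * XtX + (beta + gamma) * G
    return np.linalg.solve(A + lam * np.eye(X.shape[1]), X.T @ y)

def dccrc_classify(X, labels, y, lam=1e-2, beta=1e-1, gamma=1e-1):
    """Return the class with the smallest residual ||y - X_i S_i||_2 and S."""
    S = dccrc_coefficients(X, labels, y, lam, beta, gamma)
    classes = np.unique(labels)
    residuals = np.array([
        np.linalg.norm(y - X[:, labels == c] @ S[labels == c]) for c in classes
    ])
    return classes[np.argmin(residuals)], S

# Toy data (hypothetical): two classes clustered around e1 and e2 in R^5
rng = np.random.default_rng(0)
centers = np.eye(5)[:, :2]
X = np.hstack([centers[:, [c]] + 0.05 * rng.standard_normal((5, 4)) for c in (0, 1)])
labels = np.repeat([0, 1], 4)
y = centers[:, 1] + 0.05 * rng.standard_normal(5)       # drawn near class 1
pred, S = dccrc_classify(X, labels, y)
J_opt = dccrc_objective(X, labels, y, S, 1e-2, 1e-1, 1e-1)
```

The mask `labels[:, None] == labels[None, :]` builds G without requiring the columns of X to be grouped by class. Setting `beta = gamma = 0` reduces the system to the CRC solution, which mirrors the special cases of the model.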
3.3. Analysis of DCCRC
In this section, we first further analyze the competitive and discriminative terms of the proposed DCCRC method in order to explain its stronger pattern discrimination. Then, the differences between the proposed DCCRC and both Co-CRC and DSRC are discussed.
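Following the analysis of the competitive representation in [47], $\bar{X}_i \bar{S}_i$ in the competitive term can be rewritten as $XS - X_i S_i$, and we obtain the equality
$$\|\bar{X}_i \bar{S}_i\|_2^2 = \|XS\|_2^2 + \|X_i S_i\|_2^2 - 2 (XS)^T X_i S_i. \quad (19)$$
Assume that the angle between $XS$ and $X_i S_i$ is α. Using equation (19), $\|\bar{X}_i \bar{S}_i\|_2^2$ can be obtained as
$$\|\bar{X}_i \bar{S}_i\|_2^2 = \|XS\|_2^2 + \|X_i S_i\|_2^2 - 2 \|XS\|_2 \|X_i S_i\|_2 \cos\alpha. \quad (20)$$
According to equation (20), when $\|\bar{X}_i \bar{S}_i\|_2^2$ is minimized, $X_i S_i$ approaches $XS$ with the same direction. In this ideal case, the given testing sample y is dominantly represented by $X_i S_i$ from class i, to which y truly belongs. Thus, minimizing the competitive term has two advantages. One is that each class competitively represents the testing sample y. The other is that the true class of y competitively represents it while the other classes represent it poorly.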
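The relation between the competitive term and the angle α can be checked numerically. The short sketch below assumes the rewriting $\bar{X}_i \bar{S}_i = XS - X_i S_i$ and the law of cosines; the random data and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 9))        # 9 training samples in R^6
labels = np.repeat([0, 1, 2], 3)       # three classes
S = rng.standard_normal(9)             # arbitrary coefficient vector

full = X @ S                                        # XS
part = X[:, labels == 0] @ S[labels == 0]           # X_i S_i for class i = 0
rest = X[:, labels != 0] @ S[labels != 0]           # representation of the other classes

# Cosine of the angle between XS and X_i S_i
cos_alpha = full @ part / (np.linalg.norm(full) * np.linalg.norm(part))

lhs = np.linalg.norm(rest) ** 2
rhs = (np.linalg.norm(full) ** 2 + np.linalg.norm(part) ** 2
       - 2 * np.linalg.norm(full) * np.linalg.norm(part) * cos_alpha)
```

The two sides agree up to floating-point error for any S, so driving the competitive term toward zero forces $X_i S_i$ toward $XS$ in both length and direction.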
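Moreover, $XS + \bar{X}_i \bar{S}_i$ in the discriminative term can be reformulated as $2XS - X_i S_i$. Through simple algebra on $\|2XS - X_i S_i\|_2^2$, we can obtain a relation of the same form as equation (20). This means that minimizing the discriminative term has advantages very similar to those of minimizing the competitive term. In addition, we can rewrite $\|XS + \bar{X}_i \bar{S}_i\|_2^2$ as $\|XS\|_2^2 + \|\bar{X}_i \bar{S}_i\|_2^2 + 2 (XS)^T \bar{X}_i \bar{S}_i$. Minimizing $\|XS + \bar{X}_i \bar{S}_i\|_2^2$ therefore simultaneously minimizes $\|XS\|_2^2$, $\|\bar{X}_i \bar{S}_i\|_2^2$, and $(XS)^T \bar{X}_i \bar{S}_i$. We can see that, besides minimizing $\|\bar{X}_i \bar{S}_i\|_2^2$, minimizing $(XS)^T \bar{X}_i \bar{S}_i$ reduces the correlation between $XS$ and $\bar{X}_i \bar{S}_i$ [44]. That is to say, minimizing the discriminative term reduces the correlation between one class and the other classes, which enhances pattern discrimination and the competitive representations among all the classes. Thus, the competitive and discriminative terms together yield a competitive and discriminative collaborative representation that is favorable for classification. Besides, the pattern discrimination among all the classes is intuitively verified in the next section.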
The differences between the proposed DCCRC and both Co-CRC and DSRC can be analyzed by comparing their models (i.e., equation (7) for DCCRC, equation (5) for Co-CRC, and equation (3) for DSRC). According to equations (3) and (7), DCCRC is very different from DSRC, although both have similar discriminative terms. The discriminative term in DSRC reduces the correlations between any two classes for favorable pattern discrimination, whereas the discriminative term in DCCRC reduces the correlations between any one class and all the other classes, which competitively enhances the discriminative representation from each class for classification. Besides, compared with DSRC, the proposed DCCRC also has the competitive constraint and the regularization of the representation coefficients. Furthermore, the proposed DCCRC is an extension of Co-CRC because DCCRC and Co-CRC share the same competitive constraint. In contrast with Co-CRC, the proposed DCCRC also has the designed discriminative constraint and the regularization of the representation coefficients, so that DCCRC further enhances the competitive representations among all the classes. Thus, the proposed DCCRC is expected to have more pattern discrimination than DSRC and Co-CRC, which is experimentally verified in the next section.
4. Experiments
In this section, extensive experiments are conducted on several face databases and real numerical UCI data sets. In the experiments, we compare the proposed DCCRC with the state-of-the-art RBC methods including SRC [1], CRC [2], CCRC [46], Co-CRC [47], DSRC [44], ProCRC [22], and EProCRC [43]. It should be noted that the regularization parameters of all the competing methods are selected from the same preset range for fair comparisons. The reported result of each competing method is the best one obtained over this range of its parameters.
4.1. Data Sets
In this section, we briefly describe the data sets used, including the AR, YaleB, IMM, Yale, and PIE29 face databases and the real UCI data sets. The YaleB database (http://vision.ucsd.edu/leekc/ExtYaleDatabase/ExtYaleB.html) was taken under different poses and uncontrolled illumination conditions. The Yale database (http://cvc.yale.edu/projects/yalefaces/yalefaces.html) was taken with different facial expressions. The AR database (http://www2.ece.ohio-state.edu/aleix/ARdatabase.html) was taken with various facial expressions and illumination conditions, and we use a subset of AR with 1400 images from 100 subjects. The IMM database (http://www.imm.dtu.dk/∼aam/datasets/datasets.html) contains 240 annotated monocular images from 40 subjects. The PIE29 database (http://www.intbox.com/public/project/4742/) was taken under different conditions including 13 postures, 43 lights, and 4 expressions. In the experiments, each image is cropped and resized to a common size with 256 gray levels per pixel, and the gray-level values are normalized to [0, 1]. The numbers of total samples, classes, samples per class, and chosen training samples per class are shown in Table 1. As an example, the image samples of one subject from each face database are shown in Figure 1.
Figure 1: Image samples of one subject from each face database, panels (a)–(e).
The eight real UCI data sets used were downloaded from the UC Irvine Machine Learning Repository (UCI) (http://archive.ics.uci.edu/ml). They are “Wine,” “Vehicle,” “Auto MPG,” “Statlog (Heart),” “Statlog (Australian Credit Approval),” “Credit Approval,” “Isolet,” and “Ionosphere.” Note that “Auto MPG,” “Statlog (Heart),” “Statlog (Australian Credit Approval),” “Credit Approval,” and “Ionosphere” are abbreviated as “Auto,” “Heart,” “SCredit,” “Credit,” and “Iono,” respectively. The numbers of total samples, classes, attributes, and training samples per class are displayed in Table 2. In the experiments, each sample on these UCI data sets is also normalized to [0, 1]. Furthermore, each face and UCI data set is randomly divided into training and testing sets ten times, and the numbers of training samples chosen from each class are shown in Tables 1 and 2.
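The preprocessing and data division described above can be sketched as follows. The sketch assumes that normalization means per-attribute min-max scaling to [0, 1] and that each random division draws a fixed number of training samples from every class; both are assumptions about details the text does not spell out, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def min_max_normalize(data):
    """Scale every attribute (column) to [0, 1]; constant columns map to 0."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)      # avoid division by zero
    return (data - lo) / span

def random_split(labels, n_train, rng):
    """Pick n_train random training indices per class; the rest are test indices."""
    train = np.concatenate([
        rng.permutation(np.flatnonzero(labels == c))[:n_train]
        for c in np.unique(labels)
    ])
    test = np.setdiff1d(np.arange(labels.size), train)
    return np.sort(train), test

rng = np.random.default_rng(0)
data = rng.uniform(-3, 7, size=(30, 4))         # 30 samples with 4 attributes (rows = samples)
labels = np.repeat([0, 1, 2], 10)               # three classes, ten samples each
data01 = min_max_normalize(data)
splits = [random_split(labels, n_train=4, rng=rng) for _ in range(10)]   # ten divisions
```

Accuracies would then be averaged over the ten `(train, test)` index pairs. Here samples are stored as rows, so `data01[train].T` gives the column-wise training matrix X used by the representation-based classifiers.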
4.2. Experiment 1
In this section, we first conduct the experiments to analyze the competitive term and the discriminative term by varying the values of the parameters and in the proposed DCCRC on the five face databases. The values of the parameters , , and are preset as , and the numbers of training samples per class are chosen as on AR, on IMM, on YaleB, on Yale, and on PIE29. For visual comparisons, the model without is denoted as DCCRC1, and the model without is denoted as DCCRC2. Accordingly, we compare DCCRC1 with DCCRC to demonstrate the discrimination of the term by varying the values of the parameter . And we compare DCCRC2 with DCCRC to demonstrate the discrimination of the term by varying the values of the parameter . It should be noted that the values of the parameters and are optimal with best classification accuracies when DCCRC1 is compared with DCCRC, and the values of the parameters and are optimal with best classification accuracies when DCCRC2 is compared with DCCRC. For conveniently presenting the values of and in the figures, we use and (i.e., the values of and correspond to that of and , respectively).
The classification accuracies of DCCRC1 and DCCRC with varying are shown in Figure 2, and the ones of DCCRC2 and DCCRC with varying are shown in Figure 3. From the experimental results in Figure 2, we can see that DCCRC with significantly performs better than DCCRC1 without , and DCCRC is more robust to the variations of than DCCRC1. As shown in Figure 3, we can also observe that DCCRC with significantly performs better than DCCRC2 without and DCCRC is more robust to the variations of than DCCRC2. In addition, the classification performance of DCCRC1 with variations of and DCCRC2 with variations of shows that the terms and can improve the power of the pattern discrimination. The experimental results in two figures imply that the proposed DCCRC has effective and robust classification performance. As a consequence, the more pattern discrimination of the proposed DCCRC originated from the competitive and discriminative terms is well verified.
Figure 2: Classification accuracies of DCCRC1 and DCCRC on the five face databases, panels (a)–(e).
Figure 3: Classification accuracies of DCCRC2 and DCCRC on the five face databases, panels (a)–(e).
And then, we visually verify the discriminative ability of the proposed DCCRC method in comparison with the competitive CRC method (i.e., Co-CRC). As discussed in Section 3.3, we define the class-specific representation contribution for the given testing sample y as
Clearly, both DCCRC and Co-CRC classify each testing sample into the class with the largest among all the classes. Then, the pattern discrimination ability of both is intuitively represented by the representation reconstructive images for the given testing samples from class 26 in IMM and class 9 in AR. The first five representation reconstructive images of the testing samples corresponding to the top five largest representation contributions are illustrated in Figure 4. Note that the numbers in the bracket under each reconstructive image are the class and its representation contribution . For example, under the reconstructive image means the class 26 has the representation contribution . As can be seen in Figure 4, the proposed DCCRC correctly represents and classifies the testing samples, but Co-CRC wrongly represents and classifies them. Moreover, we can observe that the first reconstructive image that is reconstructed by the class with the largest representation contribution in Co-CRC is very similar to the testing image on each face database. Through the experimental illustrations in Figure 4, the proposed DCCRC is more discriminative than Co-CRC for classification. This means the designed term is discriminative. Therefore, it can be concluded that the proposed DCCRC has the effective and robust classification due to the competitive and discriminative constraints.
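Figure 4: Representation reconstructive images of the testing samples from IMM and AR for DCCRC and Co-CRC, panels (a) and (b).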
4.3. Experiment 2
In this section, we compare the proposed DCCRC with the competing methods on the face databases and the UCI data sets. The experimental results of each competing method are the averages of the classification accuracies over the ten random divisions of each data set. The best classification accuracy of each method is selected over the range of its parameters, and the preset numbers of training samples per class on each data set are shown in Tables 1 and 2.
The classification accuracies of all the competing methods are shown in Table 3 for the face databases and in Table 4 for the UCI data sets. Note that the best classification performance among all the methods on each data set is indicated in boldface. As shown in the two tables, the classification accuracies of each competing method generally increase with the number of training samples per class on all the data sets. On the face databases, the proposed DCCRC achieves the best classification accuracies in nearly all cases, although its improvement over some methods is not significant. As displayed in Table 4, the proposed DCCRC performs significantly better than the other competing methods on the UCI data sets. In addition, from the classification results in the two tables, we can observe that CCRC, Co-CRC, DSRC, ProCRC, and EProCRC obtain similar and competitive classification performance. A possible reason is that these methods fully employ the class-specific representations in the collaborative representation to improve the pattern discrimination among all the classes. As a consequence, we can conclude that our DCCRC method is a promising representation-based classifier for pattern classification in terms of effectiveness and robustness.
4.4. Experiment 3
In this section, we conduct experiments on IMM and Yale to compare the proposed DCCRC with the competing methods when the testing samples are corrupted. In the experiments, the number l of training samples per class is preset for IMM and for Yale, and the remaining samples of each class are used as the testing samples. The classification results of each competing method are the averages of the recognition accuracies over ten training and testing divisions of the data. The testing samples of each class are corrupted in two ways: random pixel corruption and random block occlusion with a panda image. That is to say, for pixel corruption, some randomly chosen pixels of each testing image are replaced by random gray-scale values between 0 and 255, and for block occlusion, a randomly located part of each testing image is occluded by the panda image. The ratio of the corrupted size to the original size of each testing image ranges from 0.1 to 0.4 with a step of 0.1. As an example, the testing samples from one class with random pixel corruptions are shown in Figure 5 and those with random occlusions are shown in Figure 6.
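The two corruption schemes can be sketched as follows. This is an illustration under stated assumptions: images are taken to be 8-bit gray-scale arrays, the corruption ratio is interpreted as a fraction of the image area, the occluding block keeps the aspect ratio of the image, and the occluding picture is resized by nearest-neighbor sampling. The function names and the constant stand-in patch (used in place of the panda image) are hypothetical.

```python
import numpy as np

def corrupt_pixels(img, ratio, rng):
    """Replace a `ratio` fraction of pixels with random gray values in 0..255."""
    flat = img.copy().ravel()
    n = int(round(ratio * flat.size))
    idx = rng.choice(flat.size, size=n, replace=False)   # distinct pixel positions
    flat[idx] = rng.integers(0, 256, size=n)
    return flat.reshape(img.shape)

def occlude_block(img, patch, ratio, rng):
    """Paste `patch`, resized to cover about `ratio` of the area, at a random location."""
    h, w = img.shape
    bh = max(1, int(round(h * np.sqrt(ratio))))          # block height
    bw = max(1, int(round(w * np.sqrt(ratio))))          # block width
    top = rng.integers(0, h - bh + 1)
    left = rng.integers(0, w - bw + 1)
    # Nearest-neighbor resize of the patch to the block size
    rows = np.arange(bh) * patch.shape[0] // bh
    cols = np.arange(bw) * patch.shape[1] // bw
    out = img.copy()
    out[top:top + bh, left:left + bw] = patch[np.ix_(rows, cols)]
    return out

rng = np.random.default_rng(0)
img = np.zeros((32, 32), dtype=np.uint8)                 # stand-in testing image
patch = np.full((10, 10), 255, dtype=np.uint8)           # stand-in for the occluding picture
noisy = corrupt_pixels(img, 0.2, rng)
occluded = occlude_block(img, patch, 0.25, rng)
```

Running either function for ratios 0.1, 0.2, 0.3, and 0.4 on every testing image reproduces the kind of corruption levels described above; the training images are left untouched.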
Figure 5: Testing samples from one class with random pixel corruptions, panels (a) and (b).
Figure 6: Testing samples from one class with random block occlusions, panels (a) and (b).
The classification results of the competing methods on IMM and Yale with varying ratios of random corruption are shown in Table 5 for random occlusions and in Table 6 for randomly corrupted pixels. Note that the best classification performance among all the methods in the two tables is indicated in boldface. As listed in the two tables, the classification accuracies of each competing method decrease as the ratio of the corrupted size of each testing image increases. From these experimental results, we can see that the proposed DCCRC is the most robust among the competing methods in nearly all cases, since it outperforms the other competing methods. Thus, the proposed method is more effective and robust when the data are noisy.
5. Conclusions
Collaborative representation-based classification is a typical technique for pattern recognition. To further improve pattern discrimination in collaborative representation, we design a new discriminative, competitive, and collaborative representation-based classification method (DCCRC) in this article. The proposed DCCRC extends the Co-CRC method, and its main novelty is a discriminative regularization of the collaborative representation from all the classes paired with the representations from all the classes excluding any one class. The proposed method fully utilizes the class-specific representations in collaborative representation and competitively and discriminatively enhances them for good classification. Extensive experiments on several face databases and UCI data sets are conducted to verify the effectiveness and robustness of the proposed DCCRC method. In comparison with the state-of-the-art representation-based classification methods, the proposed DCCRC outperforms the competing methods. Thus, the proposed DCCRC is an effective and robust classifier for pattern recognition. In future work, we will apply the idea of competitive and collaborative representation among all the classes to other kinds of classifiers.
Data Availability
The UCI and face data used to support the findings of this study are available in their corresponding public repositories. The websites from which the data can be downloaded are given in the article.
Conflicts of Interest
All the authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (Grant nos. 61976107, 61962010, and 61502208), Natural Science Foundation of Jiangsu Province of China (Grant nos. BK20150522 and BK20170558), International Postdoctoral Exchange Fellowship Program of China Postdoctoral Council (no. 20180051), Research Foundation for Talented Scholars of Jiangsu University (Grant no. 14JDG037), China Postdoctoral Science Foundation (Grant no. 2015M570411), Open Foundation of Artificial Intelligence Key Laboratory of Sichuan Province (Grant no. 2017RYJ04), Natural Science Foundation of Guizhou Province (nos. [2017]1130 and [2017]5726-32), Natural Science Foundation of Ningxia of China (no. 2019AAC03122), and Key Science and Research Project of North Minzu University (no. 2019KJ43).