Abstract
In this paper, we propose a novel robust algorithm for image recovery via affine transformations, the weighted nuclear norm $L_{*,w}$, and the $L_{2,1}$ norm. The new method uses a spatial weight matrix to account for correlated samples in the data, and the two norms to handle extreme values in high-dimensional images and to alleviate the potential effects of outliers and heavy sparse noise, which makes the approach more resilient to outliers and large variations in high-dimensional images in signal processing. The determination of the parameters involved and of the affine transformations is cast as a convex optimization problem. To mitigate the computational complexity, an iteratively reweighted alternating direction method of multipliers (ADMM) is used to derive a new set of recursive equations that update the optimization variables and the affine transformations iteratively in a round-robin manner. Experiments on various public databases indicate that the new algorithm is more accurate than the state-of-the-art works it is compared with.
1. Introduction
Robust methods have been successfully applied to numerous computer vision tasks, including face recognition [1], signal processing, scene categorization [2], point cloud segmentation using image processing [3, 4], and object detection [5]. Image representation, mainly face recovery and alignment, has been an important research topic with applications in a variety of areas such as surveillance systems, sparse coding, image denoising, communications, computational imaging, and computer vision [6–12]. However, analyzing visual data is difficult because of adverse effects such as illumination changes, outliers, and sparse noise. It is thus important to develop a new approach for image alignment and recovery via convex optimization that is resilient to these adverse effects.
Since the pioneering work on robust principal component analysis (RPCA) by Candes et al. [13], a myriad of algorithms have been proposed for robust sparse low-rank image recovery, e.g., [14, 15]. However, these methods do not work well when the outliers and heavy sparse noises are heavily skewed. By assuming that the dictionary images are registered, Wagner et al. [16] parameterize the misalignment of the test image with an affine transformation. These parameters are optimized using generalized Gauss–Newton methods after linearizing the affine transformation constraints. By minimizing the sparse registration error iteratively and sequentially for each class, their framework is able to deal with a large range of variations in translation, scaling, rotation, and even 3D pose. Due to the adoption of holistic features, sparse coding is more robust and less likely to overfit. In [7, 17], an algorithm using sparsity priors for image processing was proposed. To improve the recovery of the low-rank part, Oh et al. [18] proposed a partial singular value thresholding (PSVT) algorithm, which replaces the nuclear norm in RPCA [13] with the partial sum of singular values. Lu et al. [19] proposed a tensor robust principal component analysis (T-RPCA) algorithm to find the clean low-tubal-rank component. However, T-RPCA is neither scalable nor robust when the number of tensors becomes large. To tackle the potential effects of outliers, heavy sparse noises, and impulse noises, several algorithms based on different norms have been proposed, for instance, the norm by weakly convex optimization [20, 21], the norm [22], and a matrix completion technique without prior rank information [23–26], which recovers images by pruning out the potential impact of outliers and heavy sparse noises.
However, these existing methods need to be improved to recover high-dimensional images more faithfully, particularly in signal processing.
This paper proposes a new robust algorithm via affine transformations, the $L_{*,w}$ and $L_{2,1}$ norms, and a spatial weight matrix to reduce the potential impacts of outliers and noises in image and signal processing. To be more resilient to adverse effects such as occlusions and outliers, the new approach combines affine transformations with the $L_{*,w}$ and $L_{2,1}$ norms for a more faithful low-rank sparse image representation. Consequently, distorted or misaligned images can be rectified by affine transformations to render more accurate robust sparse coding for image representation. The overall problem is first cast as a convex optimization problem, in which the affine transformations, low-rank sparse coding, and subspace recovery are carried out simultaneously. Additionally, the weighted nuclear norm and the $L_{2,1}$ norm are taken into account to prune out the potential impacts of outliers and extreme values in the datasets. Afterward, the iteratively reweighted alternating direction method of multipliers (ADMM) is employed, and a new set of equations is established to update the optimization variables and affine transformations iteratively in a round-robin manner. Simulation results reveal that the proposed approach outperforms the state-of-the-art works for face recovery on some public datasets.
The major contributions of this paper include the following:
(1) The affine transformations are used to correct and align distorted or misaligned images.
(2) The iteratively reweighted nuclear norm model is combined with the $L_{2,1}$ norm and the spatial weight matrix to find the true underlying images and thereby tackle the potential impacts of outliers and heavy sparse noises in signal processing.
(3) Taking the potential effects of outliers and heavy sparse noises into account, the iteratively reweighted ADMM approach is used to solve the convex optimization problem, and a new set of updating equations is developed to iteratively update the optimization parameters and affine transformations.
(4) The $L_{2,1}$ and weighted nuclear norms, together with the spatial weight matrix, are incorporated instead of the $L_1$ norm to prune out the potential impacts of noises in signal processing. Integrating the $L_{2,1}$ and weighted nuclear norms into the low-rank representation suppresses the effect of noisy data and enhances the quality of the recovered images. As a result, the proposed model can be used for image recovery and alignment simultaneously.
(5) We conduct experiments on several benchmark datasets, and the experimental results demonstrate the effectiveness of the new method.
This paper is structured as follows. Section 2 describes the formulation of the new problem. Section 3 presents the new set of updating equations that solve the formulated convex optimization problem, and experimental simulation results are provided in Section 4 to justify the effectiveness of the proposed method. Section 5 draws some concluding remarks. A comparison of the proposed method with other related approaches is summarized in Table 1.
2. Problem Formulation
Consider $n$ images, $\{I_i\}_{i=1}^{n} \in \mathbb{R}^{w \times h}$, where $w$ and $h$ denote the width and height of the images, respectively. All of these images contain the same objects and are highly correlated with each other. In many scenarios, these images are corrupted by occlusions and outliers. We can stack these images into a matrix: $D = [\operatorname{vec}(I_1), \ldots, \operatorname{vec}(I_n)] \in \mathbb{R}^{m \times n}$, where $\operatorname{vec}(\cdot)$ denotes the vector stacking operator and $m = w \times h$. We can decompose $D$ into the sum of a low-rank component and a sparse error matrix [41, 42]: $D = A + E$, where $A$ is a clean low-rank matrix and $E$ denotes a sparse error matrix incurred by outliers or corruptions.
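The stacking step can be sketched in a few lines. The following is a minimal illustration, assuming NumPy and row-major flattening for the vector stacking operator; the function name `stack_images`, the variable names `D`, `A`, and `E`, and the toy data are hypothetical and not taken from the paper.

```python
import numpy as np

def stack_images(images):
    """Stack equally sized 2-D images as the columns of a data matrix."""
    # vec(.) is taken here as row-major flattening; any fixed ordering works
    # as long as it is the same for every image.
    columns = [np.asarray(img, dtype=float).reshape(-1) for img in images]
    return np.stack(columns, axis=1)  # shape (w * h, n)

# Toy data (hypothetical): five copies of one 8x6 pattern form the low-rank
# part, and a single gross outlier pixel forms the sparse part.
rng = np.random.default_rng(0)
base = rng.random((8, 6))
images = [base.copy() for _ in range(5)]
images[2][0, 0] += 10.0  # corrupt one pixel of the third image

D = stack_images(images)      # observed data matrix
A = stack_images([base] * 5)  # clean low-rank component (rank 1 here)
E = D - A                     # sparse error component
```

In this toy case the decomposition is known by construction; the algorithm in the paper has to estimate both components from the observed matrix alone.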
In practice, the images $I_i$ are generally not well aligned, which makes the above low-rank sparse decomposition imprecise. To account for this, inspired by [39, 43, 44], we apply affine transformations $\tau_1, \ldots, \tau_n$ to the potentially misaligned input images to get a collection of transformed images $I_1 \circ \tau_1, \ldots, I_n \circ \tau_n$, where the operator $\circ$ indicates the transformation. We can then stack these aligned images into a matrix $D \circ \tau = [\operatorname{vec}(I_1 \circ \tau_1), \ldots, \operatorname{vec}(I_n \circ \tau_n)]$ and obtain $D \circ \tau = A + E$. The aligned images can be treated as samples taken from a union of low-dimensional subspaces; if well aligned, they should exhibit a low-rank subspace structure, that is, the rank of the transformed images is as small as possible, up to some outliers and heavy sparse errors. Solving for the variables under the constraint $D \circ \tau = A + E$ is intractable due to its nonlinearity. To resolve this dilemma, we assume that the change produced by these affine transformations is small and that an initial estimate of $\tau$ is known. We can then linearize by using the first-order Taylor approximation as $D \circ (\tau + \Delta\tau) \approx D \circ \tau + \sum_{i=1}^{n} J_i \Delta\tau \epsilon_i \epsilon_i^{T}$, where $D \circ \tau \in \mathbb{R}^{m \times n}$ is the transformed image, $\Delta\tau \in \mathbb{R}^{p \times n}$ with $p$ being the number of variables, $J_i = \frac{\partial \operatorname{vec}(I_i \circ \tau_i)}{\partial \tau_i} \in \mathbb{R}^{m \times p}$ denotes the Jacobian of the $i$th image with respect to $\tau_i$, and $\{\epsilon_i\}$ is the standard basis for $\mathbb{R}^{n}$. In this way, we obtain approximate transformations to recover the low-rank component and sparse noises from high-dimensional images.
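To see why a first-order approximation is useful, consider a one-parameter stand-in for an affine transformation: translating a 1-D signal. The sketch below assumes NumPy, linear interpolation as the warp, and a finite-difference Jacobian. These are all illustrative choices rather than the authors' implementation, and the names `warp` and `jacobian` are hypothetical.

```python
import numpy as np

def warp(signal, tau):
    """Translate a 1-D signal by tau samples using linear interpolation."""
    x = np.arange(signal.size, dtype=float)
    return np.interp(x + tau, x, signal)

def jacobian(signal, tau, h=1e-4):
    """Central finite-difference derivative of the warp with respect to tau."""
    return (warp(signal, tau + h) - warp(signal, tau - h)) / (2.0 * h)

x = np.linspace(0.0, 2.0 * np.pi, 200)
signal = np.sin(x)
tau, delta = 3.0, 0.4  # current estimate and a small update

exact = warp(signal, tau + delta)
# First-order Taylor approximation around the current estimate tau.
linearized = warp(signal, tau) + jacobian(signal, tau) * delta

err_linearized = np.linalg.norm(exact - linearized)
err_uncorrected = np.linalg.norm(exact - warp(signal, tau))
```

For a small update the linearized warp stays much closer to the exact warp than the uncorrected one, which is the property the iterative scheme relies on when it re-linearizes around each new estimate.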
To make the new approach more resilient to outliers and heavy sparse noises, the $L_{2,1}$ norm, which combines the advantages of the $L_1$ and $L_2$ norms, is used to promote the sparsity and low-rank properties. It can also tackle sparse errors in data points that are highly correlated across all data points in the images. In [8, 38], joint dictionary learning methods are used, but the affine transformation is not considered, while [45, 46] used an image transformation without considering the $L_{2,1}$ and weighted nuclear norms. Moreover, the $L_{2,1}$ regularizer is considered a rotationally invariant version of the $L_1$ norm and handles the collinearity between features, which helps overcome the difficulty of robustness to outliers [47, 48]. To overcome an inherent shortcoming of the nuclear norm, namely the equal penalization of each singular value regardless of its magnitude [13], the weighted nuclear norm is built on premises similar to those of the weighted version of the $L_1$ norm [49] and has been shown to provide significant merits in terms of data recovery performance. Our objective is to recover the low-rank and sparse components by solving a convex program whose objective combines the weighted nuclear norm and the $L_{2,1}$ norm. Incorporating the weighted nuclear and $L_{2,1}$ norms along with a set of affine transformations, and further taking the spatial weight matrix into account, boosts the performance of the algorithm in tackling the potential impacts of outliers, noises, and heavy sparse noises between images. The new method can thus be posed as the convex optimization problem (1), where denotes the weighted version of the nuclear norm of , in which indicates the singular values of , denotes the regularization parameters, and denotes the norm of , and and are used to balance the importance of the two types of low-rank priors. As shown analytically in [49], the weighted nuclear norm is convex.
An interesting case arises when a reweighted version of this is adopted by defining the weights as in (2), where is a small constant. It should be noted that, by setting , the weighted nuclear norm becomes concave, penalizing smaller singular values more heavily and larger ones less, and the first weighted nuclear norm term in (1) forces the low-rank component to lie in the low-dimensional subspaces. The fourth term regularizes the error to be sparse.
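A common way to realize such a reweighting, assumed here for illustration rather than taken from the paper, is to set each weight inversely proportional to the corresponding singular value plus a small constant, so that small singular values are penalized more heavily than large ones. The sketch below uses NumPy; the names `reweight`, `weighted_nuclear_norm`, and `eps` are hypothetical.

```python
import numpy as np

def reweight(singular_values, eps=1e-3):
    """Assumed rule: weights inversely proportional to the singular values."""
    return 1.0 / (singular_values + eps)

def weighted_nuclear_norm(X, weights):
    """Sum of the singular values of X, each scaled by its weight."""
    sigma = np.linalg.svd(X, compute_uv=False)  # sorted in descending order
    return float(np.sum(weights * sigma))

X = np.diag([5.0, 1.0, 0.1])
sigma = np.linalg.svd(X, compute_uv=False)
w = reweight(sigma)  # the smallest singular value gets the largest weight

# With unit weights the weighted norm reduces to the ordinary nuclear norm.
plain = weighted_nuclear_norm(X, np.ones_like(sigma))
```

In an iteratively reweighted scheme the weights would be recomputed from the current low-rank estimate at every iteration.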
3. Proposed Algorithm
To solve the convex optimization problem in (1), we consider the augmented Lagrangian function given by (3), where is the Lagrangian multiplier, and are the penalty parameters, and . Equation (3) is convex as it depends on the nonnegative matrix factorization. By using the augmented Lagrange multiplier method with adaptive penalty [50], equation (3) can be rewritten as (4).
Solving (4) directly is computationally prohibitive; we therefore update the variables alternately via the iteratively reweighted alternating direction method of multipliers (ADMM) [51]. In this section, we present a geometric robust subspace algorithm to minimize the recovery errors as defined in equation (1). The weighted nuclear norm, the affine transformations, and the $L_{2,1}$ norm together boost the performance of this robust subspace algorithm.
Firstly, to update , we fix and , so can be determined by (5), where is the iteration index. By ignoring all terms irrelevant to , equation (5) can be simplified to (6).
We can then use the linearized alternating direction method with the soft shrinkage operator in [41] and update by (6).
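A weighted nuclear norm subproblem of this kind is typically handled by soft shrinkage of the singular values. The sketch below assumes NumPy and the standard weighted singular value thresholding operator, with the threshold of each singular value set to its weight divided by the penalty parameter. The names `weighted_svt`, `weights`, and `mu` are hypothetical, and the exact form of (6) may differ.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft shrinkage: sign(x) * max(|x| - tau, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def weighted_svt(M, weights, mu):
    """Shrink the i-th singular value of M by weights[i] / mu."""
    U, sigma, Vt = np.linalg.svd(M, full_matrices=False)
    shrunk = soft_threshold(sigma, weights / mu)
    return (U * shrunk) @ Vt  # rebuild the matrix from shrunk singular values

M = np.diag([4.0, 2.0, 0.5])
A_new = weighted_svt(M, weights=np.ones(3), mu=1.0)
```

Singular values that fall below their threshold are set to zero, which is how the update lowers the rank of the estimate.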
Secondly, to update , we keep and constant, so can be determined by (7).
Again, by ignoring all terms irrelevant to , equation (7) can be simplified to (8).
By using the lemma in [52], the update of the column of and is given by (9), where denotes the Euclidean norm and .
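For an $L_{2,1}$-regularized subproblem, the usual closed-form solution shrinks each column toward zero according to its Euclidean norm. The sketch below assumes NumPy and this standard column-wise shrinkage operator; the names `column_shrink`, `l21_norm`, `Q`, and `lam` are hypothetical, and the threshold used in the paper's update may be defined differently.

```python
import numpy as np

def l21_norm(E):
    """Sum of the Euclidean norms of the columns of E."""
    return float(np.sum(np.linalg.norm(E, axis=0)))

def column_shrink(Q, lam):
    """Scale column q by max(||q|| - lam, 0) / ||q||; zero out small columns."""
    norms = np.linalg.norm(Q, axis=0)
    # The lower bound on the denominator avoids division by zero.
    scale = np.maximum(norms - lam, 0.0) / np.maximum(norms, 1e-12)
    return Q * scale

Q = np.array([[3.0, 0.1],
              [4.0, 0.0]])
E_new = column_shrink(Q, lam=1.0)
```

Columns with a small norm are removed entirely, while columns with a large norm keep their direction and lose a fixed amount of length, which matches the intuition that whole samples are either corrupted or clean.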
Lastly, to update , we keep and constant, and can be determined by (10).
By ignoring all terms irrelevant to , we obtain (11).
Solving (11) with the threshold operators [22, 43], we get an update of as (12), where denotes the Moore–Penrose pseudoinverse of [53].
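The role of the Moore–Penrose pseudoinverse in such an update can be illustrated on a single image: the transformation step is the least-squares solution of a linear system whose matrix is that image's Jacobian. The sketch below assumes NumPy; the random Jacobian `J`, the residual `r`, and the dimensions are hypothetical stand-ins, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, p = 50, 6                     # pixels per image, transformation parameters
J = rng.standard_normal((m, p))  # hypothetical Jacobian of one image
true_step = rng.standard_normal(p)
r = J @ true_step                # residual column explained by the step

# The pseudoinverse gives the least-squares step for this image.
step = np.linalg.pinv(J) @ r

# The same result is obtained from a generic least-squares solver.
step_lstsq = np.linalg.lstsq(J, r, rcond=None)[0]
```

When the Jacobian has full column rank, as in this toy example, the step is recovered exactly. In practice `np.linalg.lstsq` is often preferred over forming the pseudoinverse explicitly.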
Following the same steps as above, the Lagrangian multiplier is updated by (13).
Likewise, the regularization parameters are updated, respectively, by (14), where is a properly chosen constant and is a tunable parameter adjusting the convergence of the proposed method. These updating equations proceed in a round-robin manner until convergence.
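A typical dual step in augmented Lagrangian schemes of this kind can be sketched as follows. The rule shown here, a multiplier ascent step followed by a geometric increase of the penalty parameter capped at a maximum, is the common choice in the literature and is assumed for illustration. The names `dual_step`, `rho`, `mu_max`, and `converged`, and their default values, are hypothetical.

```python
import numpy as np

def dual_step(Y, residual, mu, rho=1.25, mu_max=1e7):
    """One multiplier update followed by a capped penalty increase."""
    Y_next = Y + mu * residual       # ascent step on the multiplier
    mu_next = min(rho * mu, mu_max)  # grow the penalty, but keep it bounded
    return Y_next, mu_next

def converged(residual, D, tol=1e-7):
    """Stop when the constraint residual is small relative to the data."""
    return np.linalg.norm(residual) <= tol * np.linalg.norm(D)

Y = np.zeros((2, 2))
residual = np.ones((2, 2))  # stand-in for the constraint violation
Y_next, mu_next = dual_step(Y, residual, mu=2.0)
```

In a round-robin scheme this step would run once per iteration, after the primal variables and the transformation parameters have been updated.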
4. Experimental Simulations
In this section, we evaluate the effectiveness of the proposed algorithm on several datasets, including MNIST handwritten digits [54], Dummy Face Images [55], and Al Gore video face images [43]. In this work, affine transformations, the spatial weight matrix, the weighted nuclear norm, and the $L_{2,1}$ norm are combined to boost the performance of the proposed method. Similar to [39, 43], the parameters in our experiments are chosen heuristically. Different datasets are used to examine the effectiveness of the proposed method as compared to the baselines RASL [43] and NQLSD [22]. As shown in Table 2, the proposed method attains a high PSNR on all datasets. To check the image similarity quantitatively, we use the peak signal-to-noise ratio (PSNR) [56], where both the original image and the recovered image are of size .
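For reference, the PSNR is commonly computed from the mean squared error between the two images. The sketch below assumes NumPy, the standard definition, and 8-bit images with a peak value of 255; the function name `psnr` and the toy images are hypothetical.

```python
import numpy as np

def psnr(original, recovered, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    original = np.asarray(original, dtype=float)
    recovered = np.asarray(recovered, dtype=float)
    mse = np.mean((original - recovered) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))

original = np.full((4, 4), 100.0)
recovered = original + 5.0  # uniform error of 5 gray levels
value = psnr(original, recovered)
```

A higher PSNR means the recovered image is closer to the original, so the methods in Table 2 can be ranked directly by this value.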
4.1. Handwritten Images
In this experiment, different datasets are considered to examine the effectiveness of the proposed method. First, 30 handwritten digits of the size are taken from the MNIST database [54]. We compare the PSNR performance of the proposed method with the two aforementioned baselines, as shown in Table 2, from which we can see that NQLSD performs better than RASL, as NQLSD employs the local linear approximation with a quadratic penalty approach to tackle the potential setback of outliers and sparse noises in the images. The proposed method is superior to both baselines and provides the best PSNR in Table 2, as it considers an iterative linearization via affine transformation, the weighted nuclear norm, the spatial weight matrix, and the $L_{2,1}$ norm. As an illustration, some recovered handwritten digits based on the aforementioned methods are shown in Figure 1, from which we can see that the proposed method (Figure 1(d)) recovers the handwritten images better than the two baselines.

Figure 1: Handwritten images, panels (a)–(d).
4.2. Dummy Face Images
To further check the effectiveness of the proposed method, we consider dummy face images. The images recovered by the proposed method are visually clearer than those of the other state-of-the-art works (see Figure 2(d)). We also computed the PSNRs of the dummy face images and observe that the proposed method achieves a higher PSNR than the other two baselines (Table 2). This result is consistent with the visual comparison.

Figure 2: Dummy face images, panels (a)–(d).
4.3. Al Gore Video Face Images
Next, we conduct an experiment on more complicated face images from video sequences of Al Gore talking [43]. From these video sequences, 7 face images with the size are considered. The simulation results are illustrated in Figure 3, where the result of the proposed method (Figure 3(d)) is visually clearer than those of the two baselines. The comparison of PSNR between the proposed method and the two baselines is given in Table 2, from which we can see that NQLSD yields better performance than RASL. This is because NQLSD uses a penalized formulation and decomposes the errors further than RASL. We can further note from Table 2 that the proposed method still outperforms both baselines, as it additionally considers the $L_{2,1}$ norm, the spatial weight matrix, and the affine transformations: the affine transformations correct the distorted images, the $L_{2,1}$ norm prunes out the potential impacts of extreme values, and the spatial weight matrix tackles the spatial dependency between images.

Figure 3: Al Gore video face images, panels (a)–(d).
4.4. Natural Face Images
Next, we conduct simulations on more challenging images taken from the Labeled Natural Faces database. In this experiment, 7 natural face images with the size are considered. We compare the proposed method with the two aforementioned baselines in terms of PSNR for image recovery. The comparison results are given in Table 2, from which we can see that NQLSD outperforms RASL, as NQLSD considers penalized parameters that RASL does not. As an illustration, some natural face images recovered by the proposed method and the baselines are given in Figure 4, from which we can see that the visual quality of the proposed method (Figure 4(d)) is better than that of both baselines. This is in line with the numerical results in Table 2. The performance of the new model is boosted because it incorporates the affine transformations, the weighted nuclear norm, and the $L_{2,1}$ norm, together with the spatial weight matrix, to handle the extreme values.

Figure 4: Natural face images, panels (a)–(d).
4.5. Complicated Windows
To further examine the effectiveness of the proposed algorithm, we consider images of complicated windows, as shown in Figure 5, from which we can see that the result of the proposed method (Figure 5(d)) is clearer than those of the other state-of-the-art works. The PSNR values in Table 2 confirm this improvement over the state-of-the-art works. This is because the new method is more resilient to outliers and heavy sparse noises, as it is assisted by the affine transformations, the weighted nuclear norm, and the $L_{2,1}$ norm, together with the spatial weight matrix, to handle the extreme values.

Figure 5: Complicated windows, panels (a)–(d).
5. Conclusion
In this paper, a new algorithm is proposed to prune out the potential impacts of gross errors from corrupted images via affine transformations, the weighted nuclear norm, the $L_{2,1}$ norm, and the spatial weight matrix. Combining these ideas yields a reliable method for high-dimensional images, particularly in signal processing. The optimal parameters corresponding to the affine transformations and the other optimization variables are found by solving a newly proposed convex optimization problem. The ADMM approach is then employed, and a new set of equations is established to alternately update the optimization variables and the affine transformations. The simulations show that the new method outperforms the state-of-the-art methods in terms of accuracy on five public databases.
Data Availability
The data used to support the findings of the study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This work was supported by the National Key Research and Development Program of China under Grant no. 2018YFB1305700 and the Scientific and Technological Program of Quanzhou City under Grant no. 2019CT009. The authors also acknowledge Addis Ababa University, Ethiopia, the National Science Fund of Young Scholars (Grant no. 61806186), the State Key Laboratory of Robotics and System (HIT) (Grant no. SKLRS-2019-kf-15), and the program “Fujian Intelligent Logistics Industry Technology Research Institute” (Grant no. 2018H2001) for their contribution in providing materials that helped in this research.