Abstract
Classical boosting algorithms can substantially improve denoising performance. However, these algorithms work well only when the denoiser is linear. In this paper, we propose a boosting algorithm that can be used with a nonlinear denoiser. We apply the proposed algorithm to a shrinkage curve learning denoising algorithm, which is a nonlinear denoiser, and we prove the convergence of the proposed algorithm. Experimental results indicate that the proposed algorithm is effective and that it reduces the dependence of the shrinkage curve learning denoising algorithm on training samples. In addition, the proposed algorithm achieves better performance in terms of visual quality and peak signal-to-noise ratio (PSNR).
1. Introduction
Image denoising is a basic step in many advanced image processing technologies. Its purpose is to obtain an approximation of the noise-free image from the observed image, so as to minimize the influence of noise on higher-level tasks such as computer vision and pattern recognition. To this end, many excellent denoising algorithms have been developed, such as wavelet soft thresholding [1], NLM [2], K-SVD [3], and BM3D [4], using wavelet multiscale decomposition, spatial domain filtering, frequency domain filtering, sparse and redundant representation, and image patch processing, among others [5–8]. It is generally believed that the performance of these traditional single denoising algorithms has almost reached its ceiling [9].
Can traditional denoising performance be further improved? Several scholars have addressed this question and proposed boosting denoising algorithms, such as twicing [10], Bregman iterations [11], and SAIF [12]. The main idea of twicing is to repeatedly extract the noise-free image components from the residual image in an iterative process and add these components back to the estimated image. In the Bregman iterations method, the sum of the residual images obtained in the previous iterations is added to the observed image before the denoising operation is performed. SAIF [12] is a patch-based boosting denoising algorithm. For each patch, both the signal leftovers that reside in the residual image and the noise leftovers that reside in the denoised image are considered, and the algorithm then automatically chooses the boosting mechanism.
Unlike the above boosting methods, which add the residual image back to the noisy image or filter the previous estimate over and over again, Romano et al. [9] proposed a boosting denoising algorithm named SOS. It strengthens the signal by adding the previous denoised image to the noisy image, operates the denoiser on the strengthened image, and subtracts the previous denoised image from the outcome; that is,
$$\hat{x}^{k+1} = f\left(y + \hat{x}^{k}\right) - \hat{x}^{k}, \quad (1)$$
where the operator $f(\cdot)$ represents the denoising algorithm, which depends on a parameter, $y$ is the noisy image, and $\hat{x}^{k}$ is the kth-iteration denoised image. Further research shows that the SOS algorithm is convergent when $f$ satisfies the linearity limitation.
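The SOS recursion described above can be illustrated on a one-dimensional signal. The following is a minimal sketch, not the implementation of [9]: the function names `smooth` and `sos_boost` are hypothetical, and the [1/4, 1/2, 1/4] averaging filter is an assumed stand-in for a linear denoiser, chosen because its eigenvalues lie in (0, 1] so that the iteration contracts.

```python
def smooth(signal):
    """Linear denoiser: [1/4, 1/2, 1/4] weighted average with edge replication."""
    last = len(signal) - 1
    return [
        0.25 * signal[max(i - 1, 0)] + 0.5 * signal[i] + 0.25 * signal[min(i + 1, last)]
        for i in range(len(signal))
    ]


def sos_boost(noisy, denoise, iterations=50):
    """Iterate estimate <- denoise(noisy + estimate) - estimate."""
    estimate = denoise(noisy)  # plain (unboosted) denoising as the starting point
    for _ in range(iterations):
        strengthened = [y + x for y, x in zip(noisy, estimate)]  # strengthen
        operated = denoise(strengthened)                         # operate
        estimate = [o - x for o, x in zip(operated, estimate)]   # subtract
    return estimate


noisy = [0.0, 1.0, 0.0, 2.0, 1.0, 3.0, 2.0, 2.5]
boosted = sos_boost(noisy, smooth, iterations=500)
```

For this linear filter the iterates settle at a fixed point of the recursion. A nonlinear denoiser carries no such guarantee, which is the gap the present paper addresses.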
Inspired by SOS, Fang et al. [13] proposed an adaptive boosting denoising algorithm named ABD that plugs in the noise level estimation, which can further improve the performance of SOS, but ABD is still limited by the assumption of linearity.
Although Milanfar [14] pointed out that some denoisers based on patch processing can satisfy the assumption of linearity through Sinkhorn’s algorithm [15, 16], and the NLM, KSVD, and BM3D can be boosted by the SOS algorithm, many denoisers, such as threshold denoisers [17, 18], shrinkage curve learning [19], denoisers based on deep learning [20], and others [21, 22], cannot satisfy the condition of linearity; therefore, they cannot be boosted by the existing boosting algorithm.
Based on the sparse domain model of an image, Hel-Or et al. [19] learned the shrinkage curves from a given training set. Specifically, noise with a given noise level is first added to the clean images in the training set. Then, a model that shrinks a noisy image to its clean image is established in the transform domain, and the shrinkage coefficients are solved to obtain the shrinkage operators. Finally, these shrinkage operators can be used to remove noise from noisy images with the same noise level. By averaging the overlapping local patches of the noisy image, Hel-Or’s algorithm overcomes the shortcomings of traditional thresholding algorithms that process the entire image. However, the performance of the algorithm depends heavily on the images in the given training set: if the structure of the images in the training set is similar to that of the noisy image, the algorithm achieves a better denoising effect [23]. This conclusion is discussed and compared with the experimental results in [19, 23]. Consequently, this denoising algorithm cannot satisfy linearity.
Moreover, the noise level also affects shrinkage curve learning. When the selected shrinkage curve matches the noise level of the noisy image, over- or under-shrinkage can be avoided.
Building on previous studies, this paper focuses on a boosting algorithm for nonlinear denoisers, investigates how to boost the shrinkage curve learning denoising algorithm, and discusses the convergence condition of the boosting algorithm. The main contributions of this paper are the following:
(1) To address the dependence of the shrinkage curve learning denoising algorithm on the training samples and the noise level, a boosting algorithm based on shrinkage curve learning denoising is proposed. In the proposed algorithm, the training samples come from the noisy image itself, and the threshold of the shrinkage curve is calculated automatically by a noise level estimation algorithm. The proposed algorithm therefore does not depend on external images, and it can also be used to devise boosting algorithms for denoisers with a nonlinear property.
(2) Theoretical analysis shows that the proposed boosting algorithm is convergent when the noise level has a finite growth property, and the effectiveness of the algorithm is verified by experiments.
The rest of this paper is organized as follows: in Section 2, the principle of denoising based on shrinkage curve learning is briefly introduced. In Section 3, the boosting algorithm based on shrinkage curve learning and the basic flow of the algorithm are given, and the convergence of the algorithm is analyzed and proved. In Section 4, the experimental results of the algorithm are presented and analyzed. In Section 5, conclusions and future work are given.
2. Previous Related Work
2.1. Patch-Based Shrinkage Method
Shrinkage curve learning is a denoising algorithm based on image patch processing. In such an algorithm, the noisy image is divided into small square patches, and the pixels in each patch are arranged as a column vector. According to Elad’s assumption, these column vectors satisfy the sparse domain model [23]; that is, there exists a transformation under which each vector can be represented by a coefficient vector with only a few nonzero entries.
When the sparse domain model is used to denoise an image, a natural choice is to process each patch individually and then splice the results together; however, this may produce blocking artifacts at the patch edges. To avoid such artifacts, the patches are allowed to overlap, and the results of the overlapping patches are averaged. In practical calculations, the transformation is given in advance, for example, the 2D DCT unitary matrix or the 2D DWT matrix.
The entire image-denoising process generally comprises the following steps.
(1) Each pixel in the noisy image is regarded as the center of a patch, and the patch is arranged as a column vector.
(2) Each patch is denoised with the threshold algorithm as follows:
(i) The transform coefficients of the patch are calculated.
(ii) A threshold operation with the preset threshold is applied to the coefficient vector.
(iii) The inverse transform is applied to the thresholded coefficients to obtain the denoising result of the patch.
(3) The fused result is obtained by averaging all the denoised patches.
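The steps above can be sketched for a one-dimensional signal as follows. This is an illustrative simplification under stated assumptions rather than the method of [19] or [23]: it uses an orthonormal 1D DCT instead of a 2D transform, hard thresholding as the threshold operation, and sliding windows that lie fully inside the signal instead of one patch centered at every pixel. The names `dct_matrix`, `hard_threshold`, and `denoise_patches` are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II matrix: rows are basis vectors, so T^{-1} = T^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def hard_threshold(value, threshold):
    """Keep a coefficient only if its magnitude exceeds the threshold."""
    return value if abs(value) > threshold else 0.0


def denoise_patches(signal, patch_len=8, threshold=1.0):
    """Transform, threshold, and invert every overlapping patch, then average."""
    if len(signal) < patch_len:
        raise ValueError("signal must be at least one patch long")
    T = dct_matrix(patch_len)
    total = [0.0] * len(signal)
    count = [0] * len(signal)
    for start in range(len(signal) - patch_len + 1):
        patch = signal[start:start + patch_len]
        # (i) forward transform of the patch
        coeffs = [sum(T[k][i] * patch[i] for i in range(patch_len))
                  for k in range(patch_len)]
        # (ii) threshold operation on the coefficients
        shrunk = [hard_threshold(c, threshold) for c in coeffs]
        # (iii) inverse transform (transpose of the orthonormal matrix)
        restored = [sum(T[k][i] * shrunk[k] for k in range(patch_len))
                    for i in range(patch_len)]
        for i, value in enumerate(restored):
            total[start + i] += value
            count[start + i] += 1
    # (3) fuse by averaging all overlapping estimates of each sample
    return [t / c for t, c in zip(total, count)]
```

With a threshold of zero the function returns the input unchanged, which is a convenient sanity check that the transform pair and the averaging are consistent.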
2.2. Shrinkage Curve Learning
The main process of the algorithm presented in the preceding subsection can be expressed by the following formula:where is a shrinkage operator. Hel-Or et al. [19] solved as follows:
Gaussian noise with zero mean value and a given variance of is added to the images in the training set. A clean image and its corresponding noisy image are used to construct training samples . Let and denote clean patches and their noisy version, respectively. We use pairs of patches, , to determine the shrink rule, which is used to replace the hard-threshold algorithm and obtain the optimal threshold parameter set.
To this end, the penalty function is set as follows:
From (3), the penalty function is a function of a series of shrink operators. The purpose of the approach of Hel-Or et al. is to use these shrink operators to shrink the noisy patches so that the result is as close as possible to the clean image patches.
For this reason, it is supposed that the shrink operators for each element in the input vector are different, and then the shrink processing for any vector isLetting , and inserting it into (3), the penalty function is then
Because is a unitary matrix, then , and
Equation (6) implies that shrinkage curves can be obtained independently. According to Hel-Or’s method, the following polynomial contraction was used:where is the coefficient of a polynomial, and the least-squares method was used to calculate the optimal solution of the shrinkage curve .
Therefore, inserting (7) into (5) obtains
Letting , it contains all of the sequences, , that is
Similarly, defines a patch diagonal matrix that contains patches
The size of each patch is , and the value is
Using the above denotation, the minimum penalty function is expressed as
Let , that is
The solution of can be obtained from (13) as
which is also the optimal parameter of the shrinkage curve. Therefore, a family of shrinkage curves is obtained from the training set, and these curves can be applied to new images corrupted by additive noise with the same noise level.
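As an illustration of the least-squares step, the sketch below fits one polynomial shrinkage curve from pairs of noisy and clean transform coefficients. It assumes the usual normal-equation form $c = (U^{T}U)^{-1}U^{T}u^{0}$, where each row of $U$ holds the powers $1, u, u^{2}, \ldots$ of a noisy coefficient $u$ and $u^{0}$ collects the corresponding clean coefficients. This notation and the names `solve_linear`, `fit_shrinkage_curve`, and `apply_curve` are assumptions made for the example, not the notation of [19]. In the full method, one such curve would be fitted for each coefficient index of the patch transform.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_shrinkage_curve(noisy_coeffs, clean_coeffs, degree=3):
    """Least-squares polynomial S(u) = sum_j c[j] * u**j mapping noisy to clean."""
    m = degree + 1
    A = [[0.0] * m for _ in range(m)]  # accumulates U^T U
    b = [0.0] * m                      # accumulates U^T u0
    for u, u0 in zip(noisy_coeffs, clean_coeffs):
        powers = [u ** j for j in range(m)]
        for r in range(m):
            b[r] += powers[r] * u0
            for c in range(m):
                A[r][c] += powers[r] * powers[c]
    return solve_linear(A, b)


def apply_curve(coeffs, u):
    """Evaluate the fitted shrinkage curve at a noisy coefficient u."""
    return sum(c * u ** j for j, c in enumerate(coeffs))
```

The fit needs at least `degree + 1` distinct noisy values; otherwise the normal equations are singular and the elimination fails with a division by zero.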
3. Boosting Algorithm for Shrinkage Curve Learning
Unlike the linear denoisers used in the boosting algorithms in [9, 13], a denoiser composed of a family of shrinkage curves is nonlinear and depends on the training samples. Denoising with and boosting this kind of denoiser therefore requires some changes to the form of the denoising and boosting algorithms in [9, 13].
3.1. Proposed Algorithm
In the preceding section, the optimal parameter set of the shrinkage curve comes from the training set. Therefore, the coefficient in (14) is related to two factors, namely, the training sample and noise level . Denoting as , (14) can then be rewritten as
In the transform domain, the high-frequency components can be effectively filtered out by using the shrinkage curve determined by the coefficient . Then, we obtain a denoiser that is related to , this denoiser is denoted as , where the subscript shows that the denoiser is related to the training set and the given noise level . The boosting algorithm of the denoiser is discussed as follows, and it is different from the linear denoiser reported by Fang [13]. The denoiser is a comprehensive denoiser composed of a series of nonlinear shrinkage operators and is a typical nonlinear denoiser.
To boost the effect of the denoiser and overcome its dependence on an external database and the noise level, we use the following steps:
(1) The noise level of the noisy image is estimated.
(2) The iteration is initialized by using a constant-valued image to train the shrinkage curve, from which the initial denoiser and the initial denoised image are calculated.
(3) The denoised image is added to the input noisy image, and the noise level of the accumulated image is calculated. A new denoiser is obtained by training the shrinkage operator with the denoised image.
(4) Step (3) is repeated until the denoising result converges.
The above process is described by the following equation:where controls the denoising strength, controls the return ratio of clean images, and is the noise level of . Many algorithms estimate the latter, e.g., PCA [24], WT [25], among others [26, 27]. In this paper, the noise level is estimated by the algorithm reported in [27]. The specific algorithm corresponding to the process is shown as algorithm 1.
Algorithm 1: Boosting algorithm based on shrinkage curve learning denoising.
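The loop structure of steps (1)–(4) can be sketched as follows. This is only a structural illustration under several assumptions, none of which is confirmed by the text: the update is taken to have an SOS-like form, `rho` is assumed to scale the returned denoised signal, `tau` is assumed to scale the estimated noise level, the noise estimator is a median-absolute-deviation rule on Haar details swapped in for the estimator of [27], and `make_denoiser` is a toy soft-thresholding placeholder that ignores its training signal instead of learning shrinkage curves.

```python
import math
import statistics


def estimate_sigma(signal):
    """MAD noise estimate from Haar details (stand-in for the estimator of [27])."""
    details = [(signal[i] - signal[i + 1]) / math.sqrt(2.0)
               for i in range(0, len(signal) - 1, 2)]
    return statistics.median([abs(d) for d in details]) / 0.6745


def make_denoiser(training_signal, strength):
    """Toy nonlinear denoiser: soft-threshold Haar details at `strength`.

    A real implementation would learn shrinkage curves from `training_signal`;
    this placeholder ignores it.
    """
    def shrink(d):
        return math.copysign(max(abs(d) - strength, 0.0), d)

    def denoise(signal):
        out = list(signal)
        for i in range(0, len(signal) - 1, 2):
            s = (signal[i] + signal[i + 1]) / math.sqrt(2.0)
            d = shrink((signal[i] - signal[i + 1]) / math.sqrt(2.0))
            out[i] = (s + d) / math.sqrt(2.0)
            out[i + 1] = (s - d) / math.sqrt(2.0)
        return out

    return denoise


def boost(noisy, make_denoiser, estimate_sigma, rho=1.0, tau=1.0, iterations=9):
    """Assumed update: x <- f(y + rho * x) - rho * x, with f retrained each pass."""
    sigma = estimate_sigma(noisy)                             # step (1)
    denoise = make_denoiser([0.0] * len(noisy), tau * sigma)  # step (2): constant image
    estimate = denoise(noisy)
    for _ in range(iterations):                               # steps (3) and (4)
        strengthened = [y + rho * x for y, x in zip(noisy, estimate)]
        sigma = estimate_sigma(strengthened)                  # noise level of the sum
        denoise = make_denoiser(estimate, tau * sigma)        # retrain on the estimate
        estimate = [d - rho * x for d, x in zip(denoise(strengthened), estimate)]
    return estimate
```

The fixed iteration count mirrors the truncation used in the experiments; a convergence test on successive estimates could replace it.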
3.2. Convergence Analysis of Proposed Algorithm
The convergence of the proposed algorithm 1 is discussed here. First, the definition of a shrinkage denoiser is given as follows:
Definition 1 (Shrinkage denoiser). The denoiser with denoising strength is called a shrinkage denoiser if it satisfies
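Definition 2 (Bounded denoiser) (see [28]). A bounded denoiser with a parameter $\sigma$ is a function $D_{\sigma}: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}$ such that, for any input $x \in \mathbb{R}^{n}$,
$$\frac{1}{n}\left\|D_{\sigma}(x) - x\right\|_{2}^{2} \leq \sigma^{2} C, \quad (18)$$
for some universal constant $C$ independent of $n$ and $\sigma$.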
From the analysis in Section 2.2, we can see that  is the result of a finite number of shrinkage operators.
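To make the boundedness property concrete, the following sketch checks it numerically for element-wise soft thresholding, a basic shrinkage operator. It assumes that the bound takes the common form $\frac{1}{n}\|D_{\sigma}(x) - x\|_{2}^{2} \leq \sigma^{2} C$ and that the threshold plays the role of $\sigma$; under these assumptions $C = 1$ suffices, because no entry moves by more than the threshold. The function names are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold` (a basic shrinkage operator)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def mean_squared_change(signal, threshold):
    """(1/n) * ||D(x) - x||^2 for element-wise soft thresholding D."""
    return sum((soft_threshold(v, threshold) - v) ** 2 for v in signal) / len(signal)


signal = [-7.5, -0.2, 0.0, 0.4, 1.0, 3.3, 12.0]
for threshold in (0.1, 1.0, 5.0):
    # Each entry moves by at most `threshold`, so the bound holds with C = 1.
    assert mean_squared_change(signal, threshold) <= threshold ** 2 + 1e-12
```

A denoiser built from finitely many such operators in a unitary transform domain inherits a bound of the same kind, since a unitary transform preserves the Euclidean norm.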
Proposition 1 (see [27]). There exists a constant satisfying .
Proposition 2. The sequence convergence and
Proof. From Proposition 1, there exists , s.t. . Dividing both sides by , and letting , we have .
Based on these definitions and properties, the following properties can be obtained:
Proposition 3. For a given positive integer , the following inequality holds: and .
Proof. First, we prove that . From (16), we have
Because is a shrinkage denoiser, using the inequality in Definition 1, we have
Substituting (20) into (19), after some simplification, we obtain
Then the recurrence law is obtained as follows:
After further simplification, we obtain
Next, we prove that .
Because
Inserting (24) into (25), we have
This completes the proof.
Theorem 1. If is a bounded and shrinkage denoiser, then the solution sequence obtained from the iterative equation (15) in algorithm 1 is convergent; that is when .
Proof. From equation (15), we have
Using the triangle inequality, we have
For the first and second terms on the right-hand side of (28), using the definition of a bounded denoiser given in Section 3.2, we obtain
Inserting into the last term on the right-hand side of (28), we obtain
Using Proposition 3, we have
Inserting (29) and (31) into (28), we have
Because and are bounded, we use Proposition 2, take the limit on both sides of (32), and obtain when .
4. Experimental Results and Analysis
The values of the two parameters of algorithm 1 were determined by trial and error, and the properties of Proposition 2 were verified by experiments. All methods were implemented in MATLAB and run on a computer with an Intel(R) Core(TM) i7-9700 3.00 GHz CPU and 16 GB of RAM.
4.1. Parameter Setting
The proposed algorithm 1 has two parameters. The first controls the weight of the clean image returned to the next iteration: the larger its value, the higher the signal-to-noise ratio (SNR) of the image passed to the next iteration. The second controls the intensity of the noise added to the training sample: the larger its value, the greater the noise level added to the training sample, and the stronger the denoising ability of the trained shrinkage curve for highly noisy images. Conversely, the smaller its value, the more suitable the trained shrinkage curve is for denoising images with low noise. In this paper, the two parameters are selected as follows: each parameter is limited to an interval, and points are taken in the two intervals in steps of 0.1 to obtain the parameter pairs. Zero-mean Gaussian noise is added to the standard test images (shown in Figure 1) to obtain the noisy images, and each parameter pair is used in algorithm 1 to denoise them. Figure 2 shows the relationship among the two parameters and the PSNR, from which the pair giving the highest PSNR value, and hence the best denoising effect, is identified. In the experiments described below, the parameters of algorithm 1 were set to this pair.
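The parameter selection described above amounts to an exhaustive grid search scored by PSNR. The sketch below shows one way to organize it; the helper names `psnr`, `grid`, and `grid_search` are hypothetical, the search intervals are left as arguments because the intervals used in the paper are not reproduced here, and `score` stands for any function that runs the denoiser with a parameter pair and returns the resulting PSNR.

```python
import math


def psnr(clean, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB for signals with the given peak value."""
    mse = sum((c - e) ** 2 for c, e in zip(clean, estimate)) / len(clean)
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def grid(low, high, step=0.1):
    """Grid points from low to high inclusive; rounding avoids float drift."""
    count = round((high - low) / step)
    return [round(low + i * step, 10) for i in range(count + 1)]


def grid_search(score, first_range, second_range, step=0.1):
    """Return the parameter pair with the highest score and that score."""
    best_pair, best_score = None, -math.inf
    for a in grid(*first_range, step):
        for b in grid(*second_range, step):
            value = score(a, b)
            if value > best_score:
                best_pair, best_score = (a, b), value
    return best_pair, best_score
```

In practice, `score` would average the PSNR over the test images and over several noise realizations so that the selected pair does not overfit a single noisy instance.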


4.2. Verification of Shrinkage Properties
Proposition 2 plays a key role in the process of proving the convergence of algorithm 1. The convergence of the sequence generated in algorithm 1 is verified in this subsection. For all images and noise levels, the sequence converges to 0 just like the curve in Figure 3(a), and no difference exists. As can be seen from the figure, with increasing iteration number , the sequence converges rapidly to 0. These experimental results further verify the validity of Proposition 2. Correspondingly, considering the change in PSNR value after each step of the iteration, the results are given in Figure 3(b). It is found that the PSNR value obtained by algorithm 1 converges to a stable state quickly, and it is also shown that algorithm 1 only needs a few iterations to achieve a good denoising effect. Therefore, the proposed algorithm can be terminated directly by truncating the number of iterations, thus avoiding the extra operation of using blind image quality evaluation to determine the optimal iteration number. In the experiments discussed later in this paper, the number of iterations was set to 9.

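Figure 3: (a) Convergence of the sequence generated by algorithm 1 as the iteration number increases. (b) PSNR value after each iteration of algorithm 1.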
4.3. Comparison of Shrinkage Curves
The training of the shrinkage curves plays an important role in the proposed algorithm 1. Figure 4 compares the original algorithm in [19] with algorithm 1 proposed in this paper. In this experiment, the size of the image patch is set to 8 × 8. Therefore, a total of 64 shrinkage curves are obtained, numbered from 1 to 64. Owing to page-length limitations and for convenience of comparison, only the four curves in the lower right corner, numbered 55, 56, 63, and 64, are compared here. These four curves were chosen because the high-frequency components of an image patch are mainly concentrated in this part after the DCT. Figure 4 shows the shrinkage curves used by different algorithms to denoise the cameraman image with the noise level of .

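Figure 4: Shrinkage curves numbered 55, 56, 63, and 64 used to denoise the cameraman image. (a) Algorithm in [19]. (b)–(d) Algorithm 1 after two, five, and nine iterations.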
Subgraph (a) of Figure 4 shows the shrinkage curves used in [19], while subgraphs (b)–(d) show the shrinkage curves used in algorithm 1 after two, five, and nine iterations, respectively. The shrinkage curves of the two algorithms hardly act on the DC part in the upper left-hand corner. For the AC part, the curves used in [19] shrink the high-frequency part, and the closer a curve is to the high-frequency part, the stronger its shrinkage. The curves used in each iteration of the proposed algorithm 1 follow the same trend as the curves in [19], but in the low-frequency part, the shrinkage in algorithm 1 is stronger than that in [19]. For example, for the curve numbered 55 in Figure 4(a), the shrinkage in [19] is weak, whereas the shrinkage of curve 55 in Figures 4(b)–4(d) increases gradually under the proposed algorithm.
A similar conclusion can be obtained for curve 56. This shows that the denoising performance of the algorithm is boosted continuously over successive iterations, so that the purpose of boosting is achieved. In terms of the shrunken frequency band, taking curve 64 as an example, the band shrunk in [19] is wider, whereas the band shrunk by curve 64 in Figures 4(b)–4(d) is narrower. That is to say, the proposed method achieves a better denoising effect through several iterations of weaker but finer shrinkage.
4.4. Comparison of Denoising Results
The complexity and the actual denoising effect of algorithm 1 are discussed in this subsection.
The complexity of algorithm 1 is mainly produced by three processes: shrinkage curve learning, denoising, and noise level estimation. When the image size is and the patch size is , the computational complexity of the shrinkage curve learning and denoising processes is , and the computational complexity of noise level estimation is . Therefore, the overall computational complexity of algorithm 1 is , where is the number of iterations.
The effectiveness of algorithm 1 is illustrated further using the results of three groups of comparative experiments. The first group of experiments used 10 images, which are shown in Figure 1. The second used five texture images from the MeasTex [29] database, and the third used 25 color images from the Tid2008 [30] database.
First, using the test images in Figure 1, Gaussian noise with noise levels of 5, 10, 15, 20, 25, and 50 was added to each clean image to obtain the noisy images. Image denoising experiments were then carried out for three cases: the algorithm in [19] with a known noise level, the algorithm in [19] with an estimated noise level, and algorithm 1. To reduce random error, the average over repeated experiments was taken as the final result, and the results (PSNR values under different noise levels) are presented in Table 1. In the table, one column gives the denoising result of the algorithm reported in [19] when the noise level is known, and another gives its result when the estimated noise level is used. The denoising performance of these two variants is sensitive to the training image: when the structure of the training sample is similar to that of the noisy image, the denoising effect is the best. Therefore, for these two variants, the denoised image itself was used as the training sample to achieve the best denoising effect and to ensure the fairness of the comparative experiments. The column headed “Our” gives the result of algorithm 1, and the column labeled “impv” gives the difference between the results obtained by algorithm 1 and those obtained using the algorithm in [19]. It can be seen from the table that algorithm 1 achieves the highest PSNR value for almost all images and noise levels. The largest improvement, 1.86 dB, is obtained when algorithm 1 is used to process the house image with a noise level of 10.
In terms of the actual processing effect, the proposed algorithm 1 also has the advantage of preserving image details. In Figure 5, the first column shows the original image, and the second column shows the noisy image, whose noise level is given in the figure. The third column shows the result obtained by using the algorithm presented in [19] with an accurate noise level, the fourth column shows the result obtained with the estimated noise level, and the last column shows the result of algorithm 1. It can be seen from Figure 5, for example, that the edge of the hat in the Lena image, the ridgeline in the house image, and the pepper’s contour in the pepper image are clearer than when processed by the other algorithms. In the fingerprint image, the overall texture of the denoised image is clearer. In the montage image, even though high-level noise is added, the edge lines remain clearer.

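Figure 5: Denoising results for the test images, panels (a)–(y). Columns, from left to right: original image, noisy image, result of the algorithm in [19] with an accurate noise level, result of the algorithm in [19] with the estimated noise level, and result of algorithm 1.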
Next, we considered the influence of texture images on the proposed algorithm by using five images from the MeasTex [29] database, i.e., fabric, food, leaves, water, and wood. All five images contain rich texture features. For each clean image, Gaussian noise with noise levels of 5, 10, 15, 20, 25, and 50 was added. The results of image denoising using algorithm 1 are shown in Table 2, from which it can be seen that the greatest improvement of algorithm 1 compared with the algorithm given in [19] is 4.28 dB, and the average improvement is 1.49 dB. Thus, compared with the algorithm reported in [19], the proposed algorithm 1 significantly improved the denoising of richly textured images.
Finally, the influence of color images of natural scenes on the algorithm was considered using 25 images from the Tid2008 [30] database, which contain different scenes and characters; Gaussian noise was added to each image. The results of image denoising using algorithm 1 are shown in Table 3, from which it can be seen that algorithm 1 significantly improved the denoising effect on all images selected from the Tid2008 database. Moreover, compared with the algorithm presented in [19], the proposed algorithm 1 better maintained the details of color images. This can be seen from the comparison with Figures 6(c) and 6(d): the edge of the hat in Figure 6(e) is closer to the edge of the hat in the noise-free Figure 6(a), and there is no jagged edge. Compared with Figures 6(h) and 6(i), the boundaries between water and beach and between forest and sky in Figure 6(j) are clearer. Similarly, compared with Figures 6(m) and 6(n), the parrot image in Figure 6(o) maintains better texture and edges. It can therefore be seen that algorithm 1 removes noise more effectively for images of various natural scenes and maintains details better than the algorithm presented in [19].

Figure 6: Denoising results for color images from the Tid2008 database, panels (a)–(o).
5. Conclusion and Future Work
A shrinkage curve learning denoising algorithm is an important kind of denoising algorithm, and a denoiser constructed from shrinkage curves is typically nonlinear. Nevertheless, the algorithm proposed in this paper can boost shrinkage curve learning denoising. The proposed algorithm incorporates noise level estimation so that the denoising strength of the denoiser matches the noisy image, which improves its denoising ability. Moreover, the proposed algorithm does not rely on external training samples, which improves its adaptability. Finally, the convergence of the proposed algorithm is discussed, and conditions for judging its convergence are obtained. Admittedly, the proposed algorithm spends a significant amount of time on training, mainly in solving for a family of optimal shrinkage curves in the least-squares sense. If analytical solutions of these curves are not required, some representative computational intelligence algorithms could be used to solve this problem, including Harris hawks optimization [31], monarch butterfly optimization [32], the earthworm optimization algorithm [33], and the moth search algorithm [34], which could possibly improve the efficiency of the algorithm. In addition, establishing a unified boosting algorithm that adapts to both linear and nonlinear denoisers is a direction for future study. Finally, the denoiser in the proposed algorithm must be associated with the noise level, which controls the denoising strength. How to improve the proposed algorithm when the denoiser is independent of the noise level requires further study.
Data Availability
The data and the MATLAB codes used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was supported by the National Natural Science Foundation of China (Grant nos. 61763009, 61761030, and 62061016), the Doctoral Scientific Fund Project of Hubei Minzu University (Grant no. MD2020 B024), the High-level Scientific Research Achievement Cultivation Project of Hubei Minzu University (Grant no. 4205003), and the Nature Science Foundation of Enshi Polytechnic (Grant no. EZYQNZK201904). The authors would like to express their gratitude to the anonymous reviewers and editors for their valuable comments and suggestions, which led to the improvement of the original manuscript.