Abstract
We introduce a preconditioning technique for the first-order primal-dual splitting method. The primal-dual splitting method offers a very general framework for solving a large class of optimization problems arising in image processing. The key idea of the preconditioning technique is that the constant step-size parameters are replaced by ones that are updated self-adaptively during the iteration process. We also give a simple way to choose the diagonal preconditioners while maintaining the convergence of the iterative algorithm. The efficiency of the proposed method is demonstrated on an image denoising problem. Numerical results show that the preconditioned iterative algorithm performs better than the original one.
1. Introduction
Many real-world application problems arising in signal and image processing [1–3], machine learning [4–6], and medical image reconstruction [7, 8] can be modeled as convex (possibly nonsmooth) optimization problems. In recent years, the minimization of the sum of two convex functions has received much attention; it takes the form
$$\min_{x} \; G(x) + H(Kx), \qquad (1)$$
where $G \in \Gamma_0(\mathbb{R}^n)$, $H \in \Gamma_0(\mathbb{R}^m)$, and $K$ is a linear transformation matrix. Here $\Gamma_0(\cdot)$ denotes the set of proper, lower semicontinuous convex functions from the given space to $(-\infty, +\infty]$. The functions $G$ and $H$ in (1) usually denote the data error term and the regularization term, respectively.
Chambolle and Pock [9] proposed a general primal-dual method to solve problem (1). Under the assumption that the proximity operators of the functions $G$ and $H$ have closed-form solutions, they proved the convergence of the proposed iterative algorithms. He and Yuan [10] showed that the primal-dual method of Chambolle and Pock [9] is equivalent to a proximal point algorithm (PPA), so that the original convergence analysis follows easily from the well-known theory of the PPA. In order to accelerate the primal-dual method [9], Pock and Chambolle introduced in [11] a preconditioning technique for the primal-dual iterative algorithm, in which the constant step-size parameters are replaced by preconditioning matrices; the convergence of the preconditioned iterative algorithm then follows directly from the PPA framework. They also gave a practical way to choose the preconditioning matrices. As an application, Sidky et al. [12] applied the primal-dual method of Chambolle and Pock [9] to a variety of problems arising in medical image reconstruction, where both the primal-dual method and the corresponding preconditioned method performed very well. Some related works can also be found in [13, 14].
If the objective function in (1) contains a differentiable term with Lipschitz continuous gradient, such as the least squares loss function, the primal-dual method introduced by Chambolle and Pock [9] does not exploit the gradient of that function. In order to solve such a more general problem, Combettes and Pesquet [15], Condat [1], and Vũ [16] introduced primal-dual methods. For instance, Condat [1] considered an optimization problem involving the sum of three convex functions: a smooth function with Lipschitz continuous gradient, a nonsmooth proximable function, and a linear composite term. The problem is presented below:
$$\min_{x} \; F(x) + G(x) + H(Kx), \qquad (2)$$
where $F, G \in \Gamma_0(\mathbb{R}^n)$, $H \in \Gamma_0(\mathbb{R}^m)$; $F$ is differentiable and its gradient is Lipschitz continuous with Lipschitz constant $\beta$; $K$ is a linear transformation matrix. To solve problem (2), he proposed a primal-dual splitting method and proved the convergence of the new iterative algorithm in an infinite-dimensional Hilbert space, based on the fixed point theory of nonexpansive mappings.
The purpose of this paper is to introduce a preconditioned primal-dual splitting method to solve problem (2). The advantage of the preconditioning technique is that the iterative parameters are updated self-adaptively. Furthermore, we give a family of preconditioners which are restricted to diagonal matrices and guarantee the convergence of the algorithm. To illustrate the efficiency of the proposed method, we compare it with the original method on an image denoising problem. In addition, Combettes et al. [17] recently proposed a variable metric primal-dual method that also employs a preconditioning technique, but our algorithm differs from theirs. In our algorithm, we use two different metrics: one for the primal variable and one for the dual variable. Another difference is that [17] includes a smooth term in the dual problem; when this term is zero, their convergence conditions are stronger than the ones presented here. Moreover, we explain how to choose the variable metric.
The rest of the paper is organized as follows. In Section 2, we briefly review the primal-dual splitting method proposed by Condat [1]. In Section 3, we present the preconditioning technique and provide a practical way to choose the iterative parameter matrices. In Section 4, we report several experiments on image denoising problems. Finally, we give a brief conclusion.
2. A Primal-Dual Splitting Method of Condat [1]
In this section, we briefly review the primal-dual splitting method introduced by Condat [1]. First, we introduce some definitions and notation. Let $\mathcal{H}$ be a real Hilbert space with inner product $\langle \cdot,\cdot\rangle$ and norm $\|\cdot\|$. We denote by $\Gamma_0(\mathcal{H})$ the set of proper, lower semicontinuous, convex functions from $\mathcal{H}$ to $(-\infty,+\infty]$. Let $f \in \Gamma_0(\mathcal{H})$; its Fenchel conjugate is defined by $f^*(y) = \sup_{x}\{\langle x, y\rangle - f(x)\}$ and its proximity operator by $\mathrm{prox}_{\lambda f}(x) = \arg\min_{y} \{f(y) + \frac{1}{2\lambda}\|y - x\|^2\}$, where $\lambda$ is a positive constant. We define the subdifferential of $f$ as the set-valued operator $\partial f(x) = \{u \in \mathcal{H} : f(y) \ge f(x) + \langle u, y - x\rangle, \ \forall y \in \mathcal{H}\}$. If $f$ is differentiable at $x$, then $\partial f(x) = \{\nabla f(x)\}$. Let $\mathcal{X}$ and $\mathcal{Y}$ be two real Hilbert spaces and $K : \mathcal{X} \to \mathcal{Y}$ be a bounded linear operator with adjoint $K^*$ and induced norm
$$\|K\| = \max\{\|Kx\| : x \in \mathcal{X}, \ \|x\| \le 1\}. \qquad (3)$$
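As a concrete illustration of the proximity operator defined above (not taken from the paper), the following minimal Python/NumPy sketch computes the well-known closed form of $\mathrm{prox}_{\lambda\|\cdot\|_1}$, the soft-thresholding operator; the function name is ours.

```python
import numpy as np

def prox_l1(v, lam):
    """prox_{lam*||.||_1}(v) = argmin_y lam*||y||_1 + 0.5*||y - v||^2,
    i.e. componentwise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

print(prox_l1(np.array([3.0, -0.5, 1.0]), 1.0))  # -> [ 2. -0.  0.]
```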
Condat [1] considered solving the following optimization problem:
$$\min_{x \in \mathcal{X}} \; F(x) + G(x) + H(Kx), \qquad (4)$$
where $F$ is convex, differentiable on $\mathcal{X}$, and its gradient is $\beta$-Lipschitz continuous for some $\beta > 0$; that is, $\|\nabla F(x) - \nabla F(x')\| \le \beta\|x - x'\|$ for all $x, x' \in \mathcal{X}$. $G \in \Gamma_0(\mathcal{X})$ and $H \in \Gamma_0(\mathcal{Y})$, and their proximity operators have closed-form solutions. $K : \mathcal{X} \to \mathcal{Y}$ is a bounded linear operator with adjoint $K^*$. The dual formulation of the primal problem (4) is
$$\min_{y \in \mathcal{Y}} \; (F + G)^*(-K^* y) + H^*(y). \qquad (5)$$
The corresponding saddle-point formulation of the primal problem (4) and the dual problem (5) is as follows:
$$\min_{x \in \mathcal{X}} \max_{y \in \mathcal{Y}} \; F(x) + G(x) + \langle Kx, y\rangle - H^*(y). \qquad (6)$$
The pair $(\hat{x}, \hat{y})$ can be found via the following monotone variational inclusion:
$$0 \in \nabla F(\hat{x}) + \partial G(\hat{x}) + K^*\hat{y}, \qquad 0 \in \partial H^*(\hat{y}) - K\hat{x}, \qquad (7)$$
where $\partial G$ and $\partial H^*$ are the subdifferentials of $G$ and $H^*$, respectively.
Condat [1] proposed the following iterative algorithm to solve problem (4).
Algorithm 1 (primal-dual splitting method (PDS) for solving problem (4)). Choose the proximal parameters $\tau, \sigma > 0$ and the relaxation parameters $(\rho_n)_{n \in \mathbb{N}}$. Give an initial value $(x^0, y^0)$. For $n \ge 0$, the iterates $x^{n+1}$ and $y^{n+1}$ are updated by
(1) $\tilde{x}^{n+1} = \mathrm{prox}_{\tau G}\bigl(x^n - \tau(\nabla F(x^n) + K^* y^n)\bigr)$;
(2) $\tilde{y}^{n+1} = \mathrm{prox}_{\sigma H^*}\bigl(y^n + \sigma K(2\tilde{x}^{n+1} - x^n)\bigr)$;
(3) $(x^{n+1}, y^{n+1}) = \rho_n(\tilde{x}^{n+1}, \tilde{y}^{n+1}) + (1 - \rho_n)(x^n, y^n)$.
In the inexact version of the algorithm, error terms $a^n$, $b^n$, and $c^n$ are added to model the inexact computation of the operators $\mathrm{prox}_{\tau G}$, $\nabla F$, and $\mathrm{prox}_{\sigma H^*}$, respectively.
If some stopping criterion has been reached, then the algorithm stops.
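To make the structure of Algorithm 1 concrete, here is a minimal NumPy sketch of the exact (error-free) iteration, assuming the primal-first update order of Condat's scheme reviewed above; all function and parameter names are ours and the error terms are omitted.

```python
import numpy as np

def pds(x0, y0, grad_F, prox_G, prox_Hconj, K, K_adj,
        tau, sigma, rho=1.0, n_iter=100):
    """Exact primal-dual splitting iteration (sketch of Algorithm 1).
    prox_G(v, t) and prox_Hconj(v, t) are expected to return
    prox_{t*G}(v) and prox_{t*H^*}(v), respectively."""
    x, y = x0.copy(), y0.copy()
    for _ in range(n_iter):
        # forward (gradient) step on F, backward (prox) step on G
        x_new = prox_G(x - tau * (grad_F(x) + K_adj(y)), tau)
        # dual prox step at the extrapolated primal point 2*x_new - x
        y_new = prox_Hconj(y + sigma * K(2.0 * x_new - x), sigma)
        # relaxation
        x = rho * x_new + (1.0 - rho) * x
        y = rho * y_new + (1.0 - rho) * y
    return x, y
```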
The convergence of Algorithm 1 was ensured by the following theorem.
Theorem 2 (see [1]). Let $\tau > 0$, $\sigma > 0$, and the sequences $(\rho_n)$, $(a^n)$, $(b^n)$, and $(c^n)$ be the parameters of Algorithm 1. Let $\beta$ be the Lipschitz constant of $\nabla F$. Suppose that $\beta > 0$ and the following conditions hold:
(i) $\frac{1}{\tau} - \sigma\|K\|^2 \ge \frac{\beta}{2}$;
(ii) $\rho_n \in (0, \delta]$ for every $n$, where $\delta := 2 - \frac{\beta}{2}\bigl(\frac{1}{\tau} - \sigma\|K\|^2\bigr)^{-1} \in [1, 2)$;
(iii) $\sum_n \rho_n(\delta - \rho_n) = +\infty$;
(iv) $\sum_n \rho_n\|a^n\| < +\infty$, $\sum_n \rho_n\|b^n\| < +\infty$, and $\sum_n \rho_n\|c^n\| < +\infty$.
Then the sequences $(x^n)$ and $(y^n)$ generated by Algorithm 1 converge weakly to a solution $\hat{x}$ of (4) and a solution $\hat{y}$ of (5), respectively.
Remark 3. (1) Due to Moreau's identity $\mathrm{prox}_{\sigma H^*}(y) = y - \sigma\,\mathrm{prox}_{H/\sigma}(y/\sigma)$, the proximity operator of $H^*$ can be computed from that of $H$.
(2) The roles of the primal variable and the dual variable in Algorithm 1 can be exchanged. The convergence of the resulting iterative algorithm is also ensured by Theorem 2. In practice, the performance of the two iterative algorithms is nearly the same.
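A small sketch of how Remark 3(1) is used in practice: given a routine for $\mathrm{prox}_{tH}$, the conjugate prox needed in step (2) of Algorithm 1 follows from Moreau's identity (the wrapper name is ours).

```python
def prox_conjugate(prox_H, y, sigma):
    """Moreau's identity: prox_{sigma*H^*}(y) = y - sigma * prox_{H/sigma}(y/sigma).
    prox_H(v, t) is assumed to return prox_{t*H}(v)."""
    return y - sigma * prox_H(y / sigma, 1.0 / sigma)
```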
3. A Preconditioned Primal-Dual Splitting Method for Solving (4)
In this section, we give a preconditioned version of Algorithm 1. The main idea is motivated by the work of Pock and Chambolle [11]: the scalar iterative parameters in Algorithm 1 are replaced by symmetric positive definite matrices. First, we give the detailed iterative algorithm below. Then we prove its convergence.
Algorithm 4 (preconditioned primal-dual splitting method (PPDS) for solving problem (4)). Choose the symmetric positive definite matrices $T$ and $\Sigma$ and the relaxation parameters $(\rho_n)_{n \in \mathbb{N}}$. Give an initial value $(x^0, y^0)$. For $n \ge 0$, the iterates $x^{n+1}$ and $y^{n+1}$ are updated by
(1) $\tilde{x}^{n+1} = \mathrm{prox}^{T}_{G}\bigl(x^n - T(\nabla F(x^n) + K^* y^n)\bigr)$;
(2) $\tilde{y}^{n+1} = \mathrm{prox}^{\Sigma}_{H^*}\bigl(y^n + \Sigma K(2\tilde{x}^{n+1} - x^n)\bigr)$;
(3) $(x^{n+1}, y^{n+1}) = \rho_n(\tilde{x}^{n+1}, \tilde{y}^{n+1}) + (1 - \rho_n)(x^n, y^n)$,
where $\mathrm{prox}^{T}_{G}$ and $\mathrm{prox}^{\Sigma}_{H^*}$ denote the proximity operators of $G$ and $H^*$ in the metrics induced by $T^{-1}$ and $\Sigma^{-1}$, respectively. In the inexact version, error terms $a^n$, $b^n$, and $c^n$ model the inexact computation of these operators and of $\nabla F$.
If some stopping criterion has been reached, then the algorithm stops.
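For diagonal preconditioners, Algorithm 4 can be sketched almost identically to the scalar version, with the step sizes stored as arrays of per-coordinate values. The implementation below is our own illustration; the preconditioned proximity operators are assumed to be available in closed form, which is the point of restricting to diagonal matrices.

```python
import numpy as np

def ppds(x0, y0, grad_F, prox_G_T, prox_Hconj_S, K, K_adj,
         T, S, rho=1.0, n_iter=100):
    """Preconditioned PDS sketch (Algorithm 4 with diagonal T and Sigma).
    T and S are arrays of per-coordinate primal/dual step sizes;
    prox_G_T(v, T) and prox_Hconj_S(v, S) compute the proximity operators
    of G and H^* in the metrics induced by T^{-1} and S^{-1}."""
    x, y = x0.copy(), y0.copy()
    for _ in range(n_iter):
        x_new = prox_G_T(x - T * (grad_F(x) + K_adj(y)), T)
        y_new = prox_Hconj_S(y + S * K(2.0 * x_new - x), S)
        x = rho * x_new + (1.0 - rho) * x
        y = rho * y_new + (1.0 - rho) * y
    return x, y
```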
Theorem 5. Let $T$ and $\Sigma$ be symmetric positive definite matrices and the sequences $(\rho_n)$, $(a^n)$, $(b^n)$, and $(c^n)$ be the parameters of Algorithm 4. Let $\beta$ be the Lipschitz constant of $\nabla F$. Suppose that $\beta > 0$ and that conditions (i)–(iv) of Theorem 2 hold with the scalar parameters $\tau$ and $\sigma$ replaced by the matrices $T$ and $\Sigma$, respectively. Then the sequences $(x^n)$ and $(y^n)$ generated by Algorithm 4 converge weakly to a solution $\hat{x}$ of (4) and a solution $\hat{y}$ of (5), respectively.
Proof. The main idea of the proof is based on that of Theorem 2. First, we give some definitions and notation. Let $\mathcal{Z} = \mathcal{X} \times \mathcal{Y}$ with inner product $\langle (x, y), (x', y') \rangle_{\mathcal{Z}} = \langle x, x' \rangle + \langle y, y' \rangle$. Then $\mathcal{Z}$ is a real Hilbert space with the defined inner product.
Define the iteration operator on $\mathcal{Z}$ as in the proof of Theorem 2. It then follows that the iterative steps (1)–(3) in Algorithm 4 can be rewritten as a fixed point iteration on $\mathcal{Z}$, with the scalar step sizes replaced by the matrices $T$ and $\Sigma$.
Notice that, from condition (i), it is easy to check that the preconditioning operator is bounded, self-adjoint, and strictly positive. Hence, we can define another inner product and norm on $\mathcal{Z}$ induced by this operator. We then show that the smooth part of the iteration remains Lipschitz continuous with respect to the new metric: for any two points in $\mathcal{Z}$, the difference of the corresponding gradient terms can be estimated as in (11); introducing a suitable positive definite linear operator yields the bound (12), and substituting (12) into (11) gives the desired Lipschitz property. The rest of the convergence proof follows the same argument as Theorem 2, so we omit it here. This completes the proof.
As a matter of fact, $T$ and $\Sigma$ in Theorem 5 could be any symmetric positive definite maps. In order to ensure that the proximity operators of $G$ and $H^*$ still have closed-form solutions, it is sufficient to choose diagonal matrices for both of them. In the following, we give a practical way to choose symmetric positive definite matrices that guarantees the convergence conditions of Theorem 5. To facilitate our proof, we need the following lemma, which was obtained in [11].
Lemma 6 (see [11]). Let $K$ be a given matrix and let $T$ and $\Sigma$ be symmetric positive definite maps satisfying $\|\Sigma^{1/2} K T^{1/2}\|^2 < 1$. Define the matrix
$$M = \begin{pmatrix} T^{-1} & -K^* \\ -K & \Sigma^{-1} \end{pmatrix}.$$
Then the matrix $M$ is symmetric and positive definite.
Therefore, we are now in a position to give our choice of the matrices $T$ and $\Sigma$.
Lemma 7. Assume that $D$ is a diagonal matrix whose entries are bounded below in terms of $\beta$, the Lipschitz constant of $\nabla F$. Fix $\alpha \in [0, 2]$, and let the diagonal matrices $T$ and $\Sigma$ be defined entrywise from $D$ and from the columns and rows of $K$, respectively. Then it holds that (16) and (17) are satisfied.
Proof. Conclusion (16) follows directly from the definition of the diagonal matrices $T$ and $\Sigma$. We now prove (17). It is easy to see that proving (17) is equivalent to showing that the matrix $M$ of Lemma 6, built from $T$, $\Sigma$, and $K$, is positive definite. By Lemma 6, it is therefore sufficient to prove (20), which ensures that $M$ is positive definite. For an arbitrary vector, we expand the corresponding quadratic form and bound it using the definition of the operator norm. Finally, the strict inequality in (20) is obtained from the above argument, since at least one of the inequalities involved is strict.
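As a rough illustration of a diagonal choice in this spirit, the sketch below computes Pock–Chambolle-style diagonal step sizes from the rows and columns of $K$. Note that this is only the preconditioning rule of [11]; the paper's Lemma 7 additionally accounts for the Lipschitz constant $\beta$ of $\nabla F$, so the code should be read as an assumed, simplified instance rather than the exact rule of Lemma 7.

```python
import numpy as np

def diagonal_preconditioners(K, alpha=1.0):
    """Diagonal preconditioners in the style of Pock and Chambolle [11]:
    tau_j = 1 / sum_i |K_ij|^(2 - alpha),  sigma_i = 1 / sum_j |K_ij|^alpha,
    for alpha in [0, 2].  K is a dense matrix here purely for illustration."""
    A = np.abs(np.asarray(K, dtype=float))
    tau = 1.0 / np.maximum((A ** (2.0 - alpha)).sum(axis=0), 1e-12)   # primal (per column)
    sigma = 1.0 / np.maximum((A ** alpha).sum(axis=1), 1e-12)         # dual (per row)
    return tau, sigma
```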
4. Applications
In this section, we present an application of our proposed iterative algorithm. We aim at solving the following constrained total variation (TV) denoising problem:
$$\min_{x \in C} \; \frac{1}{2}\|x - b\|^2 + \mu\|x\|_{TV}, \qquad (23)$$
where $b$ is a noisy image contaminated by Gaussian noise, $\mu > 0$ is the regularization parameter, and $C$ is a closed convex set representing the prior information on the denoised image. By using the indicator function $\delta_C$, the constrained TV denoising problem (23) can be formulated as the following unconstrained optimization problem:
$$\min_{x} \; \frac{1}{2}\|x - b\|^2 + \delta_C(x) + \mu\|x\|_{TV}, \qquad (24)$$
where $\delta_C(x) = 0$ if $x \in C$ and $\delta_C(x) = +\infty$ otherwise. Since the total variation term can be represented as a combination of a convex function and a linear transformation matrix $K$, that is, $\mu\|x\|_{TV} = \mu\|Kx\|_1$ (see, e.g., [18]), optimization problem (24) is actually a special case of the general optimization problem (2) with $F(x) = \frac{1}{2}\|x - b\|^2$, $G(x) = \delta_C(x)$, and $H(\cdot) = \mu\|\cdot\|_1$. Note that the gradient of $F$ is $\nabla F(x) = x - b$ and the Lipschitz constant is $\beta = 1$.
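For concreteness, the problem-specific ingredients of (24) can be assembled as in the sketch below. The isotropic form of the TV term, the nonnegativity constraint (adopted later in the experiments), and the function names are our assumptions; the paper only specifies $H(Kx) = \mu\|x\|_{TV}$.

```python
import numpy as np

def make_tv_pieces(b, mu):
    """Ingredients of problem (24): F(x) = 0.5*||x - b||^2, G = indicator
    of C = {x >= 0}, and H = mu*||.||_{2,1} applied to the image gradient
    (isotropic TV is assumed here)."""
    def grad_F(x):
        return x - b                        # Lipschitz constant beta = 1
    def prox_G(x, tau):
        return np.maximum(x, 0.0)           # projection onto C; independent of tau
    def prox_Hconj(y, sigma):
        # prox of sigma*H^*: pointwise projection onto the ball of radius mu
        norm = np.sqrt((y ** 2).sum(axis=0, keepdims=True))
        return y / np.maximum(1.0, norm / mu)
    return grad_F, prox_G, prox_Hconj
```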
If the constraint set $C$ is the whole space, then the constrained TV denoising problem reduces to the unconstrained TV denoising problem
$$\min_{x} \; \frac{1}{2}\|x - b\|^2 + \mu\|x\|_{TV}.$$
This TV denoising problem is often referred to as the ROF denoising model, which was first introduced in computer vision by Rudin et al. [19].
4.1. Numerical Experiments
In the following, we present some preliminary numerical results and demonstrate the efficiency of our proposed methods. All the experiments are run on a Lenovo personal computer with a Pentium(R) Dual-Core CPU @ 2.8 GHz and 4 GB of RAM.
For all the tested iterative algorithms, the stopping criterion is
$$\frac{\|x^{n+1} - x^n\|}{\|x^n\|} \le \varepsilon,$$
where $\varepsilon$ is a given small constant, or the maximum number of iterations is reached. The reconstructed image is evaluated in terms of the signal-to-noise ratio (SNR), defined by
$$\mathrm{SNR} = 10\log_{10}\frac{\|\bar{x}\|^2}{\|\bar{x} - x\|^2},$$
where $\bar{x}$ is the original clean image and $x$ is the reconstructed image. The reconstruction time and the iteration number "Iter" are recorded when the stopping criterion is satisfied. The tested images are the well-known "Barbara," "Boat," and "Lena". These images have the same size and are displayed in Figure 1.
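A small sketch of the two bookkeeping pieces above, the relative-change stopping test and the SNR in decibels; the exact normalization of the SNR is an assumption on our part.

```python
import numpy as np

def snr_db(x_true, x_rec):
    """Signal-to-noise ratio in dB: 10*log10(||x_true||^2 / ||x_true - x_rec||^2)."""
    return 10.0 * np.log10(np.sum(x_true ** 2) / np.sum((x_true - x_rec) ** 2))

def should_stop(x_new, x_old, eps=1e-4):
    """Relative-change stopping criterion ||x_new - x_old|| <= eps * ||x_old||."""
    return np.linalg.norm(x_new - x_old) <= eps * np.linalg.norm(x_old)
```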
Figure 1: The test images "Barbara," "Boat," and "Lena" ((a), (b), and (c), respectively).
Experiment 1. We present how the iterative parameters are chosen. First, we choose different combinations of the parameters $\tau$ and $\sigma$ and then apply Algorithm 1 to solve the image denoising problem (23). We choose "Barbara" as the test image and corrupt it with random Gaussian noise with zero mean and a fixed standard deviation. Here, the Lipschitz constant is $\beta = 1$ and $\|K\|^2 \le 8$. The numerical results are reported in Table 1.
Meanwhile, we plot the SNR versus the number of iterations in Figure 2. We can see from Figure 2 that, with one of the step-size parameters fixed, the larger the other parameter, the faster the convergence; for small values, the particular choice makes no apparent difference. We therefore select the best-performing combination of $\tau$ and $\sigma$ for Algorithm 1 in the comparative experiments.
For our proposed Algorithm 4, the corresponding preconditioning matrices are chosen according to Lemma 7. Only one parameter, $\alpha$, needs to be set, and the experimental results are reported in Table 2. Figure 3 shows the SNR value versus the iteration number. We can see that the performance of the three tested choices of $\alpha$ is nearly the same, so we fix one of them for Algorithm 4 in the following tests.
Figure 2: SNR versus the number of iterations for Algorithm 1 with different parameter combinations ((a) and (b)).
Figure 3: SNR versus the number of iterations for Algorithm 4 with different choices of the parameter ((a) and (b)).
Experiment 2. We compare the performance of Algorithm 1 and our proposed Algorithm 4. The constraint set is the nonnegative orthant, that is, $C = \{x : x \ge 0\}$. To perform a fair comparison, we corrupt each of the test images with random Gaussian noise with zero mean and different levels of standard deviation. The regularization parameter $\mu$ is chosen according to the noise level.
From Table 3, we can see that our proposed Algorithm 4 converges faster than Algorithm 1 in terms of both the number of iterations and the CPU time. For the larger noise level, Algorithm 4 reaches a higher SNR value than Algorithm 1 with fewer iterations. Both algorithms eventually produce clean images with nearly the same final SNR value. To visualize the results, the noisy images and the final denoised images are presented in Figures 4, 5, and 6, respectively.
Figures 4, 5, and 6: the noisy images and the corresponding denoised images for the three test images.
5. Conclusion
In this paper, we have studied a general optimization problem involving the sum of three convex functions: a differentiable function with Lipschitz continuous gradient, a proximable function, and a linear composite term. Many interesting problems arising in image restoration and image reconstruction are special cases of this problem. Inspired by the preconditioning technique proposed by Pock and Chambolle, we have introduced a primal-dual splitting algorithm with self-adaptive step sizes to solve this problem. We have also proposed a practical way to choose these step sizes, together with a proof of convergence. Numerical results on an image denoising problem showed that the preconditioned iterative algorithm performs better than the original one with constant step sizes.
Competing Interests
The authors declare that there are no competing interests regarding the publication of this article.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (11401293, 11461046, and 11661056), the Natural Science Foundation of Jiangxi Province (20151BAB211010, 20142BAB211016), the China Postdoctoral Science Foundation (2015M571989), and the Jiangxi Province Postdoctoral Science Foundation (2015KY51).