Abstract
In this paper, we consider the separable convex programming problem with linear constraints. Its objective function is the sum of individual blocks with nonoverlapping variables, and each block consists of two convex functions, one of which is smooth. For the general case with three or more blocks, we present a gradient-based alternating direction method of multipliers with a substitution. For the proposed algorithm, we prove its convergence via the analytic framework of contractive-type methods and derive a worst-case convergence rate in the nonergodic sense. Finally, some preliminary numerical results are reported to support the efficiency of the proposed algorithm.
1. Introduction
In this paper, we consider the following convex minimization model with linear constraints and a separable objective function:

\[
\min \Big\{ \sum_{i=1}^{m} \big(\theta_i(x_i) + f_i(x_i)\big) \;\Big|\; \sum_{i=1}^{m} A_i x_i = b, \; x_i \in X_i, \; i = 1, \dots, m \Big\},
\tag{1}
\]

where the θ_i are closed proper convex functions, the f_i are smooth convex functions, the X_i are closed convex sets, the A_i are given matrices, and b is a given vector. Furthermore, we assume that each f_i has a Lipschitz-continuous gradient, i.e., there exists L_i > 0 such that

\[
\| \nabla f_i(x) - \nabla f_i(y) \| \le L_i \| x - y \|, \quad \forall x, y \in X_i.
\tag{2}
\]
Throughout the paper, the solution set of (1) is assumed to be nonempty.
A fundamental method for solving (1) in the case of m = 2 is the alternating direction method of multipliers (ADMM), which was presented originally in [1, 2]. We refer to [3, 4] for some review papers on ADMM. There are many problems of form (1) with m ≥ 3 in contemporary applications, such as the robust principal component analysis model [5], the total variation-based image restoration problem [6], the superresolution image reconstruction problem [7, 8], the multistage stochastic programming problem [9], the deblurring of Poissonian images [10], the latent variable Gaussian graphical model selection [11], the quadratic discriminant analysis model [12], and problems in electrical engineering [13, 14]. Therefore, our discussion focuses on (1) in the case of m ≥ 3.
A natural idea for solving (1) is to extend the ADMM from the special case m = 2 to the general case m ≥ 3. This straightforward extension can be written as follows:

\[
\begin{cases}
x_i^{k+1} = \arg\min_{x_i \in X_i} \Big\{ \theta_i(x_i) + f_i(x_i) - (\lambda^k)^{\top} A_i x_i + \dfrac{\beta}{2} \Big\| \sum_{j<i} A_j x_j^{k+1} + A_i x_i + \sum_{j>i} A_j x_j^{k} - b \Big\|^2 \Big\}, & i = 1, \dots, m, \\[2mm]
\lambda^{k+1} = \lambda^k - \beta \Big( \sum_{i=1}^{m} A_i x_i^{k+1} - b \Big),
\end{cases}
\tag{3}
\]

where λ is the Lagrange multiplier associated with the linear constraint and β > 0 is a penalty parameter.
The convergence of (3) has been proved in some special cases (see [15–17]). Unfortunately, without further conditions, the direct extension of ADMM (3) may fail to converge for the general case m ≥ 3 (see [18]). In [19, 20], the authors present two convergent semiproximal ADMM variants for two types of 3-block problems. Recently, He et al. [21] showed that if a new iterate is generated by correcting the output of (3) with a substitution procedure, then the sequence of iterates converges to a solution of (1). Since then, several variants of the ADMM have been proposed for solving (1) (see [21–26]).
In (3), all the x_i-related subproblems are of the form (4), with a certain known vector and a symmetric positive definite matrix. When this matrix is not the identity, problem (4) becomes complicated to solve. A popular technique is to linearize the quadratic term of (4) (see [27, 28]); that is, one solves problem (5), with a certain known constant, instead of (4). In general, one can solve problem (6) instead of (4), where the added quadratic term is centered at the current iterate. If the weighting matrix in (6) is a multiple of the identity, then (6) reduces to the form of (5).
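To make the linearization idea concrete in generic notation (the symbols below are illustrative and need not coincide with those of (4)–(6)): linearizing the quadratic term of an augmented-Lagrangian subproblem amounts to replacing

\[
\frac{\beta}{2}\, \| A x - q \|^2
\quad \text{by} \quad
\frac{\beta}{2}\, \| A x^k - q \|^2 + \beta\, (A x^k - q)^{\top} A\, (x - x^k) + \frac{\tau}{2}\, \| x - x^k \|^2,
\]

where x^k is the current iterate. The replacement majorizes the original term whenever τ ≥ β λ_max(A^⊤A), and the resulting subproblem involves only a separable proximal term, which is usually much easier to handle.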
Since each f_i is smooth, the following problem is easier to solve than (6):
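The reason the smooth parts can be handled by a gradient step is the standard descent lemma: since each f_i has an L_i-Lipschitz-continuous gradient (see (2)),

\[
f_i(x) \le f_i(y) + \nabla f_i(y)^{\top} (x - y) + \frac{L_i}{2}\, \| x - y \|^2, \quad \forall x, y \in X_i,
\]

so f_i can be replaced in the subproblem by this quadratic majorant centered at the current iterate, leaving only the possibly nonsmooth part θ_i together with a separable quadratic term.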
Now, we can give the gradient-based ADMM (G-ADMM) iterative scheme as follows:
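To fix ideas, the following Python sketch shows one plausible realization of a gradient-based ADMM sweep of this type; the Gauss–Seidel block ordering, the proximal parameters tau[i], and the helper names grad_f and prox_theta are illustrative assumptions rather than the exact scheme (8).

def g_admm_sweep(x, lam, A, b, beta, tau, grad_f, prox_theta):
    # x: list of current block variables x_i (numpy arrays); lam: multiplier estimate
    # A: list of matrices A_i; b: right-hand side; beta: penalty parameter
    # tau[i]: proximal parameter for block i (e.g., tau[i] >= L_i + beta * ||A_i||^2)
    # grad_f[i]: gradient of the smooth part f_i
    # prox_theta[i](v, t): proximal map of t * theta_i (the constraint set X_i handled inside it)
    m = len(x)
    x_new = [xi.copy() for xi in x]
    for i in range(m):
        # constraint residual, using the already-updated blocks j < i (Gauss-Seidel sweep)
        r = sum(A[j] @ x_new[j] for j in range(i)) \
            + sum(A[j] @ x[j] for j in range(i, m)) - b
        # gradient of the smooth part plus the linearized augmented term
        g = grad_f[i](x[i]) + A[i].T @ (beta * r - lam)
        # proximal (or projected) step on the nonsmooth part theta_i
        x_new[i] = prox_theta[i](x[i] - g / tau[i], 1.0 / tau[i])
    # multiplier update
    lam_new = lam - beta * (sum(A[i] @ x_new[i] for i in range(m)) - b)
    return x_new, lam_new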
In this paper, we propose a gradient-based ADMM with a substitution based on scheme (8). The rest of the paper is organized as follows. In Section 2, we provide some preliminaries for further analysis. In Section 3, we present the gradient-based alternating direction method of multipliers with a substitution (G-ADMM-S) for solving (1) and prove its convergence. In Section 4, we establish a worst-case iteration complexity for the proposed algorithm in the nonergodic sense. In Section 5, some preliminary numerical results are reported to support the efficiency of the proposed algorithm. Finally, some conclusions are given in Section 6.
2. Preliminaries
In this section, we provide some preliminaries. For a symmetric matrix H, we write H ≻ 0 (H ⪰ 0) to denote that H is a positive definite (semidefinite) matrix. For any positive definite matrix H, we denote by ‖x‖_H = (x^⊤ H x)^{1/2} the H-norm. If H is the product of a positive parameter β and the identity matrix I, i.e., H = βI, we use the simpler notation ‖x‖_β. Let θ be an extended real-valued function. The domain of θ is denoted by dom θ = {x : θ(x) < +∞}. We say that θ is convex if

\[
\theta\big(\alpha x + (1-\alpha) y\big) \le \alpha\, \theta(x) + (1-\alpha)\, \theta(y), \quad \forall x, y \in \operatorname{dom} \theta, \; \forall \alpha \in [0, 1].
\]
For a convex function θ, the subdifferential of θ is the set-valued operator defined by

\[
\partial \theta(x) = \big\{ d : \theta(y) \ge \theta(x) + d^{\top} (y - x), \; \forall y \in \operatorname{dom} \theta \big\}.
\]
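For instance, for the absolute value function θ(t) = |t| on the real line, this definition gives

\[
\partial \theta(t) =
\begin{cases}
\{\operatorname{sign}(t)\}, & t \neq 0, \\
[-1, 1], & t = 0,
\end{cases}
\]

which is the prototypical example of a set-valued subdifferential.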
2.1. Variational Characterizations of (1)
Let , , and . Since all θ_i and f_i are convex functions, by invoking the first-order necessary and sufficient optimality condition for convex programming, one can easily verify that problem (1) is characterized by the following variational inequality: find a point such that the inequality below holds for all feasible points.
The Lagrange function of (1) is given by
Let be a saddle point of the Lagrange function . That is, for any and ,
Finding a saddle point of is equivalent to finding a such that
Let , and
Then, (14) can be rewritten as the following variational inequality (VI): find a point such that the inequality below holds. Let the solution set of this VI be denoted accordingly. Since we have assumed that the solution set of (1) is nonempty, the solution set of the VI is also nonempty. It follows from the definition of the VI that
2.2. Some Notations
Let , , , , and . Let and () be given positive definite matrices. λ_max(·) denotes the maximum eigenvalue of a matrix, and λ_min(·) denotes the minimum eigenvalue of a matrix. The following notations will be used in the later analysis:
It is easy to see that
3. Algorithm and Convergence Analysis
In this section, we first describe G-ADMM-S and then prove its convergence via the analytic framework of contractive-type methods [29]. Throughout this section, we assume that (). The iterative scheme of G-ADMM-S for solving (1) is stated in Algorithm G-ADMM-S below:
Let the quantities in (18) and (19) be defined as in Section 2.2. Start with an initial iterate. With the given iterate, the new iterate is generated as follows:
Step 1 (G-ADMM procedure). Execute scheme (8) to generate the intermediate iterate.
Step 2 (substitution procedure). Generate the new iterate via the substitution step, where the step size is specified below.
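The control flow of the two steps can be sketched as follows, reusing g_admm_sweep from the sketch in Section 1; the simple relaxation used here for the substitution step and the crude stopping test are illustrative placeholders for the paper's correction formula (with its optimal step size) and stopping criterion.

import numpy as np

def g_admm_s(x0, lam0, A, b, beta, tau, grad_f, prox_theta,
             alpha=1.0, tol=1e-6, max_iter=20000):
    # Step 1 of each pass: G-ADMM prediction sweep; Step 2: substitution (correction).
    x = [xi.copy() for xi in x0]
    lam = lam0.copy()
    for k in range(max_iter):
        x_t, lam_t = g_admm_sweep(x, lam, A, b, beta, tau, grad_f, prox_theta)
        # crude stopping test on the gap between the current iterate and its prediction
        gap = max(float(np.linalg.norm(xt - xi)) for xt, xi in zip(x_t, x))
        gap = max(gap, float(np.linalg.norm(lam_t - lam)))
        if gap < tol:
            return x_t, lam_t, k
        # substitution step: move from the current point toward the prediction
        x = [xi + alpha * (xt - xi) for xi, xt in zip(x, x_t)]
        lam = lam + alpha * (lam_t - lam)
    return x, lam, max_iter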
Next, we establish the global convergence of Algorithm G-ADMM-S following the analytic framework of contractive-type methods. We outline the proof sketch as follows:
(1) Prove that the direction used in the substitution step is a descent direction of the function at the current point whenever the current point is not a solution, where the intermediate point is generated by G-ADMM scheme (8).
(2) Prove that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the solution set.
(3) Establish the convergence.
Accordingly, we divide the convergence analysis into three subsections to address the claims listed above.
3.1. Verification of the Descent Direction
In this section, we show that is a descent direction of the function at the point whenever and . For this purpose, we first prove an important inequality for the output of G-ADMM procedure (8), which will be used often in our further discussion.
Theorem 1. The output of G-ADMM procedure (8) satisfies assertion (23), where .
Proof. By the optimality condition of the -related subproblem in (8), for , we have andwhere is the indicator function of the set . Thus, and there exists such thatwhere . From the subgradient inequality, one hasFrom the definition of , one hasThat is,for all . Substituting (see (8)) in the above inequality, we obtainfor all . Summing the above inequality over , we obtainwhereThen, by adding the following termto both sides of (30), we getSince , we haveCombining the above two formulas, we havewhereUsing the notations of (see (15)) and (see (18)), assertion (23) is proved.
Based on assertion (23), we can get the following result.
Corollary 1.
Proof. It follows from (23) that the relation below holds. Using (17) and the optimality of , we have the next inequality. Thus, the bound follows. Since and , the assertion is proved.
The next theorem implies that is a descent direction of the function at the point whenever .
Theorem 2. For all ,
Proof. It follows from (37) that the following holds; that is, . Now, we treat the first term of the right-hand side of (43), where the first inequality follows from the Lipschitz continuity of . Then, let us deal with the second term of the right-hand side of (43). Thus, the desired estimate holds, where
The assertion follows from the above three formulas.
Since and , whenever , assertion (42) shows the positivity of the term , and thus, the direction is a descent direction of the function at the point .
3.2. Contractive Property
In this section, we show that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the set .
Since the direction is a descent direction of the function at the point , the new iterate can be generated by the update below. Thus, we obtain (49), where the inequality follows from the first inequality of (42).
Let . Note that the right-hand side of (49), i.e., , is a quadratic function of . In order to obtain the closest proximity to , we wish to maximize this quadratic function, which prompts us to take the optimal value of as follows. With this choice of step size, it follows from (49) that
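The step-size rule is elementary calculus: in generic notation (illustrative symbols), if the guaranteed decrease on the right-hand side of (49) has the form q(α) = 2αφ − α²c with φ, c > 0, then

\[
\alpha^{*} = \arg\max_{\alpha}\, q(\alpha) = \frac{\varphi}{c},
\qquad
q(\alpha^{*}) = \frac{\varphi^{2}}{c} > 0,
\]

so choosing the maximizer guarantees a strictly positive decrease of the squared distance to the solution set at every iteration.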
Let
From the definition of (see (18)), one has the following, where
It follows from (19) and (42) that
It is easy to see that .
Next, we show that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the set .
Theorem 3. Let the sequence be generated by the proposed Algorithm G-ADMM-S. Then,
Proof. Using (21) and (22), we obtain the following chain of inequalities, where the first inequality follows from the first inequality of (42) and the second inequality follows from (54)–(56).
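In generic notation, a contraction inequality of the type established in Theorem 3, say ‖w^{k+1} − w*‖_G² ≤ ‖w^k − w*‖_G² − c ‖w^k − w̃^k‖_G² with c > 0 (the symbols are illustrative), immediately yields the two facts used in the next subsection:

\[
\| w^{k} - w^{*} \|_{G} \le \| w^{0} - w^{*} \|_{G} \;\; \text{for all } k,
\qquad
\sum_{k=0}^{\infty} c\, \| w^{k} - \widetilde{w}^{k} \|_{G}^{2} \le \| w^{0} - w^{*} \|_{G}^{2},
\]

i.e., the iterates remain bounded (Fejér monotonicity with respect to the solution set), and the difference between each iterate and its G-ADMM output is square-summable and hence tends to zero.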
3.3. Convergence Result
In this section, we establish the global convergence for Algorithm G-ADMM-S based on the analytic framework of contraction methods in [29].
Theorem 4 (global convergence). Let the sequence be generated by Algorithm G-ADMM-S. Then, there exists such that
Proof. It follows from (57) that the sequence is bounded andwhich implies that .
Since is bounded, the sequence has at least one cluster point and we denote it by . In addition, let be the subsequence converging to . Since , converges to .
By taking the limit over in (15), we have thatTherefore, is a solution point of . By using (57), we haveand thus, .
4. Iteration Complexity
In this section, we will show that after a prescribed number of iterations of Algorithm G-ADMM-S, we can ensure that the accuracy bound below holds, where the constant is specified there. Thus, a worst-case iteration complexity is established in the nonergodic sense for Algorithm G-ADMM-S.
Theorem 5. Let the sequence be generated by Algorithm G-ADMM-S. Then, the bound (64) holds, where .
Proof. It follows from (57) that the following holds. Thus, for any integer , we obtain the estimate below, and consequently, we obtain assertion (64).
Recall that the solution set is convex and closed under our assumptions (see Theorem 2.3.5 in [30]). For any given tolerance , inequality (64) indicates that Algorithm G-ADMM-S requires at most the corresponding number of iterations to fulfill the requirement .
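The complexity estimate follows from the square-summability noted above by a standard argument: in the generic notation used earlier, for every integer N ≥ 0,

\[
(N+1)\, \min_{0 \le k \le N} \| w^{k} - \widetilde{w}^{k} \|_{G}^{2}
\;\le\; \sum_{k=0}^{N} \| w^{k} - \widetilde{w}^{k} \|_{G}^{2}
\;\le\; \frac{1}{c}\, \| w^{0} - w^{*} \|_{G}^{2},
\]

so the smallest residual among the first N + 1 iterations decays at the rate O(1/N), and a tolerance ε on this squared residual is reached after at most ⌈‖w^0 − w*‖_G² / (cε)⌉ iterations.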
5. Numerical Results
To investigate the numerical performance of the proposed algorithm, we apply it to a convex quadratic programming problem and a nonlinear convex programming problem with separable structure and report some preliminary numerical results. All codes were written in MATLAB 2016a, and all the numerical experiments were conducted on a Dell desktop computer with an Intel(R) Core(TM) processor at 3.30 GHz and 4 GB of memory.
5.1. Quadratic Programming Problem
First, we consider the following quadratic programming problem: In the experiments, we set and , (). We set the matrix and construct the rest of the matrices in a way similar to [25, 31]. That is, , where are random matrices. In our tests, we set and generate the matrices with the corresponding MATLAB function. In the experiments, we set and the radius . For the linear constraints, the entries of () are uniformly distributed in with density 0.1, and . The remaining data are generated by () with the corresponding MATLAB function. In order to guarantee the feasibility of the problem, we set , (), and . Thus, is an optimal solution of (68). Algorithm G-ADMM-S is compared with the PPSM-C in [25]. The initial iteration points are the zero vectors () and for all tested algorithms. We set a maximal number of 20000 iterations for the tested algorithms with a modified stopping criterion as follows:
Now, we specify the choices of parameters used to implement these algorithms. First, we set with and the relaxation parameter for all tested algorithms. For "PPSM-C", we set (), where represents the Frobenius norm. For G-ADMM-S, we consider two cases of the matrices ():
Case 1: with
Case 2: with
In order to investigate the stability and efficiency of our algorithms, we test 16 groups of problems with random data. Some preliminary numerical results are reported in Table 1. Since these are synthetic examples with random data, for each scenario, we test 10 times and report the average performance. Specifically, we report the number of iterations ("Iter.") and the computing time in seconds ("Time") for all the tested methods. The data in Table 1 show that Algorithm G-ADMM-S ( with ) is more efficient than the other tested algorithms on these test problems.
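Although the experiments themselves were run in MATLAB, random test data of the kind described above can be generated along the following lines; this Python sketch is only illustrative (the interval [-1, 1] for the entries, the helper name make_qp_instance, and the omission of the quadratic-objective matrices from [25, 31] are assumptions).

import numpy as np
import scipy.sparse as sp

def make_qp_instance(m, n, l, density=0.1, seed=0):
    rng = np.random.default_rng(seed)
    A = []
    for _ in range(m):
        # sparse constraint matrix: uniform entries (interval [-1, 1] assumed), ~10% density
        dense = rng.uniform(-1.0, 1.0, size=(l, n))
        mask = rng.random((l, n)) < density
        A.append(sp.csr_matrix(dense * mask))
    # plant a known point x_i^* and set b accordingly, so the problem is feasible by construction
    x_star = [rng.uniform(-1.0, 1.0, n) for _ in range(m)]
    b = sum(A[i] @ x_star[i] for i in range(m))
    return A, b, x_star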
5.2. Nonlinear Convex Programming Problem
In this section, we consider the following nonlinear convex programming problem: where and is a positive matrix. In the experiments, we set . It is easy to see that, when the PPSM-C in [25] is applied to (71), the subproblems have no explicit solutions. Therefore, in this section, we only use Algorithm G-ADMM-S to solve (71). In the experiments, the entries of () are uniformly distributed in with density 0.1, and . The remaining data , , and are generated with the corresponding MATLAB function. We set the matrix in a way similar to [25, 31]. In order to guarantee the feasibility of the problem, we set and . Thus, is an optimal solution of (71). The initial iteration points are the zero vectors () and for all tested algorithms. We set a maximal number of 20000 iterations for the proposed algorithm with a modified stopping criterion as follows:
Now, we specify the choices of parameters to implement these algorithms. We set with , the relaxation parameter , , , , and (). We consider two cases of the parameter : Case 1: ; Case 2: .
We test 7 groups of problems with random data. Numerical results are reported in Table 2. For each scenario, we test 5 times and report the average performance. Specifically, we report the number of iterations ("Iter."), the computing time in seconds ("Time"), and the absolute error of the objective function value ("f-error"). The numerical results show that Algorithm G-ADMM-S is effective.
6. Conclusion
In this paper, for the linearly constrained separable convex programming problem, whose objective function is the sum of individual convex blocks with nonoverlapping variables, we have presented a gradient-based ADMM with a substitution for the general case of three or more blocks. We have analysed its convergence and iteration complexity. The preliminary numerical results have shown the efficiency of the proposed algorithm.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The first author would like to extend his sincere gratitude to his doctoral supervisor, Prof. Caozong Cheng, for his guidance.