Abstract
In this paper, we consider the separable convex programming problem with linear constraints. Its objective function is the sum of individual blocks with nonoverlapping variables, and each block consists of two convex functions, one of which is smooth. For the general case with three or more blocks, we present a gradient-based alternating direction method of multipliers with a substitution. For the proposed algorithm, we prove its convergence via the analytic framework of contractive-type methods and derive a worst-case convergence rate in the nonergodic sense. Finally, some preliminary numerical results are reported to support the efficiency of the proposed algorithm.
1. Introduction
In this paper, we consider the following convex minimization model with linear constraints and a separable objective function:

\[
\min \Big\{ \sum_{i=1}^{m} \big(\theta_i(x_i) + f_i(x_i)\big) \;\Big|\; \sum_{i=1}^{m} A_i x_i = b, \; x_i \in X_i, \; i = 1, \dots, m \Big\},
\tag{1}
\]

where the θ_i are closed proper convex functions, the f_i are smooth convex functions, the X_i are closed convex sets, the A_i are given matrices, and b is a given vector. Furthermore, we assume that each f_i has a Lipschitz-continuous gradient, i.e., there exists L_i > 0 such that

\[
\| \nabla f_i(x) - \nabla f_i(y) \| \le L_i \| x - y \|, \quad \forall x, y \in X_i.
\tag{2}
\]
Throughout the paper, the solution set of (1) is assumed to be nonempty.
A fundamental method for solving (1) in the case of m = 2 is the alternating direction method of multipliers (ADMM), which was presented originally in [1, 2]. We refer to [3, 4] for some review papers on ADMM. There are many problems of form (1) with m ≥ 3 in contemporary applications, such as the robust principal component analysis model [5], the total variation-based image restoration problem [6], the superresolution image reconstruction problem [7, 8], the multistage stochastic programming problem [9], the deblurring of Poissonian images [10], the latent variable Gaussian graphical model selection [11], the quadratic discriminant analysis model [12], and problems in electrical engineering [13, 14]. Therefore, our discussion focuses on (1) in the case of m ≥ 3.
A natural idea for solving (1) is to extend the ADMM from the special case m = 2 to the general case m ≥ 3. This straightforward extension can be written as follows:

\[
\begin{cases}
x_i^{k+1} = \arg\min_{x_i \in X_i} \Big\{ \theta_i(x_i) + f_i(x_i) - (\lambda^k)^{\top} A_i x_i + \dfrac{\beta}{2} \Big\| \sum_{j<i} A_j x_j^{k+1} + A_i x_i + \sum_{j>i} A_j x_j^{k} - b \Big\|^2 \Big\}, & i = 1, \dots, m, \\[2mm]
\lambda^{k+1} = \lambda^k - \beta \Big( \sum_{i=1}^{m} A_i x_i^{k+1} - b \Big),
\end{cases}
\tag{3}
\]

where λ is the Lagrange multiplier associated with the linear constraint and β > 0 is a penalty parameter.
The convergence of (3) has been proved in some special cases (see [15–17]). Unfortunately, without further conditions, the direct extension of ADMM (3) may fail to converge for the general case m ≥ 3 (see [18]). In [19, 20], the authors present two convergent semiproximal ADMM variants for two types of 3-block problems. Recently, He et al. [21] showed that if a new iterate is generated by correcting the output of (3) with a substitution procedure, then the sequence of iterates converges to a solution of (1). Since then, several variants of the ADMM have been proposed for solving (1) (see [21–26]).
In (3), all the x_i-related subproblems are of the form (4), with a certain known vector and a symmetric positive definite matrix. When this matrix is not the identity, problem (4) becomes complicated to solve. A popular technique is to linearize the quadratic term of (4) (see [27, 28]); that is, one solves problem (5), with a certain known constant, instead of (4). In general, one can solve problem (6) instead of (4), where the added quadratic term is centered at the current iterate. If the weighting matrix in (6) is a multiple of the identity, then (6) reduces to the form of (5).
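To make the linearization idea concrete in generic notation (the symbols below are illustrative and need not coincide with those of (4)–(6)): linearizing the quadratic term of an augmented-Lagrangian subproblem amounts to replacing

\[
\frac{\beta}{2}\, \| A x - q \|^2
\quad \text{by} \quad
\frac{\beta}{2}\, \| A x^k - q \|^2 + \beta\, (A x^k - q)^{\top} A\, (x - x^k) + \frac{\tau}{2}\, \| x - x^k \|^2,
\]

where x^k is the current iterate. The replacement majorizes the original term whenever τ ≥ β λ_max(A^⊤A), and the resulting subproblem involves only a separable proximal term, which is usually much easier to handle.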
Since each f_i is smooth, the following problem is easier to solve than (6):
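The reason the smooth parts can be handled by a gradient step is the standard descent lemma: since each f_i has an L_i-Lipschitz-continuous gradient (see (2)),

\[
f_i(x) \le f_i(y) + \nabla f_i(y)^{\top} (x - y) + \frac{L_i}{2}\, \| x - y \|^2, \quad \forall x, y \in X_i,
\]

so f_i can be replaced in the subproblem by this quadratic majorant centered at the current iterate, leaving only the possibly nonsmooth part θ_i together with a separable quadratic term.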
Now, we can give the gradient-based ADMM (G-ADMM) iterative scheme as follows:
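To fix ideas, the following Python sketch shows one plausible realization of a gradient-based ADMM sweep of this type; the Gauss–Seidel block ordering, the proximal parameters tau[i], and the helper names grad_f and prox_theta are illustrative assumptions rather than the exact scheme (8).

def g_admm_sweep(x, lam, A, b, beta, tau, grad_f, prox_theta):
    # x: list of current block variables x_i (numpy arrays); lam: multiplier estimate
    # A: list of matrices A_i; b: right-hand side; beta: penalty parameter
    # tau[i]: proximal parameter for block i (e.g., tau[i] >= L_i + beta * ||A_i||^2)
    # grad_f[i]: gradient of the smooth part f_i
    # prox_theta[i](v, t): proximal map of t * theta_i (the constraint set X_i handled inside it)
    m = len(x)
    x_new = [xi.copy() for xi in x]
    for i in range(m):
        # constraint residual, using the already-updated blocks j < i (Gauss-Seidel sweep)
        r = sum(A[j] @ x_new[j] for j in range(i)) \
            + sum(A[j] @ x[j] for j in range(i, m)) - b
        # gradient of the smooth part plus the linearized augmented term
        g = grad_f[i](x[i]) + A[i].T @ (beta * r - lam)
        # proximal (or projected) step on the nonsmooth part theta_i
        x_new[i] = prox_theta[i](x[i] - g / tau[i], 1.0 / tau[i])
    # multiplier update
    lam_new = lam - beta * (sum(A[i] @ x_new[i] for i in range(m)) - b)
    return x_new, lam_new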
In this paper, we propose a gradient-based ADMM with a substitution based on scheme (8). The rest of the paper is organized as follows. In Section 2, we provide some preliminaries for further analysis. In Section 3, we present the gradient-based alternating direction method of multipliers with a substitution (G-ADMM-S) for solving (1) and prove its convergence. In Section 4, we establish a worst-case iteration complexity for the proposed algorithm in the nonergodic sense. In Section 5, some preliminary numerical results are reported to support the efficiency of the proposed algorithm. Finally, some conclusions are given in Section 6.
2. Preliminaries
In this section, we provide some preliminaries. For a symmetric matrix H, we write H ≻ 0 (H ⪰ 0) to denote that H is a positive definite (semidefinite) matrix. For any positive definite matrix H, we denote by ‖x‖_H = (x^⊤ H x)^{1/2} the H-norm. If H is the product of a positive parameter β and the identity matrix I, i.e., H = βI, we use the simpler notation ‖x‖_β. Let θ be an extended real-valued function. The domain of θ is denoted by dom θ = {x : θ(x) < +∞}. We say that θ is convex if

\[
\theta\big(\alpha x + (1-\alpha) y\big) \le \alpha\, \theta(x) + (1-\alpha)\, \theta(y), \quad \forall x, y \in \operatorname{dom} \theta, \; \forall \alpha \in [0, 1].
\]
For a convex function θ, the subdifferential of θ is the set-valued operator defined by

\[
\partial \theta(x) = \big\{ d : \theta(y) \ge \theta(x) + d^{\top} (y - x), \; \forall y \in \operatorname{dom} \theta \big\}.
\]
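For instance, for the absolute value function θ(t) = |t| on the real line, this definition gives

\[
\partial \theta(t) =
\begin{cases}
\{\operatorname{sign}(t)\}, & t \neq 0, \\
[-1, 1], & t = 0,
\end{cases}
\]

which is the prototypical example of a set-valued subdifferential.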
2.1. Variational Characterizations of (1)
Let , , and . Since all θ_i and f_i are convex functions, by invoking the first-order necessary and sufficient optimality condition for convex programming, one can easily verify that problem (1) is characterized by the following variational inequality: find a point such that the inequality below holds for all feasible points.
The Lagrange function of (1) is given by
Let be a saddle point of the Lagrange function . That is, for any and ,
Finding a saddle point of is equivalent to finding a such that
Let , and
Then, (14) can be rewritten as the following variational inequality (VI): find a point such that the inequality below holds. Let the solution set of this VI be denoted accordingly. Since we have assumed that the solution set of (1) is nonempty, the solution set of the VI is also nonempty. It follows from the definition of the VI that
2.2. Some Notations
Let , , , , and . Let and () be given positive definite matrices. λ_max(·) denotes the maximum eigenvalue of a matrix, and λ_min(·) denotes the minimum eigenvalue of a matrix. The following notations will be used in the later analysis:
It is easy to see that
3. Algorithm and Convergence Analysis
In this section, we first describe G-ADMM-S and then prove its convergence via the analytic framework of contractive-type methods [29]. Throughout this section, we assume that (). The iterative scheme of G-ADMM-S for solving (1) is stated in Algorithm G-ADMM-S below:
Let the quantities in (18) and (19) be defined as in Section 2.2. Start with an initial iterate. With the given iterate, the new iterate is generated as follows:
Step 1 (G-ADMM procedure). Execute scheme (8) to generate the intermediate iterate.
Step 2 (substitution procedure). Generate the new iterate via the substitution step, where the step size is specified below.
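The control flow of the two steps can be sketched as follows, reusing g_admm_sweep from the sketch in Section 1; the simple relaxation used here for the substitution step and the crude stopping test are illustrative placeholders for the paper's correction formula (with its optimal step size) and stopping criterion.

import numpy as np

def g_admm_s(x0, lam0, A, b, beta, tau, grad_f, prox_theta,
             alpha=1.0, tol=1e-6, max_iter=20000):
    # Step 1 of each pass: G-ADMM prediction sweep; Step 2: substitution (correction).
    x = [xi.copy() for xi in x0]
    lam = lam0.copy()
    for k in range(max_iter):
        x_t, lam_t = g_admm_sweep(x, lam, A, b, beta, tau, grad_f, prox_theta)
        # crude stopping test on the gap between the current iterate and its prediction
        gap = max(float(np.linalg.norm(xt - xi)) for xt, xi in zip(x_t, x))
        gap = max(gap, float(np.linalg.norm(lam_t - lam)))
        if gap < tol:
            return x_t, lam_t, k
        # substitution step: move from the current point toward the prediction
        x = [xi + alpha * (xt - xi) for xi, xt in zip(x, x_t)]
        lam = lam + alpha * (lam_t - lam)
    return x, lam, max_iter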
Next, we establish the global convergence of Algorithm G-ADMM-S following the analytic framework of contractive-type methods. We outline the proof sketch as follows:
(1) Prove that the direction used in the substitution step is a descent direction of the function at the current point whenever the current point is not a solution, where the intermediate point is generated by G-ADMM scheme (8).
(2) Prove that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the solution set.
(3) Establish the convergence.
Accordingly, we divide the convergence analysis into three subsections to address the claims listed above.
3.1. Verification of the Descent Direction
In this section, we show that is a descent direction of the function at the point whenever and . For this purpose, we first prove an important inequality for the output of G-ADMM procedure (8), which will be used often in our further discussion.
Theorem 1. The output of G-ADMM procedure (8) satisfies assertion (23), where .
Proof. By the optimality condition of the -related subproblem in (8), for , we have andwhere is the indicator function of the set . Thus, and there exists such thatwhere . From the subgradient inequality, one hasFrom the definition of , one hasThat is,for all . Substituting (see (8)) in the above inequality, we obtainfor all . Summing the above inequality over , we obtainwhereThen, by adding the following termto both sides of (30), we getSince , we haveCombining the above two formulas, we havewhereUsing the notations of (see (15)) and (see (18)), assertion (23) is proved.
Based on assertion (23), we can get the following result.
Corollary 1.
Proof. It follows from (23) that the relation below holds. Using (17) and the optimality of , we have the next inequality. Thus, the bound follows. Since and , the assertion is proved.
The next theorem implies that is a descent direction of the function at the point whenever .
Theorem 2. For all ,
Proof. It follows from (37) that the following holds; that is, . Now, we treat the first term of the right-hand side of (43), where the first inequality follows from the Lipschitz continuity of . Then, let us deal with the second term of the right-hand side of (43). Thus, the desired estimate holds, where
The assertion follows from the above three formulas.
Since and , whenever , assertion (42) shows the positivity of the term , and thus, the direction is a descent direction of the function at the point .
3.2. Contractive Property
In this section, we show that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the set .
Since the direction is a descent direction of the function at the point , the new iterate can be generated by the update below. Thus, we obtain (49), where the inequality follows from the first inequality of (42).
Let . Note that the right-hand side of (49), i.e., , is a quadratic function of . In order to obtain the closest proximity to , we wish to maximize this quadratic function, which prompts us to take the optimal value of as follows. With this choice of step size, it follows from (49) that
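The step-size rule is elementary calculus: in generic notation (illustrative symbols), if the guaranteed decrease on the right-hand side of (49) has the form q(α) = 2αφ − α²c with φ, c > 0, then

\[
\alpha^{*} = \arg\max_{\alpha}\, q(\alpha) = \frac{\varphi}{c},
\qquad
q(\alpha^{*}) = \frac{\varphi^{2}}{c} > 0,
\]

so choosing the maximizer guarantees a strictly positive decrease of the squared distance to the solution set at every iteration.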
Let
From the definition of (see (18)), one has the following, where
It follows from (19) and (42) that
It is easy to see that .
Next, we show that the sequence generated by Algorithm G-ADMM-S is contractive with respect to the set .
Theorem 3. Let the sequence be generated by the proposed Algorithm G-ADMM-S. Then,
Proof. Using (21) and (22), we obtain the following chain of inequalities, where the first inequality follows from the first inequality of (42) and the second inequality follows from (54)–(56).
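In generic notation, a contraction inequality of the type established in Theorem 3, say ‖w^{k+1} − w*‖_G² ≤ ‖w^k − w*‖_G² − c ‖w^k − w̃^k‖_G² with c > 0 (the symbols are illustrative), immediately yields the two facts used in the next subsection:

\[
\| w^{k} - w^{*} \|_{G} \le \| w^{0} - w^{*} \|_{G} \;\; \text{for all } k,
\qquad
\sum_{k=0}^{\infty} c\, \| w^{k} - \widetilde{w}^{k} \|_{G}^{2} \le \| w^{0} - w^{*} \|_{G}^{2},
\]

i.e., the iterates remain bounded (Fejér monotonicity with respect to the solution set), and the difference between each iterate and its G-ADMM output is square-summable and hence tends to zero.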
3.3. Convergence Result
In this section, we establish the global convergence for Algorithm G-ADMM-S based on the analytic framework of contraction methods in [29].
Theorem 4 (global convergence). Let the sequence be generated by Algorithm G-ADMM-S. Then, there exists such that
Proof. It follows from (57) that the sequence is bounded andwhich implies that .
Since is bounded, the sequence has at least one cluster point and we denote it by . In addition, let be the subsequence converging to . Since , converges to .
By taking the limit over in (15), we have thatTherefore, is a solution point of . By using (57), we haveand thus, .
4. Iteration Complexity
In this section, we will show that after a prescribed number of iterations of Algorithm G-ADMM-S, we can ensure that the accuracy bound below holds, where the constant is specified there. Thus, a worst-case iteration complexity is established in the nonergodic sense for Algorithm G-ADMM-S.
Theorem 5. Let the sequence be generated by Algorithm G-ADMM-S. Then, the bound (64) holds, where .
Proof. It follows from (57) that the following holds. Thus, for any integer , we obtain the estimate below, and consequently, we obtain assertion (64).
Recall that the solution set is convex and closed under our assumptions (see Theorem 2.3.5 in [30]). For any given tolerance , inequality (64) indicates that Algorithm G-ADMM-S requires at most the corresponding number of iterations to fulfill the requirement .
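The complexity estimate follows from the square-summability noted above by a standard argument: in the generic notation used earlier, for every integer N ≥ 0,

\[
(N+1)\, \min_{0 \le k \le N} \| w^{k} - \widetilde{w}^{k} \|_{G}^{2}
\;\le\; \sum_{k=0}^{N} \| w^{k} - \widetilde{w}^{k} \|_{G}^{2}
\;\le\; \frac{1}{c}\, \| w^{0} - w^{*} \|_{G}^{2},
\]

so the smallest residual among the first N + 1 iterations decays at the rate O(1/N), and a tolerance ε on this squared residual is reached after at most ⌈‖w^0 − w*‖_G² / (cε)⌉ iterations.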
5. Numerical Results
To investigate the numerical performance of the proposed algorithm, we apply it to a convex quadratic programming problem and a nonlinear convex programming problem with separable structure and report some preliminary numerical results. All codes were written in MATLAB 2016a, and all the numerical experiments were conducted on a Dell desktop computer with an Intel(R) Core(TM) processor at 3.30 GHz and 4 GB of memory.
5.1. Quadratic Programming Problem
First, we consider the following quadratic programming problem: In the experiments, we set and , (). We set the matrix and construct the rest of the matrices in a way similar to [25, 31]. That is, , where are random matrices. In our tests, we set and generate the matrices with the corresponding MATLAB function. In the experiments, we set and the radius . For the linear constraints, the entries of () are uniformly distributed in with density 0.1, and . The remaining data are generated by () with the corresponding MATLAB function. In order to guarantee the feasibility of the problem, we set , (), and . Thus, is an optimal solution of (68). Algorithm G-ADMM-S is compared with the PPSM-C in [25]. The initial iteration points are the zero vectors () and for all tested algorithms. We set a maximal number of 20000 iterations for the tested algorithms with a modified stopping criterion as follows:
Now, we specify the choices of parameters used to implement these algorithms. First, we set with and the relaxation parameter for all tested algorithms. For "PPSM-C", we set (), where represents the Frobenius norm. For G-ADMM-S, we consider two cases of the matrices ():
Case 1: with
Case 2: with
In order to investigate the stability and efficiency of our algorithms, we test 16 groups of problems with random data. Some preliminary numerical results are reported in Table 1. Since these are synthetic examples with random data, for each scenario, we test 10 times and report the average performance. Specifically, we report the number of iterations ("Iter.") and the computing time in seconds ("Time") for all the tested methods. The data in Table 1 show that Algorithm G-ADMM-S ( with ) is more efficient than the other tested algorithms on these test problems.
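Although the experiments themselves were run in MATLAB, random test data of the kind described above can be generated along the following lines; this Python sketch is only illustrative (the interval [-1, 1] for the entries, the helper name make_qp_instance, and the omission of the quadratic-objective matrices from [25, 31] are assumptions).

import numpy as np
import scipy.sparse as sp

def make_qp_instance(m, n, l, density=0.1, seed=0):
    rng = np.random.default_rng(seed)
    A = []
    for _ in range(m):
        # sparse constraint matrix: uniform entries (interval [-1, 1] assumed), ~10% density
        dense = rng.uniform(-1.0, 1.0, size=(l, n))
        mask = rng.random((l, n)) < density
        A.append(sp.csr_matrix(dense * mask))
    # plant a known point x_i^* and set b accordingly, so the problem is feasible by construction
    x_star = [rng.uniform(-1.0, 1.0, n) for _ in range(m)]
    b = sum(A[i] @ x_star[i] for i in range(m))
    return A, b, x_star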
5.2. Nonlinear Convex Programming Problem
In this section, we consider the following nonlinear convex programming problem: where and is a positive matrix. In the experiments, we set . It is easy to see that, when the PPSM-C in [25] is applied to (71), the subproblems have no explicit solutions. Therefore, in this section, we only use Algorithm G-ADMM-S to solve (71). In the experiments, the entries of () are uniformly distributed in with density 0.1, and . The remaining data , , and are generated with the corresponding MATLAB function. We set the matrix in a way similar to [25, 31]. In order to guarantee the feasibility of the problem, we set and . Thus, is an optimal solution of (71). The initial iteration points are the zero vectors () and for all tested algorithms. We set a maximal number of 20000 iterations for the proposed algorithm with a modified stopping criterion as follows:
Now, we specify the choices of parameters to implement these algorithms. We set with , the relaxation parameter , , , , and (). We consider two cases of the parameter : Case 1: ; Case 2: .
We test 7 groups of problems with random data. Numerical results are reported in Table 2. For each scenario, we test 5 times and report the average performance. Specifically, we report the number of iterations ("Iter."), the computing time in seconds ("Time"), and the absolute error of the objective function value ("f-error"). The numerical results show that Algorithm G-ADMM-S is effective.
6. Conclusion
In this paper, for the linearly constrained separable convex programming problem, whose objective function is the sum of individual convex blocks with nonoverlapping variables, we have presented a gradient-based ADMM with a substitution for the general case of three or more blocks. We have analysed its convergence and iteration complexity. The preliminary numerical results have shown the efficiency of the proposed algorithm.
Data Availability
No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The first author would like to extend his sincere gratitude to his doctoral supervisor, Prof. Caozong Cheng, for his guidance.