Abstract

In this paper, a hybrid conjugate gradient (CG) method, based on the HS method and a modified version of the PRP method, is proposed for solving large-scale unconstrained optimization problems. The CG parameter generated by the method is always nonnegative. Moreover, the search direction possesses the sufficient descent property independently of the line search. Using the standard Wolfe–Powell line search rule to yield the stepsize, the global convergence of the proposed method is established under common assumptions. Finally, numerical results show that the proposed method is promising compared with two existing methods.

1. Introduction

Consider the unconstrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$

where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Throughout, the gradient of $f$ at $x_k$ is denoted by $g_k$, i.e., $g_k = \nabla f(x_k)$. Conjugate gradient (CG) methods are very popular and effective for solving unconstrained optimization problems of form (1), especially in the large-scale case, owing to their simplicity and low memory requirements. These preferred features have greatly promoted their application in various areas such as image deblurring and denoising, neural networks, compressed sensing, and others. We refer interested readers to some recent works [1–3] and the references therein for more details. The numerical results reported in [1] reveal that the CG method has great potential in solving image restoration problems.

Generally, the iterative formula of the CG method for solving problem (1) reads

$$x_{k+1} = x_k + \alpha_k d_k, \qquad (2)$$

where $\alpha_k > 0$ is the stepsize computed by some line search. Here, $d_k$ is the search direction, defined as follows:

$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \qquad (3)$$

where $\beta_k$ is the so-called CG parameter and $g_k$ is the abbreviation of $\nabla f(x_k)$, i.e., $g_k = \nabla f(x_k)$. The two key factors that affect the numerical performance of a CG method are the stepsize and the CG parameter. First, we outline several well-known line search criteria in the literature.

(a) The exact line search rule: calculate a stepsize $\alpha_k$ satisfying
$$f(x_k + \alpha_k d_k) = \min_{\alpha \ge 0} f(x_k + \alpha d_k). \qquad (4)$$

(b) The standard (weak) Wolfe–Powell (WWP) line search rule: calculate a stepsize $\alpha_k$ satisfying
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k \qquad (5)$$
and
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \qquad (6)$$
where $0 < \delta < \sigma < 1$.

(c) The strong Wolfe–Powell (SWP) line search rule: calculate a stepsize $\alpha_k$ satisfying (5) and
$$\left| g(x_k + \alpha_k d_k)^T d_k \right| \le -\sigma g_k^T d_k. \qquad (7)$$
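As an illustration, the WWP conditions (5)–(6) are easy to check numerically. The sketch below is a minimal Python rendering; the test function, its gradient, and the constants `delta` and `sigma` are illustrative choices, not values taken from this paper.

```python
# Check whether a candidate stepsize satisfies the weak Wolfe-Powell
# conditions (5) and (6) along direction d at point x.

def wwp_satisfied(f, grad, x, d, alpha, delta=1e-4, sigma=0.9):
    """Return True if stepsize alpha satisfies (5) and (6)."""
    gd = sum(gi * di for gi, di in zip(grad(x), d))        # g_k^T d_k
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    armijo = f(x_new) <= f(x) + delta * alpha * gd          # condition (5)
    gd_new = sum(gi * di for gi, di in zip(grad(x_new), d))
    curvature = gd_new >= sigma * gd                        # condition (6)
    return armijo and curvature
```

For $f(x) = x^2$ with $x = 1$ and the descent direction $d = -2$, the stepsize $\alpha = 0.5$ lands on the minimizer and satisfies both conditions, whereas a tiny stepsize violates the curvature condition (6).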

On the other hand, different CG methods are determined by different CG parameters. The well-known CG methods include the Fletcher–Reeves (FR) [4], Polak–Ribière–Polyak (PRP) [5, 6], Hestenes–Stiefel (HS) [7], Liu–Storey (LS) [8], conjugate descent (CD) of Fletcher [9], and Dai–Yuan (DY) [10] methods, and their CG parameters are, respectively, given by

$$\beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}},$$
$$\beta_k^{LS} = -\frac{g_k^T y_{k-1}}{g_{k-1}^T d_{k-1}}, \quad \beta_k^{CD} = -\frac{\|g_k\|^2}{g_{k-1}^T d_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}},$$

where $y_{k-1} = g_k - g_{k-1}$ and $\|\cdot\|$ stands for the Euclidean norm. The methods yielded by the above CG parameters are called the classical CG methods, and their convergence analysis and numerical performance have been extensively studied (see, e.g., [4–12]). It has been shown that the above formulas for the CG parameters are equivalent when $f$ is convex quadratic and the stepsize is obtained by the exact line search rule (4). In general, however, numerical performance depends strongly on the choice of $\beta_k$. The FR, CD, and DY methods possess good convergence properties, but their numerical performance is somewhat unsatisfactory for solving general unconstrained nonlinear optimization problems [12–14]. On the contrary, the convergence properties of the PRP, HS, and LS methods are not as strong, but these methods often possess better computational performance [12–14]. Therefore, over the past few decades, plenty of formulas for $\beta_k$ have been designed, based on the above ones, to obtain CG methods that possess both good global convergence properties and promising numerical performance (see [12–16] and the references therein).
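The six classical parameters above can be written out compactly. The sketch below follows the indexing of (3), with $y_{k-1} = g_k - g_{k-1}$; it is an illustrative rendering, not code from the paper's experiments.

```python
# The six classical CG parameters, indexed so that d_k = -g_k + beta_k d_{k-1}.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def classical_betas(g, g_prev, d_prev):
    """Return all six classical beta_k values for the current iterate."""
    y = [a - b for a, b in zip(g, g_prev)]      # y_{k-1} = g_k - g_{k-1}
    return {
        "FR": dot(g, g) / dot(g_prev, g_prev),
        "PRP": dot(g, y) / dot(g_prev, g_prev),
        "HS": dot(g, y) / dot(d_prev, y),
        "LS": -dot(g, y) / dot(g_prev, d_prev),
        "CD": -dot(g, g) / dot(g_prev, d_prev),
        "DY": dot(g, g) / dot(d_prev, y),
    }
```

For orthogonal successive gradients of equal norm (the idealized behavior under exact line search on a quadratic), all six formulas coincide, which illustrates the equivalence mentioned above.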

To our knowledge, the first hybrid CG method in the literature was proposed by Touati-Ahmed and Storey [17] (the TS method), where $\beta_k$ is computed as

$$\beta_k^{TS} = \begin{cases} \beta_k^{PRP}, & 0 \le \beta_k^{PRP} \le \beta_k^{FR}, \\ \beta_k^{FR}, & \text{otherwise}. \end{cases}$$

Apparently, the TS method inherits some good properties of the FR and PRP methods, since $\beta_k^{TS}$ is a hybrid of $\beta_k^{PRP}$ and $\beta_k^{FR}$. Combining the HS and DY methods, Dai and Yuan [18] proposed another hybrid CG method (the hHD method), in which the hybrid CG parameter is obtained by

$$\beta_k^{hHD} = \max\left\{0, \min\left\{\beta_k^{HS}, \beta_k^{DY}\right\}\right\}.$$
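The two hybridization rules above reduce to one-line selections, sketched below under the standard forms of the TS and hHD rules: TS takes $\beta_k^{PRP}$ when it lies in $[0, \beta_k^{FR}]$ and falls back to $\beta_k^{FR}$ otherwise, while hHD projects $\beta_k^{HS}$ onto the interval $[0, \beta_k^{DY}]$.

```python
# Hybrid CG parameter selections: Touati-Ahmed/Storey (TS) and the
# hybrid Dai-Yuan rule (hHD), written as simple scalar functions.

def beta_ts(beta_prp, beta_fr):
    """TS rule: use PRP when it is safely bounded by FR, else FR."""
    return beta_prp if 0.0 <= beta_prp <= beta_fr else beta_fr

def beta_hhd(beta_hs, beta_dy):
    """hHD rule: clip HS into the nonnegative interval [0, DY]."""
    return max(0.0, min(beta_hs, beta_dy))
```

Note that `beta_hhd` is nonnegative by construction, which is exactly the kind of restriction discussed below.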

When the WWP line search rule is used to compute the stepsize, the resulting search direction in [18] is a descent direction, and the global convergence of the hHD method is proved. Moreover, the numerical experiments reported in [18] illustrate that the hHD method is competitive and practicable. For other closely related works, we refer the readers to [18, 19] and the references therein. It is worth noting that the CG parameters defined in [17–19] are restricted to nonnegative values. As explained in [19], this restriction in turn facilitates the global convergence of the algorithm. In recent years, many hybrid CG methods have been proposed on the basis of combinations of several CG parameters (see, e.g., [1, 13, 20–23]). The combination parameter is computed from certain secant equations [13, 20], from the conjugacy condition [21, 22], or by minimizing a least-squares problem involving the unknown search direction and an existing one (see [23] and the references therein).

In 2016, Wei et al. [24] introduced a modified PRP method, usually called the WYL method, where the corresponding parameter is given by

$$\beta_k^{WYL} = \frac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}.$$

Under the assumption that the directions generated by the method of Wei et al. [24] satisfy the so-called sufficient descent condition

$$g_k^T d_k \le -c \|g_k\|^2 \quad \text{for some constant } c > 0,$$

the WYL method is globally convergent under the WWP line search rule and possesses superior numerical performance. Subsequently, Dai and Wen [25] proposed two improved CG methods with the sufficient descent property, whose CG parameters are denoted $\beta_k^{DHS}$ and $\beta_k^{DPRP}$. The search direction yielded by one of these parameters satisfies the sufficient descent condition without depending on any line search, whereas the sufficient descent property associated with the other relies on the WWP line search rule.
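The WYL parameter can be computed directly from two successive gradients. The sketch below uses the form usually attributed to Wei et al.; since the displayed formula was given above in reconstructed form, treat this as an illustrative rendering rather than the paper's own code.

```python
import math

# WYL parameter: beta = g_k^T ( g_k - (||g_k|| / ||g_{k-1}||) g_{k-1} ) / ||g_{k-1}||^2.
# The numerator equals ||g_k||^2 (1 - cos theta_k), where theta_k is the
# angle between g_k and g_{k-1}, so the parameter is always nonnegative.

def beta_wyl(g, g_prev):
    ng = math.sqrt(sum(x * x for x in g))
    ng_prev = math.sqrt(sum(x * x for x in g_prev))
    scaled = [(ng / ng_prev) * x for x in g_prev]
    num = sum(gi * (gi - si) for gi, si in zip(g, scaled))
    return num / ng_prev ** 2
```

When $g_k$ is parallel to $g_{k-1}$ the parameter vanishes, and it is largest when the two gradients point in opposite directions.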

Based on the above observations, it is of interest to design a hybrid CG method whose CG parameter is nonnegative and whose search direction possesses the sufficient descent property independently of the line search technique. Motivated by the methods in [24, 25], and considering that the HS method performs best among the classical CG methods, a new formula for the CG parameter, denoted $\beta_k^{hHPR}$, is given in (14). It is not difficult to see that $\beta_k^{hHPR}$ is a hybrid of the HS parameter and the modified PRP parameters of [24, 25]. Interestingly, this parameter is always nonnegative. To see this, let $\theta_k$ be the angle between $g_k$ and $g_{k-1}$. From (14) one then obtains the estimate (15), which further implies (16) and, in particular, $\beta_k^{hHPR} \ge 0$.

Moreover, plugging the CG parameter $\beta_k^{hHPR}$ into (3), the resulting search direction can be shown to possess the sufficient descent property independently of the line search technique (see Lemma 1 below).

The rest of this paper is organized as follows. In Section 2, our algorithm framework is presented, and the sufficient descent property of the resulting search direction is discussed in detail. Section 3 is devoted to establishing the convergence of the proposed method with the WWP line search rule. In the last section, some preliminary numerical results are reported to verify the efficiency of the presented method.

2. The Algorithm

In this section, we first propose the algorithm framework for solving problem (1), in which we do not specify which line search rule generates the stepsize. Subsequently, we analyze the sufficient descent property for the search direction. By inserting the WWP line search rule into the algorithm framework, our hybrid CG method is proposed.

The following lemma shows that the direction sequence generated by Algorithm 1 possesses the sufficient descent property independent of any line search.

Step 0: (initialization) given an initial point $x_0 \in \mathbb{R}^n$ and a tolerance $\epsilon > 0$, set $d_0 = -g_0$ and $k := 0$.
Step 1: stop if $\|g_k\| \le \epsilon$.
Step 2: compute the stepsize $\alpha_k$ by an appropriate line search rule.
Step 3: generate the new iteration point by $x_{k+1} = x_k + \alpha_k d_k$ and compute $\beta_{k+1}^{hHPR}$ according to (14).
Step 4: compute the next direction $d_{k+1}$ by (3), set $k := k + 1$, and go to Step 1.
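Steps 0–4 can be sketched as a short loop. Since the hybrid parameter of (14) is not reproduced here, the CG rule is left as a pluggable slot `beta_fn`; for the demonstration we minimize a diagonal quadratic $f(x) = \tfrac{1}{2} x^T A x$ with the exact stepsize, under which the classical parameters coincide, and use the FR rule as a stand-in. All names below are illustrative assumptions, not the paper's implementation.

```python
# Skeleton of the CG iteration in Steps 0-4 for f(x) = 0.5 * x^T A x, A diagonal.

def cg_skeleton(A_diag, x0, beta_fn, eps=1e-10, max_iter=100):
    x = list(x0)
    g = [a * xi for a, xi in zip(A_diag, x)]              # gradient of f
    d = [-gi for gi in g]                                  # Step 0: d_0 = -g_0
    for _ in range(max_iter):
        if sum(gi * gi for gi in g) ** 0.5 <= eps:         # Step 1: stop test
            break
        dAd = sum(a * di * di for a, di in zip(A_diag, d))
        alpha = -sum(gi * di for gi, di in zip(g, d)) / dAd  # Step 2: exact stepsize
        x = [xi + alpha * di for xi, di in zip(x, d)]      # Step 3: new iterate
        g_new = [a * xi for a, xi in zip(A_diag, x)]
        beta = beta_fn(g_new, g, d)                        # Step 3: CG parameter
        d = [-gn + beta * di for gn, di in zip(g_new, d)]  # Step 4: direction (3)
        g = g_new
    return x

def beta_fr(g, g_prev, d_prev):
    """FR rule, used here as a stand-in for the slot beta_fn."""
    return sum(a * a for a in g) / sum(b * b for b in g_prev)
```

On a two-dimensional quadratic, this loop terminates (up to rounding) after two iterations, reflecting the finite termination of CG with exact line search.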

Lemma 1. Let $\{d_k\}$ be a sequence generated by Algorithm 1. Then, for some constant $c > 0$, it holds that

$$g_k^T d_k \le -c \|g_k\|^2 \quad \text{for all } k \ge 0. \qquad (17)$$

Proof. When $k = 0$, it follows from the definition of $d_k$ in (3) that $g_0^T d_0 = -\|g_0\|^2$, so relation (17) holds for $k = 0$. Now consider the case $k \ge 1$. If $\beta_k^{hHPR} = 0$, it follows from (3) that $g_k^T d_k = -\|g_k\|^2$. Suppose instead that $\beta_k^{hHPR} > 0$. It then follows from (3), (15), and (16) that (17) holds, which completes the proof.

For convenience, in the following statements, we call the method generated by Algorithm 1 with the WWP line search rule the hHPR CG method.

3. Convergence

In this section, we analyze the convergence of the hHPR CG method. To this end, the following common assumptions are needed.

Assumption 1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded, where $x_0$ is the given initial point. (ii) In some neighborhood $N$ of the level set $\Omega$, the objective function $f$ is continuously differentiable and its gradient is Lipschitz continuous; i.e., there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \le L \|x - y\| \quad \text{for all } x, y \in N.$$

The following lemma, originally introduced in [19], provides a convergence result for PRP-type CG methods.

Lemma 2. Consider the general CG method (2) and (3) with the following three properties: (i) the CG parameter is always nonnegative, i.e., $\beta_k \ge 0$ for all $k$; (ii) the line search satisfies (5) and (6) together with the sufficient descent condition (17); (iii) Property (*) holds. Then,

$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Property 1 (Property (*)). Consider a method of forms (2) and (3), and suppose that

$$0 < \gamma \le \|g_k\| \le \bar{\gamma} \quad \text{for all } k. \qquad (22)$$

We say that the method has Property (*) if there exist constants $b > 1$ and $\lambda > 0$ such that $|\beta_k| \le b$ for all $k$, and if $\|s_{k-1}\| \le \lambda$, where $s_{k-1} = x_k - x_{k-1}$, then $|\beta_k| \le \frac{1}{2b}$.

By (16) and Lemmas 1 and 2, to obtain the global convergence of the hHPR CG method it suffices to prove that our method has Property (*).

Lemma 3. Consider the method of forms (2) and (3) with $\beta_k = \beta_k^{hHPR}$. If Assumption 1 holds, then the method satisfies Property (*).

Proof. Considering the method of forms (2) and (3) and using the constants $\gamma$ and $\bar{\gamma}$ in (22), we obtain from (16) an upper bound $b > 1$ on $\beta_k^{hHPR}$, and we choose $\lambda > 0$ accordingly. If $\|s_{k-1}\| \le \lambda$, we obtain from Assumption 1(ii) and (15) that $\beta_k^{hHPR} \le \frac{1}{2b}$. Therefore, the proof is completed.

With (16) and Lemmas 1–3 at hand, one can establish the global convergence of the hHPR CG method.

Theorem 1. Let $\{x_k\}$ be a sequence generated by the hHPR CG method. If Assumption 1 holds, then $\liminf_{k \to \infty} \|g_k\| = 0$.

4. Numerical Experiments

In this section, we verify the efficiency and robustness of the hHPR CG method (hHPR for short) by solving some classical test problems and comparing it with two well-known CG methods, DHS and DPRP, from [25].

Some of the test problems are from the well-known CUTE library [26], and the others come from [27]; their dimensions range from 2 to 1,000,000. All codes were written in MATLAB R2016a, and the numerical experiments were conducted on a Dell PC with an Intel Core CPU at 3.00 GHz and 16.00 GB of RAM. For all the aforementioned methods, we reset the search direction to $d_k = -g_k$ whenever an ascent direction occurs. For the sake of fairness, all stepsizes are generated by the WWP line search rule following a bisection algorithm proposed in [28], with the same parameters $\delta$ and $\sigma$ for every method. Moreover, we adopt the strategy described in [29] to compute the initial stepsize.
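A bisection-type search for a WWP stepsize can be sketched as follows. This is a generic expansion/bisection scheme for conditions (5)–(6), not necessarily the exact procedure of [28]; the default `delta` and `sigma` are illustrative.

```python
# Find a stepsize satisfying the weak Wolfe-Powell conditions (5)-(6)
# by expanding the bracket while (6) fails and bisecting while (5) fails.

def wwp_bisection(f, grad, x, d, delta=1e-4, sigma=0.9, max_iter=50):
    lo, hi, alpha = 0.0, float("inf"), 1.0
    fx = f(x)
    gd = sum(gi * di for gi, di in zip(grad(x), d))   # g_k^T d_k < 0 expected
    for _ in range(max_iter):
        xn = [xi + alpha * di for xi, di in zip(x, d)]
        if f(xn) > fx + delta * alpha * gd:           # (5) fails: step too long
            hi = alpha
        elif sum(gi * di for gi, di in zip(grad(xn), d)) < sigma * gd:
            lo = alpha                                # (6) fails: step too short
        else:
            return alpha                              # both conditions hold
        alpha = (lo + hi) / 2 if hi < float("inf") else 2 * lo
    return alpha
```

On $f(x) = x^2$ from $x = 1$ with $d = -2$, the unit trial step overshoots, and one bisection lands on $\alpha = 0.5$, which satisfies both conditions.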

For hHPR, the CG parameter is computed by (14); for DHS and DPRP, it is computed by the corresponding formulas in [25]. We record the iteration number (Itr), the CPU time in seconds (Tcpu), and the final value of $\|g_k\|$. The program stops when $\|g_k\|$ falls below a prescribed tolerance or when a maximum iteration count is exceeded; if the latter requirement triggers the stop, we use "-" for the corresponding entries.

The numerical results are listed in Tables 1 and 2, where “TP” denotes the tested problems used in numerical experiments and “Dim” stands for the dimension of the tested problems.

The performance profile introduced in [30] is a very useful tool for measuring the performance of numerical algorithms. Figures 1 and 2 plot the performance profiles of hHPR, DHS, and DPRP in terms of Itr and Tcpu, respectively. On the left side of Figures 1 and 2, the curve of the proposed method is clearly above the other two, which in turn shows that, compared with DHS and DPRP, our proposed method is efficient and encouraging. On the right side of Figures 1 and 2, our proposed method successfully solves about 90% of the test problems and clearly outperforms the other two methods.
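The profiles of Figures 1 and 2 follow the standard construction of [30]: for each solver, $\rho(\tau)$ is the fraction of problems it solves within a factor $\tau$ of the best cost over all solvers. A minimal sketch, with a made-up cost table for illustration and failures recorded as `inf`:

```python
# Performance-profile value rho(tau) for each solver: the fraction of
# problems whose cost is within a factor tau of the per-problem best.

def performance_profile(costs, tau):
    """costs: dict solver -> list of per-problem costs (float('inf') = failure)."""
    n_prob = len(next(iter(costs.values())))
    best = [min(costs[s][p] for s in costs) for p in range(n_prob)]
    return {
        s: sum(1 for p in range(n_prob) if costs[s][p] <= tau * best[p]) / n_prob
        for s in costs
    }
```

At $\tau = 1$, $\rho$ reads off how often each solver is the fastest; as $\tau$ grows, $\rho$ approaches the fraction of problems the solver can solve at all, which is how the roughly 90% success rate on the right side of the figures is read.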

Data Availability

All the datasets used in this paper are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The corresponding author acknowledges the Natural Science Foundation of Guangxi Province (grant no. 2021GXNSFAA075001).