A Novel Value for the Parameter in the Dai-Liao-Type Conjugate Gradient Method

Ivanov, Branislav; Stanimirović, Predrag S.; Shaini, Bilall I.; Ahmad, Hijaz; Wang, Miao-Kun

doi:https://doi.org/10.1155/2021/6693401

Journal of Function Spaces

Special Issue

Approximation Methods: Theory and Applications

Research Article | Open Access

Volume 2021 | Article ID 6693401 | https://doi.org/10.1155/2021/6693401

Branislav Ivanov,¹ Predrag S. Stanimirović,² Bilall I. Shaini,³ Hijaz Ahmad,⁴ and Miao-Kun Wang⁵

Academic Editor: Ioan Rasa

Received 29 Oct 2020

Revised 28 Nov 2020

Accepted 15 Jan 2021

Published 30 Jan 2021

Abstract

A new rule for calculating the parameter $t$ involved in each iteration of the MHSDL (Dai-Liao) conjugate gradient (CG) method is presented. The new value of the parameter yields a more efficient and robust variant of the Dai-Liao algorithm. Under proper conditions, theoretical analysis shows that the proposed method, combined with the backtracking line search, is globally convergent. Numerical experiments are also presented, which confirm the influence of the new value of the parameter on the behavior of the underlying CG optimization method. Numerical comparisons and the analysis of the obtained results using Dolan and Moré’s performance profile show better performance of the novel method with respect to all three analyzed characteristics: number of iterative steps, number of function evaluations, and CPU time.

1. Introduction and Background Results

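The topic of our research is solving the unconstrained nonlinear optimization problem

$$\min f(x), \quad x \in \mathbb{R}^n, \tag{1}$$

where the function $f:\mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and bounded below. Following the standard notation, $g(x) = \nabla f(x)$ denotes the gradient, and $g_k = g(x_k)$, $s_{k-1} = x_k - x_{k-1}$, $y_{k-1} = g_k - g_{k-1}$. Using an extended conjugacy condition

$$d_k^T y_{k-1} = -t\, g_k^T s_{k-1}, \tag{2}$$

Dai and Liao in [1] proposed the conjugate gradient (CG) method

$$x_{k+1} = x_k + \alpha_k d_k, \tag{3}$$

where the step size $\alpha_k$ is a positive parameter, $x_k$ is an already generated point, $x_{k+1}$ is a new iterative point, and $d_k$ is a suitable search direction. The search directions are generated by the conceptual formula

$$d_k = \begin{cases} -g_0, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \geq 1, \end{cases} \tag{4}$$

where the conjugate gradient coefficient $\beta_k$ is defined by

$$\beta_k^{\mathrm{DL}} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - t\, \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}, \tag{5}$$

wherein $t \geq 0$ is a scalar.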
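The direction update of the Dai-Liao method can be illustrated with a minimal Python sketch. It assumes the usual form of the coefficient, $\beta = (g_k^T y_{k-1} - t\, g_k^T s_{k-1}) / (d_{k-1}^T y_{k-1})$ with $y_{k-1} = g_k - g_{k-1}$ and $s_{k-1} = x_k - x_{k-1}$. The function names, the zero-denominator fallback, and the sample data are assumptions made for illustration and do not come from the article.

```python
def dot(u, v):
    """Inner product of two equally long vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def dai_liao_beta(g, g_prev, s_prev, d_prev, t):
    """Dai-Liao coefficient (g^T y - t * g^T s) / (d_prev^T y), y = g - g_prev."""
    y = [a - b for a, b in zip(g, g_prev)]
    denom = dot(d_prev, y)
    if denom == 0.0:
        # Assumption of this sketch: fall back to steepest descent (beta = 0).
        return 0.0
    return (dot(g, y) - t * dot(g, s_prev)) / denom


def dai_liao_direction(g, g_prev, s_prev, d_prev, t):
    """New search direction d = -g + beta * d_prev."""
    beta = dai_liao_beta(g, g_prev, s_prev, d_prev, t)
    return [-gi + beta * di for gi, di in zip(g, d_prev)]


# Illustrative data only; the numbers are not taken from the article.
g, g_prev = [1.0, 2.0], [3.0, -1.0]
s_prev, d_prev, t = [0.5, 0.25], [-3.0, 1.0], 0.1
d = dai_liao_direction(g, g_prev, s_prev, d_prev, t)
y = [a - b for a, b in zip(g, g_prev)]
# Both printed values agree up to rounding: d^T y = -t * g^T s.
print(dot(d, y), -t * dot(g, s_prev))
```

By construction, the resulting direction satisfies the extended conjugacy condition $d_k^T y_{k-1} = -t\, g_k^T s_{k-1}$, which is what the final line checks numerically.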

Some well-known formulas for defining have been created by modifying the conjugate gradient parameter [2–9]. One of them is denoted as and defined in [7] bywhere is a scalar as in (5) and .

The family of CG methods for nonlinear optimization has reached great popularity lately, thanks to the various benefits and advantages it possesses. The most important property is based on computationally efficient iterations arising from a simple CG rule. This property initiates the high efficiency of CG methods with respect to analogous methods for nonlinear optimization. Moreover, global convergence is ensured under suitable conditions. Finally, the application of various CG methods in solving image restoration problems has become an important research topic [10, 11].

Since the parameter $t$ is important for the numerical behavior of Dai-Liao (DL) CG methods [12], one of the most important problems in implementing a DL class CG method is to determine a proper value of $t$ which will give desirable results. Many researchers have invested considerable time and effort in determining the best definition of the nonnegative parameter $t$ in the DL class CG methods. So far, the research on finding an appropriate value of $t$ has evolved in two directions. One group of methods aims at finding an appropriate fixed value for $t$ [1, 2, 6–8], while methods from another group promote appropriate rules for computing values of $t$ in each iteration, which ensure a satisfactory decrease of the objective. In our research, we pay attention to the second research stream: find the parameter $t$ whose values change through iterations so that faster convergence is achieved. The value of the parameter $t$ defined in the $k$th iteration will be denoted by $t_k$.

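In order to complete the presentation, we will restate the main principles proposed so far for computing $t_k$. Hager and Zhang in [13, 14] proposed the DL CG method (5), known as CG-DESCENT, where $t_k$ is defined by

$$t_k = 2\, \frac{\|y_{k-1}\|^2}{s_{k-1}^T y_{k-1}},$$

with $s_{k-1} = x_k - x_{k-1}$ and $y_{k-1} = g_k - g_{k-1}$.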

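Dai and Kou [15] suggested the conjugate gradient coefficient of the form

$$\beta_k^{\mathrm{DK}}(\tau_k) = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - \left( \tau_k + \frac{\|y_{k-1}\|^2}{s_{k-1}^T y_{k-1}} - \frac{s_{k-1}^T y_{k-1}}{\|s_{k-1}\|^2} \right) \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}},$$

where $\tau_k$ is the scaling parameter arising from the self-scaling memoryless BFGS method. Clearly, the Dai and Kou (DK) method is a member of the DL class CG methods, which is determined by

$$t_k = \tau_k + \frac{\|y_{k-1}\|^2}{s_{k-1}^T y_{k-1}} - \frac{s_{k-1}^T y_{k-1}}{\|s_{k-1}\|^2}.$$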

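The results given in [15] confirm that the DK iterations outperform many existing CG methods. Following the development of DL methods, Babaie-Kafaki and Ghanbari [16] defined two new ways to calculate the value of the parameter $t$ in (5), as in the following two formulas:

$$t_k = \frac{s_{k-1}^T y_{k-1}}{\|s_{k-1}\|^2} + \frac{\|y_{k-1}\|}{\|s_{k-1}\|}, \qquad t_k = \frac{\|y_{k-1}\|}{\|s_{k-1}\|}.$$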

Andrei in [17] proposed the new rule for calculating in order to define in (5) and defined a new variant of the DL class CG methods, denoted by DLE, with

Lotfi and Hosseini in [18] proposed the following rule for determining the parameter , using the expressionwhereand , , and are three positive constants.
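Some of the iteration-dependent rules surveyed above can be sketched in Python. The snippet covers only the Hager-Zhang rule and the two Babaie-Kafaki-Ghanbari rules, in the forms in which they are commonly stated: $2\|y\|^2 / (s^T y)$, $s^T y / \|s\|^2 + \|y\| / \|s\|$, and $\|y\| / \|s\|$, where $s$ and $y$ are the differences of successive iterates and gradients. The function names are hypothetical, and the sketch assumes $s \neq 0$ and $s^T y > 0$ without checking them.

```python
import math


def dot(u, v):
    """Inner product of two equally long vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean norm."""
    return math.sqrt(dot(u, u))


def t_hager_zhang(s, y):
    """t = 2 * ||y||^2 / (s^T y), the value implied by CG-DESCENT."""
    return 2.0 * dot(y, y) / dot(s, y)


def t_babaie_kafaki_ghanbari_1(s, y):
    """t = s^T y / ||s||^2 + ||y|| / ||s||."""
    return dot(s, y) / dot(s, s) + norm(y) / norm(s)


def t_babaie_kafaki_ghanbari_2(s, y):
    """t = ||y|| / ||s||."""
    return norm(y) / norm(s)


# Illustrative data only: s = x_k - x_{k-1}, y = g_k - g_{k-1}.
s, y = [1.0, 0.0], [2.0, 0.0]
print(t_hager_zhang(s, y),
      t_babaie_kafaki_ghanbari_1(s, y),
      t_babaie_kafaki_ghanbari_2(s, y))
```

Each function returns a scalar that can be passed as $t$ to a Dai-Liao-type coefficient. All three depend only on the most recent pair $(s, y)$, so they add negligible cost per iteration.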

On the basis of the above overview of the main CG methods and motivated by the strong theoretical properties and computational efficiency of the modified Dai-Liao CG methods proposed by many researchers, we suggest a new way of calculating the value of the parameter $t$. As a consequence, the corresponding CG method of DL type, termed the Effective Dai-Liao (EDL) method, is proposed and its convergence is proven. Numerical testing and comparison with other known DL variants are presented in order to show the effectiveness of the introduced method. Analysis of the generated numerical results shows that the proposed EDL method is efficient compared with other DL-type methods.

The paper is organized as follows. Introduction, motivation, and a brief overview of the preliminary results are given in Section 1. A new rule for calculating the variable parameter $t_k$ is proposed in Section 2. An effective algorithm and the global convergence of the EDL method initiated by $t_k$ are given in the same section. The new EDL method is tested in Section 3 on some unconstrained optimization test problems and compared against some known variants of the DL class methods. Finally, concluding remarks are presented in the last section.

2. A Modified Dai-Liao Method and Its Convergence

The popularity of defining new rules for calculating $t_k$ indicates that such an approach is effective and still insufficiently explored. The idea for defining a new parameter $t_k$ comes from the previously described rules for computing $t_k$, particularly from the paper by Li and Ruan [19] and from the idea which can be found in the paper by Yuan et al. [11]. Further, analyzing the results from [1, 2, 6–8], we conclude that the scalar $t$ was kept at a fixed value in the related numerical experiments. Numerical experience related to a fixed value of $t$ was also reported in [1]. According to this experience, our intention is to define variable values $t_k$ inside a prescribed interval.

To successfully define with values belonging to the interval , let us start from the definition of the quantity which was used in defining the direction in [19]. The parameter was defined by , where

By putting into , the following can be obtained:

Further, with certain modifications and substitutions in the equation defining , as well as using the function , which chooses the maximum between the value of the expression and , we come to a new definition of the parameter . As described in advance imposed desired restrictions, the novel parameter is defined by

It is easy to verify that defined by (16) satisfies

Accordingly, , which was our initial intention. Clearly, greater values of lead to values . Further, since the trend is expectable, we can expect smaller values in late iterations. Therefore, is suitable for defining corresponding conjugate gradient coefficient or and further DL CG iterations (4).

Considering in (6), it is reasonable to propose a novel variant of the Dai-Liao CG parameter which is subject to the following rule during the iterative process:

Before the main algorithm, it is necessary to define the backtracking line search as one of the most popular and practical methods for computing the step length $\alpha_k$ in (3). The procedure for the backtracking line search proposed in [20] starts from the initial value $\alpha = 1$ and generates output values which ensure that the goal function decreases in each iteration. Consequently, it is appropriate to use Algorithm 1, restated from [21], in order to determine the primary step size $\alpha_k$.

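Algorithm 1: The backtracking line search.
Require: Nonlinear objective function $f(x)$, search direction $d_k$, previous point $x_k$, and real quantities $0 < \sigma < 0.5$ and $\beta \in (0, 1)$.
1: $\alpha = 1$.
2: While $f(x_k + \alpha d_k) > f(x_k) + \sigma \alpha\, g_k^T d_k$, do $\alpha := \alpha \beta$.
3: Return $\alpha_k = \alpha$.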
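The backtracking procedure can be sketched in Python as follows. This is a minimal illustration of an Armijo-type backtracking rule: the step starts at 1 and is multiplied by a factor in (0, 1) until the sufficient-decrease test holds. The default values `sigma=1e-4` and `beta=0.8`, the cap `max_steps`, and all names are assumptions of this sketch and are not taken from the article.

```python
def dot(u, v):
    """Inner product of two equally long vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def backtracking(f, x, d, g, sigma=1e-4, beta=0.8, max_steps=60):
    """Shrink alpha from 1 until f(x + alpha d) <= f(x) + sigma * alpha * g^T d.

    sigma, beta and max_steps are illustrative defaults (assumptions).
    The cap max_steps is a safeguard against endless loops when d is not
    a descent direction; it is not part of the textbook procedure.
    """
    fx = f(x)
    slope = dot(g, d)  # directional derivative; negative for descent directions
    alpha = 1.0
    for _ in range(max_steps):
        x_trial = [xi + alpha * di for xi, di in zip(x, d)]
        if f(x_trial) <= fx + sigma * alpha * slope:
            break
        alpha *= beta
    return alpha


# Illustrative use on f(x) = x1^2 + x2^2 with the steepest descent direction.
def f(x):
    return sum(xi * xi for xi in x)


x = [1.0, 1.0]
g = [2.0 * xi for xi in x]
d = [-gi for gi in g]
alpha = backtracking(f, x, d, g)
print(alpha)
```

In this example the full step overshoots to a point with the same function value, so it is rejected, and the first reduced step is accepted.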

Algorithm 2 describes a computational framework for the EDL method.

It is necessary to examine the properties of the EDL method and prove its convergence.

Assumption 1. (1)The level set , defined upon the initial point of the iterative method (3), is bounded.(2)The goal function is continuous and differentiable in a neighborhood of with the Lipschitz continuous gradient . This assumption implies the existence of a positive constant satisfying

Assumption 1 initiates the existence of positive constants and satisfying

The conditions from Assumption 1 are assumed. In view of the uniform convexity of , there is a constant that satisfiesor equivalently,

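Algorithm 2: Effective Dai-Liao (EDL) CG method.
Require: An initial point $x_0$ and two tolerance quantities used in the stopping test.
1: Assign $k = 0$ and $d_0 = -g_0$.
2: If both stopping conditions are satisfied, STOP; else go to Step 3.
3: Calculate $\alpha_k$ using Algorithm 1 (backtracking line search).
4: Compute $x_{k+1} = x_k + \alpha_k d_k$.
5: Calculate $g_{k+1}$, $y_k = g_{k+1} - g_k$, $s_k = x_{k+1} - x_k$.
6: Compute the parameter $t$ by (16).
7: Calculate the conjugate gradient coefficient by (18).
8: Compute the new search direction $d_{k+1}$ by (4).
9: Let $k := k + 1$, and go to Step 2.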
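The overall loop of a Dai-Liao-type method with backtracking can be sketched in Python. The sketch is deliberately not the EDL method itself: the rules (16) and (18) are not reproduced here, so it uses the plain Dai-Liao coefficient with a user-supplied rule `t_rule(s, y)`. The stopping test on the gradient norm alone, the tolerance `eps`, the restart safeguard, and all names are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def backtracking(f, x, d, g, sigma=1e-4, beta=0.8, max_steps=60):
    """Armijo-type backtracking; parameter values are illustrative."""
    fx, slope, alpha = f(x), dot(g, d), 1.0
    for _ in range(max_steps):
        if f([xi + alpha * di for xi, di in zip(x, d)]) <= fx + sigma * alpha * slope:
            break
        alpha *= beta
    return alpha


def dl_type_cg(f, grad, x0, t_rule, eps=1e-6, max_iter=1000):
    """Generic Dai-Liao-type CG loop; returns (point, iterations used)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                      # initial direction -g_0
    for k in range(max_iter):
        if math.sqrt(dot(g, g)) <= eps:        # assumed stopping test
            return x, k
        alpha = backtracking(f, x, d, g)
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        denom = dot(d, y)
        if denom > 0.0:
            t = t_rule(s, y)                   # pluggable rule for t
            beta_k = (dot(g_new, y) - t * dot(g_new, s)) / denom
        else:
            beta_k = 0.0                       # safeguard: steepest descent
        d_new = [-gi + beta_k * di for gi, di in zip(g_new, d)]
        if dot(g_new, d_new) >= 0.0:           # safeguard: restart if not descent
            d_new = [-gi for gi in g_new]
        x, g, d = x_new, g_new, d_new
    return x, max_iter


# Illustrative run with a constant t on f(x) = 0.5 * ||x||^2.
def f(x):
    return 0.5 * sum(xi * xi for xi in x)


def grad(x):
    return list(x)


x_min, iters = dl_type_cg(f, grad, [3.0, -4.0], lambda s, y: 0.1)
print(x_min, iters)
```

Any rule that maps the latest pair $(s, y)$ to a nonnegative scalar can be passed as `t_rule`, which makes it easy to compare fixed and iteration-dependent choices of $t$ within the same loop.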

It follows from (21) and (22) that

By (19) and (23), one concludeswhere the inequality implies .

The inequality (25) initiates

Taking into account and the last inequality, we conclude

Lemma 2. [22, 23]. Let Assumption 1 be accomplished and the points be generated by the method (3)–(4). Then, it holds

Lemma 3. Consider the proposed Dai-Liao CG method, including (3), (4), and (18). If the search procedure guarantees (27), for all , then the next inequality holdsfor some .

Proof. The inequality (29) will be verified by induction. In the initial situation , one obtains . Since , obviously (29) is satisfied in the basic case. Suppose that (29) is valid for some . Taking the inner product of both the left- and right-hand sides in (4) with the vector , the following can be obtained:Using (17) in common with (27) and , we concludeNow from (30), (31), andit follows thatIn view of , the inequality (29) is satisfied for in (33) and arbitrary .

The global convergence of the proposed EDL method is confirmed by Theorem 4.

Theorem 4. Let Assumption 1 be true and be uniformly convex. Then, the sequence generated by (3), (4), and (18) fulfills

Proof. Suppose the opposite, i.e., (34) is not true. This implies the existence of a constant such thatSquaring both sides of (4) impliesTaking into account (18), we can getNow from (31) and (32), it follows thatNow, an application of (18) initiatesUsing and (38) and (39) in (36), we obtainNext, dividing both sides of (40) by and using (35), it can be concluded thatThe inequalities in (41) implyTherefore, causes a contradiction with Lemma 2.

3. Numerical Experiments

The implementation of the EDL method is based on Algorithm 2. This section is intended to analyze and compare the numerical results obtained by the EDL method and four variants of the MHSDL class methods (6). These variants are defined by , , , and and denoted, respectively, as MHSDL3, MHSDL4, MHSDL5, and MHSDL6. The obtained results are not compared with the values and , because in [16], the authors have already shown that and initiate better numerical performances compared to and .

The codes used in the testing experiments for the above methods are written in MATLAB R2017a and executed on the Intel Core i3 2.0 GHz workstation with the Windows 10 operating system. Three important criteria are analyzed in each individual test case: number of iterations (NI), number of function evaluations (NFE), and processor time (CPU).

The numerical experiment is performed using 28 test functions presented in [24], where many of the problems are taken from the CUTEr collection [25]. All methods used in the testing of an arbitrary objective function start from the same initial point. Each function is tested 10 times with gradually increasing dimensions, the largest of which are 500, 1000, 3000, 5000, 7000, 8000, 10000, 15000, and 20000.

The uniform terminating criteria for each of the five considered algorithms (EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6) arewhere and . The backtracking line search is based on the parameters and for all five algorithms. Specific parameters used only in the MHSDL6 method are defined as , , and .

Summary numerical results for the EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 methods, executed on 28 test functions, are arranged in Tables 1–3, which cover the three criteria (NI, NFE, and CPU).

We utilized the performance profile given in [26] to compare numerical results for three criteria (NI, NFE, and CPU) generated by five methods (EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6). The upper curve of the selected performance profile corresponds to the method that shows the best performance.
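The performance profile of [26] can be computed with a few lines of Python. For each problem, the cost of every solver is divided by the best cost on that problem, and the profile value at $\tau$ is the fraction of problems whose ratio does not exceed $\tau$. The data layout (a dictionary mapping a solver name to a list of costs), the solver names, and the numbers below are hypothetical. The sketch assumes strictly positive costs and at least one successful solver per problem.

```python
def performance_profile(costs, taus):
    """Dolan-More performance profile.

    costs: dict mapping solver name -> list of positive costs, one per
           problem (use float('inf') to mark a failure).
    taus:  list of ratio thresholds (each >= 1).
    Returns a dict mapping solver name -> list of fractions, one per tau.
    """
    solvers = list(costs)
    n_problems = len(costs[solvers[0]])
    # Best (smallest) cost achieved on each problem by any solver.
    best = [min(costs[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [costs[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [sum(r <= tau for r in ratios) / n_problems for tau in taus]
    return profile


# Hypothetical iteration counts of two solvers on four problems.
costs = {"A": [10, 20, 30, 40], "B": [20, 10, 30, 80]}
profile = performance_profile(costs, taus=[1, 2])
print(profile)  # {'A': [0.75, 1.0], 'B': [0.5, 1.0]}
```

The value at $\tau = 1$ is the share of problems on which a solver is the best (ties count for every tied solver), which is the kind of percentage reported when methods are compared by the number of wins.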

Figures 1–3 plot the performance profiles for the numerical values included in Tables 1–3, respectively. Figure 1 presents the performance profiles of the NI criterion generated by the EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 methods. In this figure, it is noticeable that EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 methods solved all tested functions, wherein the EDL method shows the best performances in 57.14% of test functions compared with MHSDL3 (25.00%), MHSDL4 (0.00%), MHSDL5 (0.00%), and MHSDL6 (17.86%). From Figure 1, it is observable that the graph of the EDL method comes first to the top, which means that the EDL outperforms other considered methods with respect to the NI.

Figure 2 presents the performance profiles of the NFE of the EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 methods. It is observable that EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 generated solutions to all tested cases, and the EDL method is the best in 67.86% of the functions compared with MHSDL3 (17.86%), MHSDL4 (0.00%), MHSDL5 (0.00%), and MHSDL6 (14.28%). From Figure 2, it is observed that the EDL graph first comes to the top, which confirms that the EDL is the winner with respect to the NFE.

Figure 3 contains graphs of the performance profiles corresponding to the CPU time of the EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 methods. It is obvious that EDL, MHSDL3, MHSDL4, MHSDL5, and MHSDL6 solved all tested functions. Further analysis gives that the EDL method is the winner in 67.86% of the test cases compared with MHSDL3 (17.86%), MHSDL4 (0.00%), MHSDL5 (0.00%), and MHSDL6 (14.28%). Figure 3 demonstrates that the graph of the EDL method first comes to level 1, which indicates its superiority with respect to the CPU time.

From the previous analysis of the results shown in Tables 1–3 and Figures 1–3, it can be concluded that the EDL method produces the best results in terms of all three basic metrics: NI, NFE, and CPU.

4. Conclusion

A novel rule which determines the value of the parameter $t$ in each iteration of the Dai-Liao-type CG method is presented. Using the proposed expression for $t$ in (6), a novel variant of the Dai-Liao CG parameter is defined and a novel Effective Dai-Liao (EDL) conjugate gradient method is proposed. The convergence of the EDL method is investigated, and the global convergence on a class of uniformly convex functions is established. By numerical testing, we have shown that the size of the scalar $t$ has a significant influence on the convergence speed of the EDL method. Numerical comparisons on large-scale unconstrained optimization test functions of different structures and complexities confirm the computational efficiency of the EDL algorithm and its superiority over the previously known DL CG variants, such as MHSDL3, MHSDL4, MHSDL5, and MHSDL6. During the testing, we tracked the number of iterations (NI), the number of function evaluations (NFE), and the elapsed processor time (CPU) for each function and each method. Analysis of the obtained performance profiles introduced by Dolan and Moré revealed that the EDL method is the most efficient.

We are convinced that the obtained results will motivate further research in defining new values of the parameter $t$ in the Dai-Liao CG methods. Future work would include finding more efficient rules to calculate the parameter $t$ during the iterative process. We hope that our proposal of the new expression for defining the parameter $t$ will initiate further research in that direction. Finding novel approaches for defining different values of $t$ and the conjugate gradient parameter $\beta_k$ remains an open topic for scientific research, and our approach is only one possible direction in this research.

Data Availability

Data will be provided on request to the first author.

Conflicts of Interest

The authors declare no conflict of interest.

Acknowledgments

The research was supported by the National Natural Science Foundation of China (Grant Nos. 11971142, 11871202, 61673169, 11701176, 11626101, and 11601485).

References

[1] Y.-H. Dai and L.-Z. Liao, “New conjugacy conditions and related nonlinear conjugate gradient methods,” Applied Mathematics and Optimization, vol. 43, no. 1, pp. 87–101, 2001.
[2] Y. Cheng, Q. Mou, X. Pan, and S. Yao, “A sufficient descent conjugate gradient method and its global convergence,” Optimization Methods and Software, vol. 31, no. 3, pp. 577–590, 2016.
[3] I. E. Livieris and P. Pintelas, “A descent Dai-Liao conjugate gradient method based on a modified secant equation and its global convergence,” ISRN Computational Mathematics, vol. 2012, Article ID 435495, 8 pages, 2012.
[4] M. R. Peyghami, H. Ahmadzadeh, and A. Fazli, “A new class of efficient and globally convergent conjugate gradient methods in the Dai-Liao family,” Optimization Methods and Software, vol. 30, no. 4, pp. 843–863, 2015.
[5] H. Yabe and M. Takano, “Global convergence properties of nonlinear conjugate gradient methods with modified secant condition,” Computational Optimization and Applications, vol. 28, no. 2, pp. 203–225, 2004.
[6] S. Yao and B. Qin, “A hybrid of DL and WYL nonlinear conjugate gradient methods,” Abstract and Applied Analysis, vol. 2014, Article ID 279891, 9 pages, 2014.
[7] S. Yao, X. Lu, and Z. Wei, “A conjugate gradient method with global convergence for large-scale unconstrained optimization problems,” Journal of Applied Mathematics, vol. 2013, Article ID 730454, 9 pages, 2013.
[8] Y. Zheng and B. Zheng, “Two new Dai-Liao-type conjugate gradient methods for unconstrained optimization problems,” Journal of Optimization Theory and Applications, vol. 175, no. 2, pp. 502–509, 2017.
[9] W. Zhou and L. Zhang, “A nonlinear conjugate gradient method based on the MBFGS secant condition,” Optimization Methods and Software, vol. 21, no. 5, pp. 707–714, 2006.
[10] W. Hu, J. Wu, and G. Yuan, “Some modified Hestenes-Stiefel conjugate gradient algorithms with application in image restoration,” Applied Numerical Mathematics, vol. 158, pp. 360–376, 2020.
[11] G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems,” Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020.
[12] N. Andrei, “Open problems in nonlinear conjugate gradient algorithms for unconstrained optimization,” Bulletin of the Malaysian Mathematical Sciences Society, vol. 34, no. 2, pp. 319–330, 2011.
[13] W. W. Hager and H. Zhang, “A new conjugate gradient method with guaranteed descent and an efficient line search,” SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[14] W. W. Hager and H. Zhang, “Algorithm 851,” ACM Transactions on Mathematical Software, vol. 32, no. 1, pp. 113–137, 2006.
[15] Y.-H. Dai and C.-X. Kou, “A nonlinear conjugate gradient algorithm with an optimal property and an improved Wolfe line search,” SIAM Journal on Optimization, vol. 23, no. 1, pp. 296–320, 2013.
[16] S. Babaie-Kafaki and R. Ghanbari, “The Dai-Liao nonlinear conjugate gradient method with optimal parameter choices,” European Journal of Operational Research, vol. 234, no. 3, pp. 625–630, 2014.
[17] N. Andrei, “A Dai-Liao conjugate gradient algorithm with clustering of eigenvalues,” Numerical Algorithms, vol. 77, no. 4, pp. 1273–1282, 2018.
[18] M. Lotfi and S. M. Hosseini, “An efficient Dai-Liao type conjugate gradient method by reformulating the CG parameter in the search direction equation,” Journal of Computational and Applied Mathematics, vol. 371, article 112708, 2020.
[19] X. Li and Q. Ruan, “A modified PRP conjugate gradient algorithm with trust region for optimization problems,” Numerical Functional Analysis and Optimization, vol. 32, no. 5, pp. 496–506, 2011.
[20] N. Andrei, “An acceleration of gradient descent algorithm with backtracking for unconstrained optimization,” Numerical Algorithms, vol. 42, no. 1, pp. 63–73, 2006.
[21] P. S. Stanimirović and M. B. Miladinović, “Accelerated gradient descent methods with line search,” Numerical Algorithms, vol. 54, no. 4, pp. 503–520, 2010.
[22] W. Cheng, “A two-term PRP-based descent method,” Numerical Functional Analysis and Optimization, vol. 28, no. 11–12, pp. 1217–1230, 2007.
[23] G. Zoutendijk, “Nonlinear programming, computational methods,” in Integer and Nonlinear Programming, J. Abadie, Ed., pp. 37–86, North-Holland, Amsterdam, 1970.
[24] N. Andrei, “An unconstrained optimization test functions collection,” Advanced Modeling and Optimization, vol. 10, no. 1, pp. 147–161, 2008.
[25] I. Bongartz, A. R. Conn, N. Gould, and P. L. Toint, “CUTE: constrained and unconstrained testing environments,” ACM Transactions on Mathematical Software, vol. 21, no. 1, pp. 123–160, 1995.
[26] E. D. Dolan and J. J. Moré, “Benchmarking optimization software with performance profiles,” Mathematical Programming, vol. 91, no. 2, pp. 201–213, 2002.

Copyright

Copyright © 2021 Branislav Ivanov et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
