Abstract
A trust-region-based BFGS method is proposed for solving symmetric nonlinear equations. In the proposed algorithm, if the trial step is unsuccessful, a line search technique is used instead of repeatedly solving the subproblem of the normal trust-region method. We establish the global and superlinear convergence of the method under suitable conditions. Numerical results show that the proposed method is competitive with the normal trust-region method.
1. Introduction
Consider the following system of nonlinear equations:
where is continuously differentiable, and the Jacobian of is symmetric for all Let be the norm function defined by Then the system of nonlinear equations (1.1) is equivalent to the following global optimization problem:
Numerical methods for nonlinear equations fall into two main classes: line search methods and trust region methods. For the line search method, the following iterative formula is often used to solve (1.1):
where is the th iteration point, is a steplength, and is search direction. To begin, we briefly review some methods for (1.1) by line search technique. First, we give some techniques for Brown and Saad [1] proposed the following line search method to obtain the stepsize
where Based on this technique, Zhu [2] gave the nonmonotone line search technique:
and and is a nonnegative integer. From these two techniques (1.4) and (1.5), it is easy to see that the Jacobian matrix must be computed at every iteration, which increases the workload, especially for large-scale problems or when this matrix is expensive to calculate. Considering these points, we [3] presented a new backtracking inexact technique to obtain the stepsize :
where and is a solution of the system of linear equations (1.15). We established the global convergence and the superlinear convergence of this method. The numerical results showed that the new line search technique is more effective than the normal methods. Li and Fukushima [4] proposed an approximate monotone line search technique to obtain the stepsize satisfying
where and are positive constants, is the smallest nonnegative integer such that (1.7) holds, and satisfies
Combining the line search (1.7) with one special BFGS update formula, they got some better results (see [4]). Inspired by their idea, Wei [5] and Yuan [6–8] presented several approximate methods. Further work can be found in [9].
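As an illustration of the approximate monotone line search described above, the following Python sketch backtracks on the squared residual norm. It assumes the acceptance test has the form ‖g(x + αd)‖² − ‖g(x)‖² ≤ −σ₁‖αg(x)‖² − σ₂‖αd‖² + ε‖g(x)‖², which is one common way of stating such a condition. The symbol names, the default constants, and the function name `approx_monotone_step` are assumptions made here for illustration and should be checked against [4].

```python
def norm_sq(v):
    """Squared Euclidean norm of a list of floats."""
    return sum(t * t for t in v)


def approx_monotone_step(g, x, d, eps_k, sigma1=1e-4, sigma2=1e-4,
                         r=0.5, max_backtracks=50):
    """Return a steplength alpha = r**i passing the assumed acceptance test.

    g      : callable mapping a list of floats to a list of floats
    x, d   : current point and search direction
    eps_k  : positive tolerance that relaxes strict monotonicity
    """
    base = norm_sq(g(x))
    alpha = 1.0
    for _ in range(max_backtracks):
        trial = [xi + alpha * di for xi, di in zip(x, d)]
        lhs = norm_sq(g(trial)) - base
        rhs = (-sigma1 * alpha ** 2 * base
               - sigma2 * alpha ** 2 * norm_sq(d)
               + eps_k * base)
        if lhs <= rhs:
            return alpha
        alpha *= r          # backtrack
    return alpha            # last trial value if the budget runs out


# Usage on the hypothetical system g(x) = x with a descent direction.
alpha = approx_monotone_step(lambda x: [x[0]], [1.0], [-1.0], eps_k=0.1)
```

Because the term ε‖g(x)‖² is strictly positive away from a solution, the test is eventually satisfied for a small enough α, and no Jacobian evaluation is needed.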
Second, we present some techniques for One of the most effective methods is Newton's method. It normally requires the fewest function evaluations, and it is very good at handling ill-conditioning. However, its efficiency largely depends on the possibility of efficiently solving the linear system that arises when computing the search direction in each iteration:
Moreover, the exact solution of the system (1.9) could be too burdensome, or it is not necessary when is far from a solution [10]. Inexact Newton methods [2, 3, 10] represent the basic approach underlying most of the Newton-type large-scale algorithms. At each iteration, the current estimate of the solution is updated by approximately solving the linear system (1.9) using an iterative algorithm. The inner iteration is typically “truncated” before the solution to the linear system is obtained. Griewank [11] first proposed a Broyden's rank one method for nonlinear equations and obtained its global convergence. At present, many algorithms have been proposed for solving the two problems (1.1) and (1.2) (see [12–22], etc.).
The trust region method is an important and efficient class of methods in nonlinear optimization. This method can be traced back to the works of Levenberg [17] and Marquardt [18] on nonlinear least-squares problems and the work of Goldfeld et al. [23] for unconstrained optimization. Powell [24] was the first to establish the convergence result of the trust region method for unconstrained optimization. Fletcher [25, 26] first proposed trust region algorithms for linearly constrained optimization problems and nonsmooth optimization problems, respectively. This method has been studied by many authors [15, 27–31] and has been applied to equality constrained problems [32–34]. Byrd et al. [35], Fan [36], Powell and Yuan [37], Vardi [38], Yuan [39, 40], Yuan et al. [41], and Zhang and Zhu [42] proposed various trust region algorithms for constrained optimization problems and established their convergence. Fan [36], Yuan [39], and Zhang [43] presented trust region algorithms for nonlinear equations and obtained some results.
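A minimal sketch of the Newton iteration, in which the linear system (1.9) is solved exactly at every step, is given below. The 2-by-2 test system `g`, its Jacobian, and the helper `solve_2x2` are hypothetical examples chosen so that the Jacobian is symmetric; they are not taken from the paper.

```python
def g(x):
    # Hypothetical test system with root (1, 1).
    return [x[0] ** 3 + x[1] - 2.0, x[0] + x[1] ** 3 - 2.0]


def jacobian(x):
    # Symmetric Jacobian of g: the off-diagonal entries are both 1.
    return [[3.0 * x[0] ** 2, 1.0], [1.0, 3.0 * x[1] ** 2]]


def solve_2x2(a, b):
    """Solve a 2-by-2 linear system a * z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ZeroDivisionError("singular Jacobian")
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def newton(x, tol=1e-12, max_iter=50):
    """Plain Newton iteration: solve J(x) d = -g(x), then set x = x + d."""
    for _ in range(max_iter):
        gx = g(x)
        if max(abs(v) for v in gx) <= tol:
            break
        d = solve_2x2(jacobian(x), [-v for v in gx])
        x = [xi + di for xi, di in zip(x, d)]
    return x


root = newton([2.0, 2.0])
```

Each pass requires a fresh Jacobian and a linear solve, which is the cost that inexact Newton and quasi-Newton variants try to reduce.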
The normal trust-region subproblem for nonlinear equations is to find the trial step such that
where is a scalar called the trust region radius. Define the predicted descent of the objective function at the th iteration by
the actual descent of by
and the ratio of actual descent to predicted descent:
For the normal trust region algorithm, if (this case is called a successful iteration), the next iteration is and the algorithm goes to the next step; otherwise, the trust region radius is reduced and the subproblem (1.10) is solved again. Sometimes this must be repeated many times, computing the Jacobian matrix and at every repetition, which obviously increases the running time and workload, especially for large-scale problems. Even worse, the trust region subproblem is not easy to solve for most practical problems (see [36, 39], etc.).
To avoid computing the Jacobian matrix and at every iteration while repeatedly re-solving the trust region subproblem, as traditional algorithms must, in this paper we rewrite the trust-region subproblem as
where matrix is the approximation to the Jacobian matrix of at . Due to the boundedness of the region , (1.14) has a solution regardless of definiteness (see [43]). This implies that it is valid to adopt a BFGS update formula to generate for trust region methods and the BFGS update is presented as follows:
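The accept-or-shrink logic of the normal trust region iteration can be sketched as follows. The threshold `rho` and the factor `shrink` are placeholder values assumed for illustration, not the constants used in the paper.

```python
def trust_region_decision(actual, predicted, radius, rho=0.25, shrink=0.5):
    """Decide whether a trial step is accepted.

    actual    : actual descent of the objective for the trial step
    predicted : predicted descent from the model (must be positive)
    Returns (accepted, new_radius). On rejection the radius is reduced,
    and the caller has to solve the subproblem again.
    """
    if predicted <= 0.0:
        raise ValueError("predicted descent must be positive")
    ratio = actual / predicted
    if ratio >= rho:
        return True, radius            # successful iteration
    return False, shrink * radius      # unsuccessful: shrink and re-solve


accepted, new_radius = trust_region_decision(0.1, 1.0, 1.0)
```

Every rejection in this scheme triggers another subproblem solve, which is the repeated work the proposed method replaces with a line search.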
where Define the predicted descent of the objective function at th iteration by
the actual descent of by
and the ratio of actual descent to predicted descent:
If (called a successful iteration), the next iteration is Otherwise, we use a search technique to obtain the steplength and let the next iteration be Motivated by the idea of the paper [4], we propose the following line search technique to obtain :
where , , and are some positive constants. In Section 3, we will show that (1.19) is well-defined. Here and throughout this paper, denotes the Euclidean norm of vectors or its induced matrix norm. is replaced by
In the next section, the proposed algorithm for solving (1.1) is given. The global and superlinear convergence of the presented algorithm are stated in Section 3 and Section 4, respectively. The numerical results of the method are reported in Section 5.
2. Algorithms
Algorithm 2.1.
Initial: choose , , , , . Let ;
Step 1: Let ;
Step 2: If stop. Otherwise go to Step 3;
Step 3: Solve the subproblem (1.14) with to get ;
Step 4: If Go to Step 5; Otherwise Let and go to Step 6;
Step 5: Let be the smallest nonnegative integer such that (1.19) holds for . Let and , ;
Step 6: Update to get by (1.15). Let . Go to Step 2.
Here we also give a normal trust-region method for (1.1) and call it Algorithm 2.2.
Algorithm 2.2 (the normal Trust-Region Algorithm [44]).
Initial: Given a starting point is the initial trust region radius, an upper bound of trust region radius Set .
Step 1: If stop. Otherwise, go to Step 2.
Step 2: Solve the trust-region subproblem (1.10) to obtain .
Step 3: Let if set If and let Otherwise, let
Step 4: If let and go to Step 5; otherwise, let go to Step 2.
Step 5: Set Go to Step 1.
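The control flow of Algorithm 2.1 can be sketched in Python as follows. This is an outline under stated assumptions, not the authors' implementation: the subproblem solver, the ratio, the line search, the radius rule, and the update are passed in as callables, and the one-dimensional stand-ins used in the demonstration (including the potential `f` and the simple residual-decrease line search) are hypothetical simplifications of (1.14), (1.18), (1.19), and (1.15).

```python
def hybrid_tr_bfgs(g, x, b, radius_of, solve_subproblem, ratio, line_search,
                   update, rho=0.25, tol=1e-8, max_iter=100):
    """Trust-region trial step with a line-search fallback (outline only)."""
    for k in range(max_iter):
        gx = g(x)
        if max(abs(v) for v in gx) <= tol:            # Step 2: stopping test
            return x, k
        d = solve_subproblem(gx, b, radius_of(gx))    # Steps 1 and 3
        if ratio(x, d, gx, b) >= rho:                 # Step 4: successful step
            alpha = 1.0
        else:                                         # Step 5: search along d
            alpha = line_search(x, d)
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        s = [a - c for a, c in zip(x_new, x)]
        y = [a - c for a, c in zip(g(x_new), gx)]
        b = update(b, s, y)                           # Step 6: update B
        x = x_new
    return x, max_iter


# --- Hypothetical one-dimensional stand-ins for a quick demonstration ---
def g(x):
    return [x[0] ** 3 + x[0] - 2.0]                   # root at x = 1


def f(x):
    # Potential whose derivative is g; used only to measure actual descent.
    return 0.25 * x[0] ** 4 + 0.5 * x[0] ** 2 - 2.0 * x[0]


def radius_of(gx):
    return abs(gx[0])                                 # assumed radius rule


def solve_subproblem(gx, b, radius):
    d = -gx[0] / b                                    # minimizer of g*d + b*d*d/2
    return [max(-radius, min(radius, d))]             # clipped to the region


def ratio(x, d, gx, b):
    predicted = -(gx[0] * d[0] + 0.5 * b * d[0] ** 2)
    actual = f(x) - f([x[0] + d[0]])
    return actual / predicted


def line_search(x, d, r=0.5, max_backtracks=30):
    alpha, base = 1.0, abs(g(x)[0])
    for _ in range(max_backtracks):
        if abs(g([x[0] + alpha * d[0]])[0]) < base:   # residual decreased
            return alpha
        alpha *= r
    return alpha


def update(b, s, y):
    # In one dimension the BFGS formula reduces to the secant slope y/s.
    return y[0] / s[0] if y[0] * s[0] > 0.0 else b


solution, iterations = hybrid_tr_bfgs(g, [3.0], 1.0, radius_of,
                                      solve_subproblem, ratio,
                                      line_search, update)
```

The point of the design is visible in the `else` branch: a rejected trial step is reused as a search direction, so the subproblem is solved only once per iteration.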
Remark 2.3. By we have the following approximate relations: Since satisfies the secant equation and is symmetric, we have approximately This means that approximates along direction
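The secant property used in this remark can be checked numerically. The sketch below implements the standard BFGS formula B⁺ = B − (Bs)(Bs)ᵀ/(sᵀBs) + yyᵀ/(yᵀs) on plain Python lists. The helper names and the rule of skipping the update when the curvature yᵀs is not positive are assumptions made for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def bfgs_update(b, s, y):
    """Return the BFGS update of a symmetric matrix b (list of rows).

    s : step between consecutive iterates
    y : difference of the function values at those iterates
    For symmetric b, the term B s s^T B equals (B s)(B s)^T.
    """
    bs = matvec(b, s)
    sbs = dot(s, bs)
    ys = dot(y, s)
    if ys <= 0.0 or sbs <= 0.0:
        return [row[:] for row in b]      # skip: update would lose definiteness
    n = len(s)
    return [[b[i][j] - bs[i] * bs[j] / sbs + y[i] * y[j] / ys
             for j in range(n)] for i in range(n)]


b_new = bfgs_update([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], [2.0, 1.0])
# The secant equation holds: matvec(b_new, [1.0, 0.0]) reproduces y.
```

After the update, multiplying the new matrix by s returns y, so the new matrix acts like the Jacobian along the most recent step.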
3. The Global Convergence
In this section, we will establish the global convergence of Algorithm 2.1. Let be the level set defined by
which is bounded.
Assumption 1. (A) is continuously differentiable on an open convex set containing .
(B) The Jacobian of is symmetric and bounded on and there exists a positive constant such that
(C) is positive definite on ; that is, there is a constant such that
(D) is differentiable and its gradient satisfies
where is the Lipschitz constant. By Assumptions 1(A) and 1(B), it is not difficult to get the following inequality:
According to Assumptions 1(A) and 1(C), we have
where which means that the update matrix is always positive definite. By (3.5) and (3.6), we have
Lemma 3.1 (see Theorem 2.1 in [45]). Suppose that Assumption 1 holds. Let be updated by BFGS formula (1.15) and let be symmetric and positive definite. For any and satisfy (3.7). Then there exist positive constants and such that, for any positive integer hold for at least value of
Considering the subproblem (1.14), we need the following assumption, which is similar to the one used in [2].
Assumption 2. is a good approximation to that is, and satisfies where is a small quantity, and
Lemma 3.2. Let Assumption 2 hold. Then is a descent direction for at that is,
Proof. Let be the residual associated with so that : So we have Therefore, taking the norm in the right-hand side of the above equality, we have that from Assumption 2 Hence, for the lemma is satisfied.
According to the above lemma, it is easy to deduce that the norm function is decreasing, which means that is true.
Lemma 3.3. Let be generated by Algorithm 2.1 and suppose that Assumption 2 holds. Then . Moreover, converges.
Proof. By Lemma 3.2, we have . Then we conclude from Lemma 3.3 in [46] that converges. Moreover, we have for all This implies that .
Lemma 3.4. Let Assumption 1 hold. Then the following inequalities hold.
Proof. Since the update matrix is positive definite, problem (1.14) has a unique solution , which together with some multiplier satisfies the following equations: From (3.18), we can obtain By (3.19) and (3.8), we get (3.16), which also implies that the inequality (3.17) holds.
The next lemma will show that (1.19) is reasonable, and then Algorithm 2.1 is well defined.
Lemma 3.5. Let Assumptions 1(D) and 2 hold. Then there exists a stepsize such that (1.19) holds in a finite number of backtracking steps.
Proof. From Lemma 3.8 in [1] we have that in a finite number of backtracking steps, must satisfy By (3.12) and (3.14), let and we have where the last inequality follows (3.16) and (3.17). By let then we obtain (1.19). The proof is complete.
Lemma 3.6. Let be generated by the Algorithm 2.1. Suppose that Assumptions 1 and 2 hold. Then one has In particular, one has
Proof. By (3.8) and (3.19), we have From Step 4 of Algorithm 2.1, if is true, we get otherwise, if is true, by Step 5 of Algorithm 2.1, (3.8), and (3.26), we can obtain By Lemma 3.5, we know that (1.19) can be satisfied in a finite number of backtracking steps, which means that there exists a constant satisfying for all By (3.26) and (3.27), we have where According to (3.28), we get and by Lemma 3.3, we know that is convergent. Therefore, we deduce that (3.23) holds. According to (3.23), it is easy to deduce (3.24). The proof is complete.
Lemma 3.7. Suppose that Assumptions 1 and 2 hold. There are positive constants such that for any , if , then the following inequalities hold:
Proof. We will prove this lemma in the following two cases.
Case 1 (). By (3.18), we have and . Together with (3.8) and (3.19), we get Then (3.30) holds with and .
Case 2 (). From (3.19) and (3.8), we have Then, we get By (3.10) and (3.8), it is easy to deduce that So we obtain Using (3.20), we have Therefore, (3.30) holds. The proof is complete.
In the next theorem, we establish the global convergence of Algorithm 2.1.
Theorem 3.8. Let be generated by Algorithm 2.1 and the conditions in Assumptions 1 and 2 hold. Then one has
Proof. By Lemma 3.6, we have Combining (3.8) and (3.36), we get Together with (3.30), we obtain (3.35). The proof is complete.
4. The Superlinear Convergence Analysis
In this section, we will present the superlinear convergence of Algorithm 2.1.
Assumption 3. is Hölder continuous at ; that is, for every in a neighborhood of there are positive constants and such that where stands for the unique solution of (1.1) in
Lemma 4.1. Let be generated by Algorithm 2.1 and the conditions in Assumptions 1 and 2 hold. Then, for any fixed , one has Moreover, one has where .
Proof. Using Assumption 1, we can have the following inequality: By (3.8) and (3.30), we have Together with (3.28), we get and let Suppose that there exists a positive integer as (3.8) holds. Then we obtain where This together with (4.4) shows that holds for all large enough. Therefore, for any we have (4.2). Notice that from (4.2), we can get (4.3).
Lemma 4.2. Let Assumptions 1, 2, and 3 hold. Then, for all sufficiently large, there exists a positive constant such that where .
Proof. From Theorem 3.8 and (4.4), it is not difficult to get Then (4.1) holds for all large enough. Using the mean value theorem, for all sufficiently large, we have where . Therefore, the inequality of (4.8) holds.
Lemma 4.3. Let Assumptions 1, 2, and 3 hold and let {} be generated by Algorithm 2.1. Denote , . Then, for all large , there are positive constants , and such that where , is the Frobenius norm of a matrix and is defined as follows: In particular, and are bounded.
Proof. From (1.15), we have where the last inequality follows the inequality (49) of [47]. Hence, (4.10) holds. By (4.8), in a way similar to that of [46], we can prove that (4.11) holds and and are bounded. The proof is complete.
Lemma 4.4. Let be generated by Algorithm 2.1 and the conditions in Assumptions 1, 2 and 3 hold. Then where
Proof. In a similar way to [46], it is not difficult to obtain On the other hand, we have where the last inequality follows from (4.8). We know that and are bounded, and is positive definite. By (3.5), we get Combining (4.15) and (4.17), we conclude that (4.14) holds. The proof is complete.
Theorem 4.5. Let the conditions in Assumptions 1, 2 and 3 hold. If in (3.10), then the sequence generated by Algorithm 2.1 converges to superlinearly for .
Proof. For all we get where the last inequality follows (3.10). By (3.5), we have Dividing both sides by we get Substituting this into (4.18), we can obtain which means that Since and as by (4.14) and (3.10), we have Using (3.16), we get Considering (4.4), we have Therefore, we get the result of the superlinear convergence.
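Superlinear convergence means that the ratio of successive errors tends to zero. As a small worked illustration, using a made-up error sequence that is not data from the paper, the ratios can be computed as follows.

```python
def error_ratios(errors):
    """Ratios e[k+1] / e[k] for a sequence of positive error norms."""
    return [errors[k + 1] / errors[k]
            for k in range(len(errors) - 1) if errors[k] > 0.0]


# Hypothetical error norms of successive iterates.
ratios = error_ratios([1.0, 0.5, 0.05, 0.0005])
# The ratios shrink toward zero, which is the signature of superlinear
# convergence; a linearly convergent sequence would keep a roughly
# constant ratio instead.
```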
5. Numerical Results
In this section, we test the proposed BFGS trust-region method on symmetric nonlinear equations and compare it with Algorithm 2.2. The following problems with various sizes will be solved.
Problem 1. The discretized two-point boundary value problem like the problem in [48] is where is the tridiagonal matrix given by and with
Problem 2. Unconstrained optimization problem is with Engval function [49] defined by The related symmetric nonlinear equation is where with
In the experiments, the parameters in Algorithm 2.1 were chosen as , , , and We obtain from subproblem (1.14) by the well-known Dogleg method. The parameters in Algorithm 2.2 were chosen as and Since the matrices will be singular, we solve (1.10) by to obtain The program was coded in We stopped the iteration when the condition was satisfied. If the number of iterations exceeds one thousand, we also stop the program and the method is considered to have failed. For Algorithm 2.1, Tables 1(a) and 1(b) and Tables 2(a) and 2(b) show the performance of the method on Problem 1 and Problem 2, respectively. For Algorithm 2.2, Tables 1(c) and 1(d) and Tables 2(c) and 2(d) show the performance of the normal trust region method on Problem 1 and Problem 2, respectively. The columns of the tables have the following meaning:
Dim: the dimension of the problem,
NI: the total number of iterations,
NG: the number of function evaluations,
EG: the norm of the function value at the final iterate.
From Tables 1(a)–2(d), it is not difficult to see that the proposed method performs better than the normal method. Furthermore, the performance of Algorithm 2.1 hardly changes as the dimension increases. Overall, the proposed method is competitive with the normal trust region method.
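For readers unfamiliar with the dogleg method mentioned in the experimental setup, here is a minimal sketch for a model of the form q(d) = gᵀd + ½dᵀBd with B symmetric positive definite and g nonzero. It illustrates the general technique only; the helper names are hypothetical, and the paper's actual implementation is not shown in the text.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    return [dot(row, v) for row in a]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def dogleg(b, g, radius):
    """Approximate minimizer of g^T d + d^T B d / 2 subject to ||d|| <= radius."""
    newton = solve(b, [-gi for gi in g])
    if math.sqrt(dot(newton, newton)) <= radius:
        return newton                                  # full quasi-Newton step
    scale = dot(g, g) / dot(g, matvec(b, g))
    cauchy = [-scale * gi for gi in g]                 # steepest-descent minimizer
    nc = math.sqrt(dot(cauchy, cauchy))
    if nc >= radius:
        return [radius / nc * ci for ci in cauchy]     # truncated gradient step
    # Otherwise walk from the Cauchy point toward the Newton point until
    # the boundary is hit: solve ||cauchy + tau * diff|| = radius for tau.
    diff = [ni - ci for ni, ci in zip(newton, cauchy)]
    qa = dot(diff, diff)
    qb = 2.0 * dot(cauchy, diff)
    qc = dot(cauchy, cauchy) - radius ** 2
    tau = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)
    return [ci + tau * di for ci, di in zip(cauchy, diff)]


step = dogleg([[2.0, 0.0], [0.0, 1.0]], [2.0, 1.0], 1.3)
```

The dogleg step needs only one linear solve per subproblem, which is why it is a common choice when the model matrix is kept positive definite by a BFGS update.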
6. Discussion
We give a trust-region-based BFGS method and establish its convergence results in this paper. The numerical results show that this method is promising. In fact, problem (1.1) can arise from an unconstrained optimization problem or an equality constrained optimization problem (for details see [4]). Some other practical problems, such as the saddle point problem, the discretized two-point boundary value problem, and the discretized elliptic boundary value problem, also take the form of (1.1) with symmetric Jacobian (see, e.g., Chapter 1 in [50]). The presented method can also be extended to solve general nonlinear equations.
Acknowledgments
The authors are very grateful to the anonymous referees and the editors for their valuable suggestions and comments, which improved the paper greatly. This work is supported by China NSF Grants 10761001 and the Scientific Research Foundation of Guangxi University (Grant no. X081082).