A New Spectral Three-Term Conjugate Gradient Method for Unconstrained Optimization and Its Application
Abstract:
In this paper, we derive a new conjugate gradient (CG) direction with random parameters, which are obtained by minimizing the deviation between the search direction matrix and the self-scaled memoryless Broyden-Fletcher-Goldfarb-Shanno (BFGS) update. We propose a new spectral three-term CG algorithm and establish the global convergence of the new method for uniformly convex functions and general nonlinear functions, respectively. Numerical experiments show that our method performs well on nonconvex functions and supply chain problems.
1. Introduction
Many problems of relevance in compressed sensing [1], portfolio selection [2] and image restoration [3], [4] can be transformed into the following unconstrained optimization problem
where,
$f:~\mathbb{R}^n \rightarrow \mathbb{R}$ is continuously differentiable. The spectral conjugate gradient (SCG) method [5] is one of the most popular classes of algorithms for solving problem Eq. (1), due to its simplicity, low memory requirement and strong convergence performance. Starting from an initial point $x_0\in \mathbb{R}^n$, the iterative formula of the SCG method is given by
where, the stepsize $\alpha_k>0$ is generated by a line search. The search direction $d_k$ is defined by
where,
$g_k=\nabla f(x_k)$ is the gradient of $f(x)$ at current point $x_k$, $\theta_k$ is a spectral parameter and $\beta_{k-1}$ is a scalar called the conjugate gradient (CG) parameter. There are many choices of $\theta_k$ and $\beta_{k-1}$, which may lead to different computational efficiency and convergence properties [5], [6], [7], [8], [9], [10].
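As a concrete illustration of the iteration Eqs. (2)-(3), the following minimal Python sketch runs a generic SCG method on a two-dimensional convex quadratic. The Fletcher-Reeves choice of $\beta_{k-1}$, the trivial spectral parameter $\theta_k=1$ and the exact line search are placeholder choices for illustration only, not the method proposed in this paper.

```python
import numpy as np

def scg_quadratic(A, b, x0, tol=1e-10, max_iter=100):
    """Generic SCG sketch for f(x) = 0.5 x'Ax - b'x with A symmetric
    positive definite: d_k = -theta_k g_k + beta_{k-1} d_{k-1}."""
    x = np.asarray(x0, dtype=float)
    g = A @ x - b                            # gradient of the quadratic
    d = -g                                   # initial steepest-descent direction
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = -(g @ d) / (d @ (A @ d))     # exact line search on a quadratic
        x = x + alpha * d
        g_new = A @ x - b
        beta = (g_new @ g_new) / (g @ g)     # Fletcher-Reeves CG parameter
        theta = 1.0                          # placeholder spectral parameter
        d = -theta * g_new + beta * d
        g = g_new
    return x, k

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star, iters = scg_quadratic(A, b, np.zeros(2))
print(np.linalg.norm(A @ x_star - b))        # gradient norm at the computed point
```

With an exact line search on a quadratic, this iteration reduces to linear CG and terminates here in at most two steps.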
Babaie-Kafaki and Ghanbari [11] proposed a descent family of the Dai-Liao CG (DDL) method with the search direction
where,
$Q_{k+1}$ is called a search direction matrix. They obtained the relationship
where,
By requiring the smallest eigenvalue of $A_{k+1}$ to have a positive lower bound, they determined the parameter $t_k$, so that the search direction $d_{k+1}$ was a descent direction.
Livieris et al. [12] proposed a descent hybrid CG method with the search direction
They obtained the hybridization parameter $\lambda_k$ by minimizing the distance between $P_{k+1}$ and the self-scaled memoryless Broyden-Fletcher-Goldfarb-Shanno (BFGS) update in the Frobenius norm.
Recently, spectral three-term CG methods have received much attention [13], [14], [15], [16], [17]. Chen and Yang [14] presented a subspace-based three-term CG algorithm for large-scale unconstrained optimization. Faramarzi and Amini [15] proposed a spectral three-term Hestenes-Stiefel CG method. Al-Bayati and Abbas [16] gave a robust spectral three-term CG algorithm for solving unconstrained minimization problems. Eslahchi and Bojari [17] proposed a new sufficient descent spectral three-term CG class for large-scale optimization.
Motivated by the above works, and in order to make better use of the properties of the spectral parameter, we consider the search direction matrix
to propose a new spectral three-term CG method, where the scaling parameter $\theta_k$ and the parameter $t_k$ are to be determined. The corresponding search direction is
where,
\begin{eqnarray}\nonumber&&\beta_k=\frac{1}{2}\frac{y_k^{\rm T}g_{k+1}}{s_k^{\rm T}y_k}-t_k\frac{s_k^{\rm T}g_{k+1}}{s_k^{\rm T}y_k} ~\mbox{and}~ \gamma_k=\frac{1}{2}\frac{s_k^{\rm T}g_{k+1}}{s_k^{\rm T}y_k}.\nonumber\end{eqnarray}
Based on the idea of the study [12], our main work is to give a new choice of the parameters $t_k$ and $\theta_k$ and thereby propose a spectral three-term CG method. The contributions of this article are listed as follows:
$\diamondsuit$ A random parameter, which makes our method more flexible and adaptive, is introduced to construct $\beta_k$ and $\theta_k$ in the search direction.
$\diamondsuit$ The search direction satisfies the sufficient descent condition. Under appropriate assumptions, we establish the global convergence of the new method for general functions.
$\diamondsuit$ The new method has good numerical performance for objective functions with sharp curvature changes.
$\diamondsuit$ The new method is applied to the supply chain problem, which shows that the new method is effective.
The motivation and algorithm are analyzed in the next section: we determine the parameter $t_k$, which includes a random parameter at each iteration, and obtain the parameters $\beta_k$ and $\theta_k$; we then state a new spectral three-term CG method. In Section 3, the global convergence of our method is proved for uniformly convex functions and general nonlinear functions. In Section 4, some numerical experiments and application results are reported. Conclusions are drawn in the last section.
2. Motivation and Algorithm
In this section, our main aim is to discuss how to choose $t_k$ and to propose a random spectral three-term CG method. The search direction is derived by minimizing the deviation between the search direction matrix and a quasi-Newton update, in conjunction with a random choice of parameter. Consider the model
where,
$\|\cdot\|_F$ denotes the Frobenius norm, $A_{k+1}$ is determined by Eq. (8), $B_{k+1}^{-1}$ is a self-scaled memoryless BFGS matrix
and $\theta_k$ is a scaling parameter. From Eq. (8) and Eq. (11), we have
\begin{eqnarray}\nonumber\|D_{k+1}\|_F^2&=&{\rm tr}(D_{k+1}^{\rm T}D_{k+1}) \nonumber\\&=& \frac{\|s_k\|^4}{(s_k^{\rm T}y_k)^2}t^{2}+ 2\left[(2\theta_k-1)\frac{\|s_k\|^2}{s_k^{\rm T}y_k}-\frac{\|s_k\|^4}{(s_k^{\rm T}y_k)^2}-\theta_k\frac{\|y_k\|^2\|s_k\|^4}{(s_k^{\rm T}y_k)^3}\right]t+\zeta,\nonumber\end{eqnarray}
where,
$\zeta$ is a constant independent of $t$. This is a second-degree polynomial in the variable $t$, and the coefficient of $t^2$ is positive. Therefore, the minimizer of problem Eq. (10) is
where,
$p_k=\cos^2 \langle s_k, y_k\rangle$ and $\chi_k=\frac{\|y_k\|}{\|s_k\|}$. Instead of using the mean value of $p_k$ as in the study [18], we set $p_k$ to be a random number in the interval $[\underline{m}, ~ \overline{m}]$, where $0< \underline{m} < \overline{m}<\frac{1}{2}.$ There are many possible ways to choose $\theta_k$; we prefer to use
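In fact, the minimizer can be written out explicitly. Since the quadratic above has positive leading coefficient $\frac{\|s_k\|^4}{(s_k^{\rm T}y_k)^2}$, setting its derivative with respect to $t$ to zero gives

\begin{eqnarray}\nonumber t_k&=&1+\theta_k\frac{\|y_k\|^2}{s_k^{\rm T}y_k}-(2\theta_k-1)\frac{s_k^{\rm T}y_k}{\|s_k\|^2}\nonumber\\&=&1+\frac{\theta_k\chi_k}{\sqrt{p_k}}+(1-2\theta_k)\sqrt{p_k}\chi_k,\nonumber\end{eqnarray}

where the second equality uses $s_k^{\rm T}y_k=\sqrt{p_k}\|s_k\|\|y_k\|$; the same expression reappears as the bracketed coefficient in the proof of Lemma 2.1.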
Thus, $t_k$ in Eq. (12) can be regarded as a random parameter. Substituting Eq. (12) into Eq. (9), we obtain the following new search direction
Now, we state a description of the random spectral three-term conjugate gradient algorithm (RSTTCG) as follows.
RSTTCG Algorithm
Step 0. Given $x_0\in\mathbb{R}^n$, $\varepsilon>0$, $0< \underline{m} < \overline{m}<\frac{1}{2} $ and $0<\rho<\sigma<1$. Let $f_0=f(x_0)$, $g_0=\nabla f(x_0)$, $d_0:=-g_0$ and $k:=0$.
Step 1. If $\|g_k\|\leq\varepsilon$, stop and output $x_k$.
Step 2. Compute $\alpha_k$ satisfying the strong Wolfe line search conditions
Step 3. Set $x_{k+1}=x_k+\alpha_kd_k$. Calculate $f_{k+1}$, $g_{k+1}$, $s_k$ and $y_k$.
Step 4. Compute $t_k$ by Eq. (12) and the search direction $d_{k+1}$ by Eq. (13) and Eq. (14). Set $k:=k+1$ and go to Step 1.
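A runnable sketch of the RSTTCG loop is given below in Python (the paper's experiments use MATLAB). The closed form of $t_k$ is reconstructed from the quadratic in $t$ in Section 2 and the coefficient appearing in the proof of Lemma 2.1; $\theta_k$ uses the first of the two choices listed in Section 4; the bisection-style strong Wolfe search and the restart safeguards are this sketch's practical additions, not part of the algorithm's formal statement.

```python
import numpy as np

rng = np.random.default_rng(0)

def strong_wolfe(f, grad, x, d, rho=0.1, sigma=0.6, max_bisect=60):
    """Bisection-style search for a step satisfying the strong Wolfe
    conditions Eqs. (15)-(16); a textbook sketch, not the paper's code."""
    fx, gd = f(x), grad(x) @ d
    lo, hi, alpha = 0.0, np.inf, 1.0
    for _ in range(max_bisect):
        x_new = x + alpha * d
        gn = grad(x_new) @ d
        if f(x_new) > fx + rho * alpha * gd:   # Armijo condition violated
            hi = alpha
        elif abs(gn) > sigma * abs(gd):        # curvature condition violated
            if gn < 0:
                lo = alpha
            else:
                hi = alpha
        else:
            return alpha
        alpha = 2.0 * alpha if np.isinf(hi) else 0.5 * (lo + hi)
    return alpha                               # best effort after max_bisect

def rsttcg(f, grad, x0, m_lo=0.05, m_hi=0.45, eps=1e-8, max_iter=10000):
    base = (1 - m_lo) / (2 * (1 - m_hi))       # lower bound on theta_k
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g) <= eps:
            break
        alpha = strong_wolfe(f, grad, x, d)
        x_new = x + alpha * d
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-14:
            p = rng.uniform(m_lo, m_hi)        # random parameter p_k
            chi = np.linalg.norm(y) / np.linalg.norm(s)
            theta = max(base, (s @ s) / sy)    # the "RSTTCG1" choice
            t = 1 + theta * chi / np.sqrt(p) + (1 - 2 * theta) * np.sqrt(p) * chi
            beta = 0.5 * (y @ g_new) / sy - t * (s @ g_new) / sy
            gamma = 0.5 * (s @ g_new) / sy
            d = -theta * g_new + beta * s + gamma * y
        else:
            d = -g_new                         # safeguard restart
        if g_new @ d > -1e-10 * np.linalg.norm(g_new) * np.linalg.norm(d):
            d = -g_new                         # safeguard: enforce descent
        x, g = x_new, g_new
    return x, k

# sanity run on a small strictly convex quadratic
A = np.diag([1.0, 2.0, 3.0, 4.0])
b_vec = np.ones(4)
f = lambda x: 0.5 * x @ A @ x - b_vec @ x
grad = lambda x: A @ x - b_vec
x_star, iters = rsttcg(f, grad, np.zeros(4))
print(np.linalg.norm(grad(x_star)))
```

The defaults $\rho=0.1$, $\sigma=0.6$, $\underline{m}=0.05$, $\overline{m}=0.45$ match the parameter settings reported in Section 4.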
The following lemma shows the sufficient descent property of search direction.
Lemma 2.1 Let the sequence $\{d_{k+1}\}$ be generated by RSTTCG algorithm. There exists a positive constant $c$ satisfying
Proof By Eq. (16), we have $s_k^{\rm T}y_k=s_k^{\rm T}(g_{k+1}-g_{k})\geq (\sigma-1)s_k^{\rm T}g_{k}>0$. Since $s_k^{\rm T}y_k=\|s_k\|\|y_k\|\cos\langle s_k, y_k\rangle$, we have $\cos\langle s_k, y_k\rangle=\sqrt{p_k}>0$. Combining this with Eq. (14), we get
\begin{eqnarray}g_{k+1}^{\rm T}d_{k+1}&=& -\theta_k\|g_{k+1}\|^2+\frac{y_k^{\rm T}g_{k+1} g_{k+1}^{\rm T}s_k s_k^{\rm T}y_k}{(s_k^{\rm T}y_k)^2}\nonumber\\&&-\left[1+\frac{\theta_k\chi_k}{\sqrt{p_k}}+(1-2\theta_k)\sqrt{p_k}\chi_k \right]\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k}\nonumber\\&=& -\theta_k\|g_{k+1}\|^2+\frac{y_k^{\rm T}g_{k+1} g_{k+1}^{\rm T}s_k s_k^{\rm T}y_k}{(s_k^{\rm T}y_k)^2}\nonumber\\&&-\left[1+\frac{\theta_k}{p_k}\frac{s_k^{\rm T}y_k}{\|s_k\|^2}+(1-2\theta_k)p_k\frac{\|y_k\|^2}{s_k^{\rm T}y_k}\right]\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k}\nonumber\\&\leq& -\theta_k\|g_{k+1}\|^2-\left[1+\frac{\theta_k}{p_k}\frac{s_k^{\rm T}y_k}{\|s_k\|^2}+(1-2\theta_k)p_k\frac{\|y_k\|^2}{s_k^{\rm T}y_k}\right]\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k} \nonumber\\&& +\frac{1}{2}\frac{(g_{k+1}^{\rm T}s_k)^2\|y_k\|^2+(s_k^{\rm T}y_k)^2\|g_{k+1}\|^2}{(s_k^{\rm T}y_k)^2} \nonumber\\&=& (\frac{1}{2}-\theta_k)\|g_{k+1}\|^2-\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k}\nonumber\\&&-\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k}\left[\frac{\theta_k}{p_k}\frac{s_k^{\rm T}y_k}{\|s_k\|^2}+\Big((1-2\theta_k)p_k-\frac{1}{2}\Big)\frac{\|y_k\|^2}{s_k^{\rm T}y_k}\right]\nonumber\\&\leq& (\frac{1}{2}-\theta_k)\|g_{k+1}\|^2-\frac{(s_k^{\rm T}g_{k+1})^2}{s_k^{\rm T}y_k}\left[1+\frac{\|y_k\|}{\sqrt{p_k}\|s_k\|}\Big(\theta_k+p_k-2\theta_kp_k-\frac{1}{2}\Big)\right]\nonumber\\&\leq& (\frac{1}{2}-\theta_k)\|g_{k+1}\|^2 \leq \left[\frac{1}{2} -\frac{1-\underline{m}}{2(1-\overline{m})}\right]\|g_{k+1}\|^2 = -\frac{\overline{m}-\underline{m}}{2(1-\overline{m})}\|g_{k+1}\|^2.\nonumber\end{eqnarray}
The second inequality above follows from the fact that $a^{\rm T}b\leq\frac{1}{2}(\|a\|^2+\|b\|^2)$, with $a=g_{k+1}^{\rm T}s_k y_k$ and $b=s_k^{\rm T}y_kg_{k+1}$. Combining with Eq. (13) and letting $c=\frac{\overline{m}-\underline{m}}{2(1-\overline{m})}$, the proof is completed. $\Box$
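The bound of Lemma 2.1 can be spot-checked numerically. The sketch below reconstructs the search direction from Eqs. (13)-(14) as $d_{k+1}=-\theta_kg_{k+1}+\beta_ks_k+\gamma_ky_k$ with the closed form of $t_k$, takes $p_k$ to be the true value $\cos^2\langle s_k,y_k\rangle$ (restricted, as in the lemma's setting, to $s_k^{\rm T}y_k>0$ and $p_k<\frac{1}{2}$), and uses a $\theta_k$ bounded below by $\frac{1-\underline{m}}{2(1-\overline{m})}$; these modeling choices are this sketch's assumptions, reconstructed from the proof above.

```python
import numpy as np

rng = np.random.default_rng(1)
m_lo, m_hi = 0.05, 0.45
base = (1 - m_lo) / (2 * (1 - m_hi))         # lower bound on theta_k
c = (m_hi - m_lo) / (2 * (1 - m_hi))         # the constant of Lemma 2.1

worst = -np.inf                              # max of (g'd + c||g||^2) / ||g||^2
for _ in range(1000):
    s, y, g = (rng.normal(size=30) for _ in range(3))
    sy = s @ y
    p = sy**2 / ((s @ s) * (y @ y))          # p_k = cos^2<s_k, y_k>
    if sy <= 0 or p >= 0.5:
        continue                             # the lemma needs s'y > 0, p_k < 1/2
    chi = np.linalg.norm(y) / np.linalg.norm(s)
    theta = max(base, (s @ s) / sy)          # a theta_k >= base (RSTTCG1-style)
    t = 1 + theta * chi / np.sqrt(p) + (1 - 2 * theta) * np.sqrt(p) * chi
    beta = 0.5 * (y @ g) / sy - t * (s @ g) / sy
    gamma = 0.5 * (s @ g) / sy
    d = -theta * g + beta * s + gamma * y    # reconstructed search direction
    worst = max(worst, (g @ d + c * (g @ g)) / (g @ g))
print(worst)  # nonpositive up to round-off: g'd <= -c ||g||^2
```

Every sampled instance satisfies the sufficient descent inequality, as the lemma predicts.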
3. Convergence Analysis
In this section, to prove the global convergence of the RSTTCG algorithm, we make the following assumptions.
Assumption (i). The level set $\Omega=\{x\in \mathbb{R}^n:~f(x)\leq f(x_0)\}$ is bounded, namely, there exists $\delta>0$ satisfying $\|x\|\leq \delta,~~\forall x\in \Omega.$
Assumption (ii). The gradient of function $f$ is Lipschitz continuous in some neighborhood $\mathbb{N}$ of $\Omega$, namely, there exists $L>0$ satisfying
Based on the above assumptions, we know
Besides, we can easily see that $g(x)$ is bounded, namely, there exists a positive constant $L_1$ such that $\|g(x)\|\leq L_1, ~~\forall x\in \Omega.$
Lemma 3.1 Let the sequence $\{d_{k}\}$ be generated by RSTTCG algorithm. If Assumption (ii) holds, then
Proof Subtracting $g_k^{\rm T}d_k$ from both sides of the left inequality of Eq. (16) and using the Lipschitz condition, we get
\begin{eqnarray}\nonumber(\sigma-1)g_k^{\rm T}d_k\leq (g_{k+1}-g_k)^{\rm T}d_k=y_k^{\rm T}d_k\leq \|y_k\|\|d_k\|\leq \alpha_kL\|d_k\|^2.\end{eqnarray}
Since $0<\sigma<1$ and $d_k$ is a descent direction, then Eq. (20) holds. The proof is completed. $\Box$
Lemma 3.2 Let the sequence $\{d_{k}\}$ be generated by RSTTCG algorithm. If Assumptions (i)-(ii) hold, we have the following Zoutendijk condition
Proof From the first inequality Eq. (15) of the strong Wolfe conditions, Assumption (ii) and Lemma 3.1, we have
\begin{eqnarray}f_k-f_{k+1}\geq-\rho\alpha_kg_k^{\rm T}d_k\geq\frac{\rho(1-\sigma)(g_k^{\rm T}d_k)^2}{L\|d_k\|^2}.\end{eqnarray}
From Assumption (i), we know $f(x)$ is bounded from below, then Eq. (21) is obtained. The proof is completed. $\Box$
Theorem 3.1 Suppose that Assumptions (i) and (ii) hold. The sequence $\{x_k\}$ is generated by RSTTCG algorithm. If $f$ is a uniformly convex function on $\Omega$, namely, there exists $\mu>0$ such that
Then, we have
$\lim\limits_{k\rightarrow\infty}\|g_k\|=0.$
Proof From the Lipschitz condition Eq. (18), we have
It follows from Eq. (22) and the Cauchy inequality that
i.e.,
Then, from Eqs. (23)-(25), we get
Let $\theta_{\max}=\max\{\frac{1-\underline{m}}{2(1-\overline{m})}, \frac{1}{\mu}\}$, we have $\theta_k \leq \theta_{\max}$. From Eq. (26) and Eq. (27), we obtain
Therefore, from the Cauchy inequality, the triangle inequality, Eq. (14), Eq. (23), Eq. (24) and Eq. (28), we have
From Lemma 2.1 and Eq. (29), we have $\frac{(g_{k+1}^{\rm T}d_{k+1})^2}{\|d_{k+1}\|^2} \geq \frac{c^2\|g_{k+1}\|^2}{M^2}.$ Combined with Lemma 3.2, we get
$\sum\limits_{k=0}^\infty \|g_k\|^2<\infty.$
The proof is completed. $\Box$
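In the proof above, the bound $\theta_k\leq\theta_{\max}$ can be made explicit for the first choice of $\theta_k$ listed in Section 4 (an assumption of this remark): uniform convexity gives $s_k^{\rm T}y_k\geq\mu\|s_k\|^2$, hence

\begin{eqnarray}\nonumber\frac{\|s_k\|^2}{s_k^{\rm T}y_k}\leq\frac{1}{\mu} \quad\Longrightarrow\quad \theta_k=\max\left\{\frac{1-\underline{m}}{2(1-\overline{m})}, \frac{\|s_k\|^2}{s_k^{\rm T}y_k}\right\}\leq\max\left\{\frac{1-\underline{m}}{2(1-\overline{m})}, \frac{1}{\mu}\right\}=\theta_{\max}.\nonumber\end{eqnarray}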
For general nonlinear functions, we can establish a weaker convergence result.
Lemma 3.3 Suppose that Assumptions~(i) and (ii) hold. Let the sequence $\{x_k\}$ be generated by RSTTCG algorithm, then we have $d_k\neq0$ and
whenever $\inf\{\|g_k\|: k\geq0\}>0$, in which $u_k=\frac{d_k}{\|d_k\|}$.
Proof Define $\tau=\inf\{\|g_k\|: k\geq0\}$, we know $\|g_k\|\geq \tau>0$. From the sufficient descent condition Eq. (17), we have $d_k\neq0$ for each $k$, so $u_k$ is well defined. To prove global convergence, we define $\beta_k^{+}=\max\{\beta_k^{'},0\}$, where $\beta_k^{'}=\frac{1}{2}\frac{y_k^{\rm T}g_{k+1}}{d_k^{\rm T}y_k}-\big[ 1+\frac{\theta_k}{p_k}\frac{s_k^{\rm T}y_k}{\|s_k\|^2}+(1-2\theta_k)p_k\frac{\|y_k\|^2}{s_k^{\rm T}y_k}\big]\frac{s_k^{\rm T}g_{k+1}}{d_k^{\rm T}y_k}$. From Eq. (14), we have
$\frac{d_{k+1}}{\|d_{k+1}\|} =\frac{-\theta_kg_{k+1}}{\|d_{k+1}\|}+\beta_k^{+}\frac{d_k}{\|d_{k+1}\|}+\gamma_k\frac{y_k}{\|d_{k+1}\|}= \frac{-\theta_kg_{k+1}+\gamma_ky_k}{\|d_{k+1}\|}+\beta_k^{+}\frac{\|d_k\|}{\|d_{k+1}\|}\frac{d_k}{\|d_k\|},$
namely,
$ u_{k+1}=\omega_k+\xi_ku_k, $
where,
$\omega_k=\frac{-\theta_kg_{k+1}+\gamma_ky_k}{\|d_{k+1}\|}, \xi_k=\beta_k^{+}\frac{\|d_k\|}{\|d_{k+1}\|}\geq0.$
By using the condition $\|u_{k+1}\|=\|u_k\|=1$, we have $\|\omega_k\|=\|u_{k+1}-\xi_ku_k\|=\|\xi_ku_{k+1}-u_k\|.$ Since $\xi_k\geq0$, it follows that
$\|u_{k+1}-u_k\| \leq\|(1+\xi_k)u_{k+1}-(1+\xi_k)u_k\| \leq \|u_{k+1}-\xi_ku_k\|+\|\xi_ku_{k+1}-u_k\|=2\|\omega_k\|.$
From Eq. (16), we have
By the definition of $\omega_k, \gamma_k$ and Eq. (32), we get
\begin{eqnarray}\nonumber\|\omega_k\|= \frac{\|-\theta_kg_{k+1}+\gamma_ky_k\|}{\|d_{k+1}\|}&\leq &\frac{\theta_k\|g_{k+1}\|+\frac{1}{2}\left|\frac{s_k^{\rm T}g_{k+1}}{s_k^{\rm T}y_k}\right|\cdot\| y_k\|}{\|d_{k+1}\|}\nonumber\\&\leq &\left[\theta_{\max}+\frac{\sigma}{2(1-\sigma)}\big(1+\frac{L_1}{\tau}\big)\right]\frac{\|g_{k+1}\|}{\|d_{k+1}\|}.\nonumber\end{eqnarray}
Since $\|g_{k+1}\|\geq\tau$, from Lemma 2.1 and Lemma 3.2, we have
$\sum_{k=0}^\infty\frac{c^2\tau^2\|g_{k+1}\|^2}{\|d_{k+1}\|^2} \leq\sum_{k=0}^\infty\frac{c^2\|g_{k+1}\|^4}{\|d_{k+1}\|^2} \leq\sum_{k=0}^\infty\frac{(g_{k+1}^{\rm T}d_{k+1})^2}{\|d_{k+1}\|^2} <+\infty,$
therefore, Eq. (31) holds. $\Box$
Property(*) Consider a method of the form Eq. (2) and Eq. (14), and suppose
We say that a method has Property (*) if there exist constants $b>1$ and $\lambda>0$ such that $|\beta_k^{'}|\leq b$ for all $k$, and such that $\|s_k\|\leq \lambda~~\Rightarrow~~|\beta_k^{'}|\leq \frac{1}{2b}.$
Lemma 3.4 Suppose that Assumptions (i)-(ii) hold. Let the sequence $\{d_k\}$ be generated by RSTTCG algorithm, then RSTTCG algorithm has Property (*).
Proof From Eq. (15) and Eq. (16), we have
By using Assumption (i), Eqs. (32)-(34), we obtain
Define
if $\|s_k\|\leq\lambda$, from Eq. (35) and Eq. (36), we obtain
\begin{eqnarray}|\beta_k^{'}|&\leq& \frac{1}{2}\frac{L\|s_k\|\|g_{k+1}\|}{c(1-\sigma)\tau^2}+\left(1+\frac{L\theta_{\max}}{\underline{m}}+\frac{\overline{m}L^2}{\mu} \right)\frac{\|s_k\|\|g_{k+1}\|}{c(1-\sigma)\tau^2}\nonumber\\&\leq& \left[\frac{1}{2}\frac{L\overline{\tau}}{c(1-\sigma)\tau^2}+\left(1+\frac{L\theta_{\max}}{\underline{m}}+\frac{\overline{m}L^2}{\mu} \right)\frac{\overline{\tau}}{c(1-\sigma)\tau^2}\right]\|s_k\|\nonumber\\&\leq& \left[\frac{1}{2}\frac{L\overline{\tau}}{c(1-\sigma)\tau^2}+\left(1+\frac{L\theta_{\max}}{\underline{m}}+\frac{\overline{m}L^2}{\mu} \right)\frac{\overline{\tau}}{c(1-\sigma)\tau^2}\right]\lambda=\frac{1}{2b}.\nonumber~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\hfill\Box\end{eqnarray}
In the next lemma we will show that, if the gradient sequence is bounded away from zero, then a fraction of the steps cannot be too small. Let $\mathbb{N}$ be the set of positive integers and $K^{\lambda}:=\{i\in \mathbb{N}: i\geq1,\|s_{i}\|>\lambda\}$ for $\lambda>0$, namely, the set of indices corresponding to steps greater than $\lambda$. We also need to discuss groups of $\Delta$ consecutive iterates, so let $K_{k,\Delta}^{\lambda}:=\{i\in \mathbb{N}: k\leq i\leq k+\Delta-1, \|s_i\|>\lambda\},$ where $|K_{k,\Delta}^{\lambda}|$ denotes the number of elements of $K_{k,\Delta}^{\lambda}$.
Lemma 3.5 Suppose that Assumptions (i)-(ii) hold. Let the sequences $\{x_k\}$ and $\{d_k\}$ be generated by RSTTCG algorithm. When Eq. (33) holds, there exists $\lambda>0$ such that
$\left|K_{k,\Delta}^{\lambda}\right|>\frac{\Delta}{2},~~\mbox{for}~~\Delta\in \mathbb{N},$
where,
$k\geq k_0$, in which $k_0$ is any index.
Proof Suppose on the contrary that there exists $\lambda>0$ such that $\left|K_{k,\Delta}^{\lambda}\right| \leq\displaystyle\ \frac{\Delta}{2}$ for $\Delta\in \mathbb{N}$ and for any $k\geq k_0$. By Eq. (32), we have
$\|\gamma_ky_k\| =\frac{1}{2}\left|\frac{s_k^{\rm T}g_{k+1}}{s_k^{\rm T}y_k}\right|\|y_k\| \leq \frac{\sigma}{4(1-\sigma)}\big(1+\frac{L_1}{\tau}\big)\|g_{k+1}\|\doteq L_2\|g_{k+1}\|.$
According to Eq. (14), we have
\begin{eqnarray}\nonumber\|d_{k+1}\|^2&\leq& \left(\beta_k^{'}\|d_k\|+\|-g_{k+1}+\gamma_ky_k\|\right)^2\leq 2\beta_k^{'2}\|d_k\|^2+2\|-g_{k+1}+\gamma_ky_k\|^2 \nonumber\\&\leq& 2\beta_k^{'2}\|d_k\|^2+2(2\|g_{k+1}\|^2+2\|\gamma_ky_k\|^2)\leq 2\beta_k^{'2}\|d_k\|^2+4\big(1+L_2^2\big)\|g_{k+1}\|^2,\nonumber\end{eqnarray}
where the above inequalities use the fact that $2ab\leq a^2+b^2$ for any scalars $a$ and $b$, hence $(a+b)^2\leq 2a^2+2b^2$. By induction, we have
for any given index $l\geq k_0+1$, where $c_1$ depends on $\|d_{k_0-1}\|$ but not on $l$. Next, we consider the product $2\beta_{l-1}^{'2}\, 2\beta_{l-2}^{'2}\cdots2\beta_{k}^{'2},$ where $k_0\leq k\leq l-1$. We divide the $2(l-k)$ factors of Eq. (37) into groups of $2\Delta$ elements each; namely, if $\Lambda:=\lfloor(l-k)/\Delta\rfloor$, then Eq. (37) can be divided into $\Lambda$ or $\Lambda+1$ groups
and a possible group
where,
$l_i=l-1-(i-1)\Delta$ for $i=1,2,\cdots,\Lambda+1$, and $k_i=l_{i+1}+1$ for $i=1,2,\cdots,\Lambda$. It is clear that $k_i\geq k_0$ for $i=1, 2, \cdots, \Lambda$; from the contrary assumption, we get $q_i:=\left|K_{k_i,\Delta}^\lambda\right|\leq\frac{\Delta}{2}.$ Thus, there are $q_i$ indices $j$ such that $\|s_j\|>\lambda$ and $(\Delta-q_i)$ indices $j$ such that $\|s_j\|\leq\lambda$ on $[k_i, k_i+\Delta-1]$.
From Eq. (35), we have $b > \frac{\overline{\tau}^2}{c(1-\sigma)\tau^2} >1$, i.e., $2b^2 >1$. In conjunction with $2q_i-\Delta\leq 0$, we have $2\beta_{l_i}^{'2}\cdots2\beta_{k_i}^{'2}\leq 2^\Delta b^{2q_i}(\frac{1}{2b})^{2(\Delta-q_i)}=(2b^2)^{2q_i-\Delta} \leq 1.$ So every item in Eq. (38) is less than or equal to 1, and so is their product. In Eq. (39), we have $2\beta_{l_{\Lambda+1}}^{'2}\cdots2\beta_{k}^{'2}\leq(2b^2)^\Delta$. Then, we get
$\|d_l\|^2\leq c_2(l-k_0+2),$
where,
$c_2>0$ is independent of $l$. Furthermore, $\sum\limits_{k\geq0}\frac{1}{\|d_k\|^2}=\infty.$ However, from the sufficient descent condition Eq. (17), the Zoutendijk condition Eq. (21) and Eq. (33), we have
$c^2\tau^4\sum_{k\geq0}\frac{1}{\|d_k\|^2}\leq c^2\sum_{k\geq0}\frac{\|g_k\|^4}{\|d_k\|^2}\leq\sum_{k\geq0}\frac{(g_k^{\rm T}d_k)^2}{\|d_k\|^2}<\infty.$
This leads to a contradiction. The proof is completed. $\Box$
Theorem 3.2 Suppose that Assumptions (i)-(ii) hold. Let the sequence $\{x_k\}$ be generated by RSTTCG algorithm, then Eq. (30) holds.
Proof Supposing on the contrary that Eq. (30) does not hold, a contradiction can be derived similarly to the proof of Theorem 4.3 in the study [19]. $\Box$
4. Numerical Results
In this section, the numerical performance of the RSTTCG algorithm is reported. All experiments were done on a PC with a 2.40 GHz CPU and 2.00 GB RAM using Matlab R2015b.
Two classes of test problems, listed in Table 1, are selected here. One class is drawn from the CUTEr library [20], and the other class comes from Andrei [13]. A total of twenty-eight test functions with eighty problems of different dimensions are considered.
P. | Functions | Dim. | P. | Functions | Dim. |
1 | Freudenstein and Roth | 2 | 41 | Chebyquad | 1000 |
2 | Powell badly scaled | 2 | 42 | Chebyquad | 5000 |
3 | Brown badly scaled | 2 | 43 | Chebyquad | 10000 |
4 | Beale | 2 | 44 | Chebyquad | 50000 |
5 | Helical valley | 3 | 45 | Broyden banded | 1000 |
6 | Wood | 4 | 46 | Broyden banded | 5000 |
7 | Biggs EXP6 | 6 | 47 | Broyden banded | 10000 |
8 | Extended Rosenbrock | 1000 | 48 | Broyden banded | 50000 |
9 | Extended Rosenbrock | 5000 | 49 | Generalized Rosenbrock | 1000 |
10 | Extended Rosenbrock | 10000 | 50 | Generalized Rosenbrock | 5000 |
11 | Extended Rosenbrock | 50000 | 51 | Generalized Rosenbrock | 10000 |
12 | Extended Powell singular | 1000 | 52 | Generalized Rosenbrock | 50000 |
13 | Extended Powell singular | 5000 | 53 | Boundary value | 1000 |
14 | Extended Powell singular | 10000 | 54 | Boundary value | 5000 |
15 | Extended Powell singular | 50000 | 55 | Boundary value | 10000 |
16 | Penalty function I | 1000 | 56 | Boundary value | 50000 |
17 | Penalty function I | 5000 | 57 | Integral equation | 1000 |
18 | Penalty function I | 10000 | 58 | Integral equation | 5000 |
19 | Penalty function II | 1000 | 59 | Integral equation | 10000 |
20 | Penalty function II | 5000 | 60 | Integral equation | 50000 |
21 | Penalty function II | 10000 | 61 | Broyden tridiagonal | 1000 |
22 | Gaussian | 3 | 62 | Broyden tridiagonal | 5000 |
23 | Gaussian | 3 | 63 | Broyden tridiagonal | 10000 |
24 | Box | 3 | 64 | Broyden tridiagonal | 50000 |
25 | Box | 3 | 65 | Separable cubic | 1000 |
26 | Variable dimension | 1000 | 66 | Separable cubic | 5000 |
27 | Variable dimension | 5000 | 67 | Separable cubic | 10000 |
28 | Variable dimension | 10000 | 68 | Separable cubic | 50000 |
29 | Variable dimension | 50000 | 69 | Nearly separable | 1000 |
30 | Watson | 1000 | 70 | Nearly separable | 5000 |
31 | Watson | 5000 | 71 | Nearly separable | 10000 |
32 | Watson | 10000 | 72 | Nearly separable | 50000 |
33 | Watson | 50000 | 73 | Yang tridiagonal | 1000 |
34 | Brown and Dennis | 4 | 74 | Yang tridiagonal | 5000 |
35 | Brown and Dennis | 4 | 75 | Yang tridiagonal | 10000 |
36 | Trigonometric | 500 | 76 | Allgower | 1000 |
37 | Trigonometric | 1000 | 77 | Allgower | 5000 |
38 | Trigonometric | 5000 | 78 | Allgower | 10000 |
39 | Trigonometric | 10000 | 79 | Schittkowski 302 | 5000 |
40 | Trigonometric | 50000 | 80 | Schittkowski 302 | 10000 |
We compare the RSTTCG algorithm against the DDL method [11], which possesses good numerical performance. When
$\theta_k=\max \left\{\frac{1-\underline{m}}{2(1-\overline{m})}, \frac{\|s_k\|^2}{s_k^{\rm T}y_k}\right\}~ \mbox {and} ~ \theta_k=\max \left\{\frac{1-\underline{m}}{2(1-\overline{m})}, \frac{s_k^{\rm T}y_k}{\|y_k\|^2}\right\}$
are chosen, the RSTTCG algorithm is denoted by ``RSTTCG1" and ``RSTTCG2", respectively. All tested methods are terminated when the following condition is satisfied
We set the parameters as $\varepsilon=10^{-5}$, $\rho=0.1$, $\sigma=0.6$, $\underline{m}=0.05$, $\overline{m}=0.45$. In the DDL algorithm, we set $p=0.8$ and $q=0.1$.
The performance profiles of the algorithms, in terms of the number of iterations, function evaluations, gradient evaluations and CPU time, were analyzed using the profiles of Dolan and Moré [21]. In a performance profile plot, the horizontal axis gives a factor $\tau$ of the best performance, and the vertical axis gives the percentage $\psi$ of test problems that each method solves within a factor $\tau$ of the best method; the value at $\tau=1$ measures efficiency, while the limiting value for large $\tau$ measures robustness.
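The Dolan-Moré profile itself is straightforward to compute; the following self-contained Python sketch (with hypothetical timing data, not the paper's results) shows the construction: for each solver, $\psi(\tau)$ is the fraction of problems solved within a factor $\tau$ of the best solver on that problem.

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More profile: T[i, j] = cost (e.g. CPU time) of solver j on
    problem i, np.inf for a failure; returns psi(tau) for each solver."""
    ratios = T / T.min(axis=1, keepdims=True)  # r_{i,j} = t_{i,j} / best_i
    return np.array([[np.mean(ratios[:, j] <= tau) for j in range(T.shape[1])]
                     for tau in taus])         # psi_j(tau), fraction solved

# hypothetical timings for 4 problems and 2 solvers (np.inf = failed run)
T = np.array([[1.0, 2.0],
              [3.0, 3.0],
              [2.0, 1.0],
              [4.0, np.inf]])
psi = performance_profile(T, taus=[1.0, 2.0, 4.0])
print(psi)
```

A failed run is recorded as `np.inf`, so it never counts as solved for any finite $\tau$.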
Figure 1, Figure 2, Figure 3 and Figure 4 plot the performance profiles for the number of iterations, the number of function evaluations, the number of gradient evaluations, and the CPU time, respectively. As can be observed in Figures 1-4, the curve corresponding to RSTTCG1 generally stays above the curves representing the RSTTCG2 and DDL methods. This indicates that RSTTCG1 outperforms RSTTCG2 and DDL overall. Furthermore, when $\tau<2$, the performance of RSTTCG1 is slightly better than that of RSTTCG2, whereas when $\tau \geq 2$, the performance of RSTTCG2 is slightly better than that of RSTTCG1. We deem that RSTTCG1 may be more competitive than RSTTCG2.




In this section, we use the RSTTCG algorithm to study the profit-maximization pricing strategy of a supply chain of fresh agricultural products led by suppliers under centralized decision making. Consider a two-level fresh produce supply chain, in a single-cycle production-sales mode, composed of a single fresh produce supplier and a single retailer. The fresh produce supplier, as the leader of the Stackelberg game, supplies both ordinary fresh produce (ofp) and green fresh produce (gfp) of the same variety to the retailer as the follower. The overall decision-making structure of the fresh agricultural products supply chain (FPSC) system is shown in Figure 5.

Because these two kinds of fresh products are substitutable, there is competition in the demand market. Based on the demand function theory of substitute price competition, the demand functions of the two fresh agricultural products are assumed as follows
where,
$q_1$, $q_2$ represent the market demand of gfp and ofp, respectively; $a$ represents the total potential market capacity of fresh agricultural products; $p_1$, $p_2$ represent the retail prices of gfp and ofp, respectively; $b$ is the price sensitivity coefficient; $r$ is the competitive substitution coefficient of the two products, and they satisfy $b>r>0$; $\theta$ $(0\leq\theta\leq 1)$ is the freshness of the fresh produce when it arrives at the retailer's store. In centralized decision-making, we regard the supplier and retailer as subjects with identical interests, and both sides cooperate to maximize the FPSC profit.
Now, under centralized decision making, the profit function of the fresh agricultural products supply chain is as follows
where,
$\beta$ $(0<\beta<1)$ represents the quantity loss of fresh produce when it reaches the retailer's store; $c_1$, $c_2$ represent the unit production costs of gfp and ofp, respectively. Obviously, $p_1>p_2>0$ and $c_1>c_2>0$. We transform Eq. (42) into the following optimization problem
With reference to the parameter settings in the relevant literature [22], we set $a=50$, $b=2$, $c_1=4$, $c_2=2$, $r=1.5$, $\beta=0.2$, $\theta=0.85$. These values satisfy the theoretical requirements in reference [22] and guarantee that the optimal value has practical significance. We choose different initial points and use the RSTTCG algorithm to solve the optimization problem Eq. (43); the results are shown in Table 2 and Table 3, where $k$ represents the number of iterations.
Initial Point | $k$ | $p_1^{*}$ | $p_2^{*}$ | $q_1^{*}$ | $q_2^{*}$ | $\pi^{*}$ |
(1;1) | 7 | 45.0265 | 43.7651 | 21.2879 | 26.4818 | 1.9449×10³ |
(10;10) | 6 | 44.9792 | 43.7150 | 21.3107 | 26.5163 | 1.9449×10³ |
(30;30) | 8 | 44.9970 | 43.7427 | 21.3177 | 26.4825 | 1.9449×10³ |
(50;50) | 6 | 45.0255 | 43.7949 | 21.3427 | 26.4100 | 1.9449×10³ |
(100;100) | 7 | 44.9776 | 43.7113 | 21.3079 | 26.5222 | 1.9449×10³ |
(1000;1000) | 6 | 44.9809 | 43.7130 | 21.3033 | 26.5238 | 1.9449×10³ |
Initial Point | $k$ | $p_1^{**}$ | $p_2^{**}$ | $q_1^{**}$ | $q_2^{**}$ | $\pi^{**}$ |
(1;1) | 7 | 44.9459 | 43.6127 | 21.2085 | 26.6981 | 1.9448×10³ |
(10;10) | 5 | 44.9491 | 43.7202 | 21.3907 | 26.4509 | 1.9449×10³ |
(30;30) | 6 | 45.0035 | 43.7488 | 21.3132 | 26.4796 | 1.9449×10³ |
(50;50) | 6 | 45.0234 | 43.7905 | 21.3400 | 26.4165 | 1.9449×10³ |
(100;100) | 6 | 45.0273 | 43.7657 | 21.2870 | 26.4818 | 1.9449×10³ |
(1000;1000) | 5 | 45.0032 | 43.7530 | 21.3213 | 26.4692 | 1.9449×10³ |
As can be seen from Table 2 and Table 3, with the given parameters, the RSTTCG algorithm can be used to solve the optimization problem, so as to obtain the optimal pricing strategy with maximum profit in the supplier-led supply chain under centralized decision making. In addition, the global convergence and effectiveness of the RSTTCG algorithm are corroborated by the results obtained from different initial values and the associated numbers of iterations.
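To make the application concrete without reproducing Eqs. (41)-(43), the sketch below illustrates the same kind of computation on a hypothetical linear substitute-demand model, $q_1=a\theta-bp_1+rp_2$, $q_2=a-bp_2+rp_1$ and $\pi=(1-\beta)[(p_1-c_1)q_1+(p_2-c_2)q_2]$, maximized by plain gradient ascent as a stand-in for the RSTTCG algorithm. The demand form, the gradient method and the resulting numbers are this sketch's assumptions and do not correspond to the paper's model or to Tables 2 and 3.

```python
import numpy as np

# parameter values as in Section 4; the demand/profit form is hypothetical
a, b, r, fresh, loss = 50.0, 2.0, 1.5, 0.85, 0.2
c1, c2 = 4.0, 2.0

def profit(p):
    """Toy centralized FPSC profit under an assumed linear demand form."""
    p1, p2 = p
    q1 = a * fresh - b * p1 + r * p2    # assumed demand for gfp
    q2 = a - b * p2 + r * p1            # assumed demand for ofp
    return (1 - loss) * ((p1 - c1) * q1 + (p2 - c2) * q2)

def grad_profit(p):
    p1, p2 = p
    d1 = (1 - loss) * (a * fresh - 2 * b * p1 + 2 * r * p2 + b * c1 - r * c2)
    d2 = (1 - loss) * (a - 2 * b * p2 + 2 * r * p1 + b * c2 - r * c1)
    return np.array([d1, d2])

p = np.array([1.0, 1.0])                # initial price guess
for _ in range(5000):
    g = grad_profit(p)
    if np.linalg.norm(g) <= 1e-8:
        break
    p = p + 0.2 * g                     # fixed step < 2/L, L = 2(1-loss)(b+r)
print(p, profit(p))
```

Because $b>r$, this toy profit is strictly concave in $(p_1, p_2)$, so the ascent converges to the unique maximizing price pair.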
5. Conclusion
In this paper, based on a random technique and a new search direction, a class of spectral three-term CG methods with random parameters is proposed. The random parameters are introduced to simplify the derived parameter. This is achieved by minimizing the distance between the symmetric matrix $A_{k+1}$ and the memoryless BFGS matrix in the Frobenius norm. The global convergence of the new algorithm is proved for uniformly convex functions and general nonlinear functions. Some classical test problems are selected for numerical experiments, and comparisons with other methods verify the effectiveness of the proposed algorithm. Numerical experiments show that our methods have good numerical performance and greater flexibility.
The data used to support the research findings are available from the corresponding author upon request.
The authors declare no conflict of interest.
