A smoothed NR neural network for solving nonlinear convex programs with second-order cone constraints
Xinhe Miao^a,1, Jein-Shan Chen^b,*,2, Chun-Hsu Ko^c
^a Department of Mathematics, School of Science, Tianjin University, Tianjin 300072, PR China
^b Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan
^c Department of Electrical Engineering, I-Shou University, Kaohsiung 84001, Taiwan
Article history: Received 2 January 2012; Received in revised form 22 June 2013; Accepted 16 October 2013; Available online 24 October 2013
Keywords: Merit function; Neural network; NR function; Second-order cone; Stability
Abstract
This paper proposes a neural network approach for efficiently solving general nonlinear convex programs with second-order cone constraints. The proposed neural network model is based on a smoothed natural residual merit function arising from an unconstrained minimization reformulation of the complementarity problem. We study the existence and convergence of the trajectory of the neural network. Moreover, we show several stability properties of the considered neural network, namely Lyapunov stability, asymptotic stability, and exponential stability. The examples in this paper provide a further demonstration of the effectiveness of the proposed neural network. This paper can be viewed as a follow-up to [20,26] because more stability results are obtained.
© 2013 Elsevier Inc. All rights reserved.
1. Introduction
In this paper, we are interested in finding a solution to the following nonlinear convex program with second-order cone constraints (henceforth SOCP):
\[
\min\; f(x) \quad \text{s.t.} \quad Ax = b,\; g(x) \in \mathcal{K}, \tag{1}
\]
where $A \in \mathbb{R}^{m \times n}$ has full row rank, $b \in \mathbb{R}^m$, $f:\mathbb{R}^n \to \mathbb{R}$, $g = [g_1,\dots,g_l]^T : \mathbb{R}^n \to \mathbb{R}^l$ with $f$ and the $g_i$'s being twice continuously differentiable and convex on $\mathbb{R}^n$, and $\mathcal{K}$ is a Cartesian product of second-order cones (also called Lorentz cones), expressed as
\[
\mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_N}
\]
with $N, n_1, \dots, n_N \ge 1$, $n_1 + \cdots + n_N = l$, and
\[
\mathcal{K}^{n_i} := \big\{ (x_{i1}, x_{i2}, \dots, x_{in_i})^T \in \mathbb{R}^{n_i} \;\big|\; \|(x_{i2}, \dots, x_{in_i})\| \le x_{i1} \big\}.
\]
http://dx.doi.org/10.1016/j.ins.2013.10.017
* Corresponding author. Address: National Center for Theoretical Sciences, Taipei Office, Taiwan. Tel.: +886 2 77346641; fax: +886 2 29332342.
E-mail addresses: xinhemiao@tju.edu.cn (X. Miao), jschen@math.ntnu.edu.tw (J.-S. Chen), chko@isu.edu.tw (C.-H. Ko).
1 The author's work is also supported by the National Young Natural Science Foundation (No. 11101302) and The Seed Foundation of Tianjin University (No. 60302041).
2 The author's work is supported by the National Science Council of Taiwan.
Information Sciences
Here, $\|\cdot\|$ denotes the Euclidean norm and $\mathcal{K}^1$ means the set of nonnegative reals $\mathbb{R}_+$. In fact, problem (1) is equivalent to the following variational inequality problem: find $x \in D$ satisfying
\[
\langle \nabla f(x), y - x \rangle \ge 0, \quad \forall y \in D,
\]
where $D = \{x \in \mathbb{R}^n \mid Ax = b,\; g(x) \in \mathcal{K}\}$. Many problems in the engineering, transportation science, and economics communities can be solved by transforming the original problems into such convex optimization problems or variational inequality problems; see [1,7,10,17,23].
Many studies have proposed computational approaches to solve convex optimization problems. Examples of these methods include the interior-point method [29], the merit function method [5,16], the Newton method [18,25], and the projection method [10]. However, real-time solutions are imperative in many applications, such as force analysis in robot grasping and control applications. Traditional optimization methods may not be suitable for these applications because of stringent computational time requirements. Therefore, a feasible and efficient method is required to solve real-time optimization problems. The neural network method is an ideal candidate for this task: compared with the previous methods, it has a distinct advantage in solving real-time optimization problems. Hence, researchers have developed many continuous-time neural networks for constrained optimization problems; see [4,9,12,14,15,19–22,27,30–34,36] and the references therein.
Neural networks date back to McCulloch and Pitts' pioneering work half a century ago, and they were first introduced to the optimization domain in the 1980s [13,28]. The essence of the neural network method for optimization [6] is to establish a nonnegative Lyapunov function (or energy function) and a dynamic system that represents an artificial neural network. This dynamic system usually adopts the form of a first-order ordinary differential equation. Starting from an initial point, the neural network is expected to approach its equilibrium point, which corresponds to the solution of the considered optimization problem.
This paper presents a neural network method to solve general nonlinear convex programs with second-order cone constraints. In particular, we consider the Karush–Kuhn–Tucker (KKT) optimality conditions of problem (1), which can be transformed into a second-order cone complementarity problem (SOCCP) together with some equality constraints. Following a reformulation of the complementarity problem, an unconstrained optimization problem is formulated. A smoothed natural residual (NR) complementarity function is then used to construct a Lyapunov function and a neural network model. At the same time, we show the existence and convergence of the solution trajectory of the dynamic system. This study also investigates stability results, namely Lyapunov stability, asymptotic stability, and exponential stability. We point out that the optimization problem considered in this paper is more general than the one studied in [20], where g(x) = x is investigated. From [20], for solving the specific SOCP (i.e., g(x) = x), we know that the neural network based on the cone projection function performs better than the one based on the Fischer–Burmeister function in most cases (except for some oscillating cases). In light of this phenomenon, we employ a neural network model based on the cone projection function for the more general SOCP. Thus, this paper can be viewed as a follow-up of [20] in this sense. Nevertheless, the neural network model studied here is not exactly the same as the one considered in [20]. More specifically, we consider a neural network based on the smoothed NR function that was studied in [16]. Why do we make such a change? Because, as shown in Section 4, we can establish various stability results for the proposed neural network, including exponential stability, that were not achieved in [20]. In addition, the second neural network studied in [27] (for various types of problems) is also similar to the proposed network. Again, stability is not guaranteed in that study, whereas three kinds of stability are proved here.
The remainder of this paper is organized as follows. Section 2 presents stability concepts and provides related results. Section 3 describes the neural network architecture, based on the smoothed NR function, for solving problem (1). Section 4 presents the convergence and stability results of the proposed neural network. Section 5 shows the simulation results of the new method. Finally, Section 6 gives the conclusion of this paper.
2. Preliminaries
In this section, we briefly recall background materials on ordinary differential equations (ODEs) and some stability concepts regarding the solution of an ODE. We also present some related results that play an essential role in the subsequent analysis.
Let $H : \mathbb{R}^n \to \mathbb{R}^n$ be a mapping. The first-order differential equation (ODE) means
\[
\frac{du}{dt} = H(u(t)), \quad u(t_0) = u_0 \in \mathbb{R}^n. \tag{2}
\]
We start with the existence and uniqueness of the solution of Eq. (2). Then, we introduce the equilibrium point of (2) and define various stabilities. All of these materials can be found in a typical ODE textbook, such as [24].
Lemma 2.1 (Existence and uniqueness [21, Theorem 2.5]). Assume that $H : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous mapping. Then for arbitrary $t_0 \ge 0$ and $u_0 \in \mathbb{R}^n$, there exists a local solution $u(t)$, $t \in [t_0, \tau)$, to (2) for some $\tau > t_0$. Furthermore, if $H$ is locally Lipschitz continuous at $u_0$, then the solution is unique; if $H$ is Lipschitz continuous on $\mathbb{R}^n$, then $\tau$ can be extended to $\infty$.

Remark 2.1. For Eq. (2), if a local solution defined on $[t_0, \tau)$ cannot be extended to a local solution on a larger interval $[t_0, \tau_1)$, where $\tau_1 > \tau$, then it is called a maximal solution, and the interval $[t_0, \tau)$ is the maximal interval of existence. It is obvious that an arbitrary local solution has an extension to a maximal one.

Lemma 2.2 ([21, Theorem 2.6]). Let $H : \mathbb{R}^n \to \mathbb{R}^n$ be a continuous mapping. If $u(t)$ is a maximal solution and $[t_0, \tau)$ is the maximal interval of existence associated with $u_0$ with $\tau < +\infty$, then $\lim_{t \uparrow \tau} \|u(t)\| = +\infty$.

For the first-order differential equation (2), a point $u^* \in \mathbb{R}^n$ is called an equilibrium point of (2) if $H(u^*) = 0$. If there is a neighborhood $\Omega \subseteq \mathbb{R}^n$ of $u^*$ such that $H(u^*) = 0$ and $H(u) \ne 0$ for any $u \in \Omega \setminus \{u^*\}$, then $u^*$ is called an isolated equilibrium point. The following are definitions of various stabilities; related materials can be found in [21,24,27].
Definition 2.1 (Lyapunov stability and asymptotic stability). Let $u(t)$ be a solution to Eq. (2).
(a) An isolated equilibrium point $u^*$ is Lyapunov stable (or stable in the sense of Lyapunov) if, for any $u_0 = u(t_0)$ and $\varepsilon > 0$, there exists a $\delta > 0$ such that $\|u_0 - u^*\| < \delta \implies \|u(t) - u^*\| < \varepsilon$ for $t \ge t_0$.
(b) Under the condition that an isolated equilibrium point $u^*$ is Lyapunov stable, $u^*$ is said to be asymptotically stable if it has the property that if $\|u_0 - u^*\| < \delta$, then $u(t) \to u^*$ as $t \to \infty$.

Definition 2.2 (Lyapunov function). Let $\Omega \subseteq \mathbb{R}^n$ be an open neighborhood of $\bar{u}$. A continuously differentiable function $g : \mathbb{R}^n \to \mathbb{R}$ is said to be a Lyapunov function (or energy function) at the state $\bar{u}$ (over the set $\Omega$) for Eq. (2) if
\[
g(\bar{u}) = 0, \qquad g(u) > 0 \;\;\forall u \in \Omega \setminus \{\bar{u}\}, \qquad \frac{dg(u(t))}{dt} \le 0 \;\;\forall u \in \Omega.
\]
The following lemma shows the relationship between the stabilities and a Lyapunov function; see [3,8,35].

Lemma 2.3.
(a) An isolated equilibrium point $u^*$ is Lyapunov stable if there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$.
(b) An isolated equilibrium point $u^*$ is asymptotically stable if there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$ that satisfies
\[
\frac{dg(u(t))}{dt} < 0, \quad \forall u \in \Omega \setminus \{u^*\}.
\]

Definition 2.3 (Exponential stability). An isolated equilibrium point $u^*$ is exponentially stable for Eq. (2) if there exist $\omega < 0$, $\kappa > 0$ and $\delta > 0$ such that every solution $u(t)$ to Eq. (2) with initial condition $u(t_0) = u_0$, $\|u_0 - u^*\| < \delta$, is defined on $[0, \infty)$ and satisfies
\[
\|u(t) - u^*\| \le \kappa\, e^{\omega t}\, \|u(t_0) - u^*\|, \quad t \ge t_0.
\]
From the above definitions, it is obvious that exponential stability implies asymptotic stability.
3. NR neural network model
This section shows how the dynamic system in this study is formed. As mentioned previously, the key steps in the neural network method lie in constructing the dynamic system and the Lyapunov function. To this end, we first look into the KKT conditions of problem (1), which are presented below:
\[
\begin{cases}
\nabla f(x) - A^T y - \nabla g(x) z = 0, \\
z \in \mathcal{K},\; g(x) \in \mathcal{K},\; z^T g(x) = 0, \\
Ax - b = 0,
\end{cases} \tag{3}
\]
where $y \in \mathbb{R}^m$ and $\nabla g(x)$ denotes the gradient matrix of $g$. It is well known that if problem (1) satisfies Slater's condition, i.e., there exists a strictly feasible point $x \in \mathbb{R}^n$ such that $g(x) \in \mathrm{int}(\mathcal{K})$ and $Ax = b$, then $x^*$ is a solution of problem (1) if and only if there exist $y^*, z^*$ such that $(x^*, y^*, z^*)$ satisfies the KKT conditions (3). Hence, we assume throughout this paper that problem (1) satisfies Slater's condition.
The following paragraphs provide a brief review of particular properties of the spectral factorization with respect to a second-order cone, which will be used in the subsequent analysis. Spectral factorization is one of the basic concepts in Jordan algebra; for more details, see [5,11,25].
For any vector $z = (z_1, z_2) \in \mathbb{R} \times \mathbb{R}^{l-1}$ ($l \ge 2$), its spectral factorization with respect to the second-order cone $\mathcal{K}$ is defined as
\[
z = \lambda_1 e_1 + \lambda_2 e_2,
\]
where $\lambda_i = z_1 + (-1)^i \|z_2\|$ ($i = 1, 2$) are the spectral values of $z$, and
\[
e_i = \begin{cases} \frac{1}{2}\big(1, (-1)^i \frac{z_2}{\|z_2\|}\big), & z_2 \ne 0, \\[2pt] \frac{1}{2}\big(1, (-1)^i w\big), & z_2 = 0, \end{cases}
\]
with $w \in \mathbb{R}^{l-1}$ such that $\|w\| = 1$. The terms $e_1, e_2$ are called the spectral vectors of $z$. The spectral values and the vector $z$ have the following properties: for any $z \in \mathbb{R}^l$, we have $\lambda_1 \le \lambda_2$ and
\[
\lambda_1 \ge 0 \iff z \in \mathcal{K}.
\]
Now we review the concept of metric projection onto $\mathcal{K}$. For an arbitrary element $z \in \mathbb{R}^l$, the metric projection of $z$ onto $\mathcal{K}$ is denoted by $P_{\mathcal{K}}(z)$ and defined as
\[
P_{\mathcal{K}}(z) := \arg\min_{w \in \mathcal{K}} \|z - w\|.
\]
Combining the spectral decomposition of $z$ with the metric projection of $z$ onto $\mathcal{K}$ yields the expression of $P_{\mathcal{K}}(z)$ given in [11]:
\[
P_{\mathcal{K}}(z) = \max\{0, \lambda_1\}\, e_1 + \max\{0, \lambda_2\}\, e_2.
\]
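The closed-form expression above is straightforward to implement. The following sketch (illustrative, not the paper's code) computes $P_{\mathcal{K}}(z)$ for a single second-order cone via the spectral factorization:

```python
import numpy as np

# Projection onto one second-order cone K^l via spectral factorization:
# P_K(z) = max(0, lambda1) e1 + max(0, lambda2) e2.
def soc_project(z):
    z = np.asarray(z, dtype=float)
    z1, z2 = z[0], z[1:]
    n2 = np.linalg.norm(z2)
    # spectral values lambda_i = z1 + (-1)^i ||z2||
    lam1, lam2 = z1 - n2, z1 + n2
    # spectral vectors; any unit vector w works when z2 = 0
    w = z2 / n2 if n2 > 0 else np.eye(len(z2))[0]
    e1 = 0.5 * np.concatenate(([1.0], -w))
    e2 = 0.5 * np.concatenate(([1.0], w))
    return max(0.0, lam1) * e1 + max(0.0, lam2) * e2

print(soc_project([3.0, 1.0, 2.0]))    # already in K (3 >= sqrt(5)): unchanged
print(soc_project([-2.0, 0.0, 0.0]))   # in -K: projects to the origin
```

For the Cartesian structure $\mathcal{K} = \mathcal{K}^{n_1} \times \cdots \times \mathcal{K}^{n_N}$, the projection is simply applied blockwise.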
The projection function $P_{\mathcal{K}}$ has the following property, which is called the Projection Theorem (see [2]).

Lemma 3.1. Let $\Omega$ be a closed convex set of $\mathbb{R}^n$. Then, for all $x, y \in \mathbb{R}^n$ and any $z \in \Omega$,
\[
(x - P_{\Omega}(x))^T (P_{\Omega}(x) - z) \ge 0 \quad \text{and} \quad \|P_{\Omega}(x) - P_{\Omega}(y)\| \le \|x - y\|.
\]
Given the definition of the projection, let $z_+$ denote the metric projection $P_{\mathcal{K}}(z)$ of $z \in \mathbb{R}^l$ onto $\mathcal{K}$. Then the natural residual (NR) function is given as follows [11]:
\[
\Phi_{\mathrm{NR}}(x, y) := x - (x - y)_+ \quad \forall x, y \in \mathbb{R}^l.
\]
The NR function is a popular SOC-complementarity function, i.e.,
\[
\Phi_{\mathrm{NR}}(x, y) = 0 \iff x \in \mathcal{K},\; y \in \mathcal{K} \;\text{and}\; \langle x, y \rangle = 0.
\]
Because of the non-differentiability of $\Phi_{\mathrm{NR}}$, we consider a class of smoothed NR complementarity functions. To this end, we employ a continuously differentiable convex function $\hat{g} : \mathbb{R} \to \mathbb{R}$ such that
\[
\lim_{\alpha \to -\infty} \hat{g}(\alpha) = 0, \qquad \lim_{\alpha \to \infty} \big(\hat{g}(\alpha) - \alpha\big) = 0 \qquad \text{and} \qquad 0 < \hat{g}'(\alpha) < 1. \tag{4}
\]
What kind of functions satisfy condition (4)? Here we present two examples:
\[
\hat{g}(\alpha) = \frac{\sqrt{\alpha^2 + 4} + \alpha}{2} \qquad \text{and} \qquad \hat{g}(\alpha) = \ln(e^{\alpha} + 1).
\]
Suppose $z = \lambda_1 e_1 + \lambda_2 e_2$, where $\lambda_i$ and $e_i$ ($i = 1, 2$) are the spectral values and spectral vectors of $z$, respectively. By applying the function $\hat{g}$, we define the following function:
\[
P_{\mu}(z) := \mu\, \hat{g}\Big(\frac{\lambda_1}{\mu}\Big) e_1 + \mu\, \hat{g}\Big(\frac{\lambda_2}{\mu}\Big) e_2. \tag{5}
\]
Fukushima et al. [11] show that $P_{\mu}$ is smooth for any $\mu > 0$; moreover, $P_{\mu}$ is a smoothing function of the projection $P_{\mathcal{K}}$, i.e., $\lim_{\mu \downarrow 0} P_{\mu} = P_{\mathcal{K}}$. Hence, a smoothed NR complementarity function is given in the form of
\[
\Phi_{\mu}(x, y) := x - P_{\mu}(x - y).
\]
In particular, from [11, Proposition 5.1], there exists a positive constant $c > 0$ such that
\[
\|\Phi_{\mu}(x, y) - \Phi_{\mathrm{NR}}(x, y)\| \le c\,\mu
\]
for any $\mu > 0$ and $(x, y) \in \mathbb{R}^n \times \mathbb{R}^n$.
Now we look into the KKT conditions (3) of problem (1). Let
\[
L(x, y, z) = \nabla f(x) - A^T y - \nabla g(x) z, \qquad
H(u) := \begin{bmatrix} \mu \\ Ax - b \\ L(x, y, z) \\ \Phi_{\mu}(z, g(x)) \end{bmatrix}
\]
and
\[
\Psi_{\mu}(u) := \frac{1}{2}\|H(u)\|^2 = \frac{1}{2}\|\Phi_{\mu}(z, g(x))\|^2 + \frac{1}{2}\|L(x, y, z)\|^2 + \frac{1}{2}\|Ax - b\|^2 + \frac{1}{2}\mu^2,
\]
where $u = (\mu, x^T, y^T, z^T)^T \in \mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$. It is known that $\Psi_{\mu}(u)$ serves as a smoothing function of the merit function $\Psi_{\mathrm{NR}}$, which means the KKT conditions (3) are equivalent to the following unconstrained minimization problem via the merit function approach:
\[
\min\, \Psi_{\mu}(u) := \frac{1}{2}\|H(u)\|^2. \tag{6}
\]
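To make the construction concrete, the following sketch (illustrative code, not from the paper) implements $P_{\mu}$ from (5) with $\hat{g}(\alpha) = (\sqrt{\alpha^2+4}+\alpha)/2$ and checks numerically that $\|\Phi_{\mu} - \Phi_{\mathrm{NR}}\| \to 0$ as $\mu \downarrow 0$, consistent with the bound $c\mu$, on a single cone $\mathcal{K}^3$:

```python
import numpy as np

ghat = lambda a: (np.sqrt(a**2 + 4.0) + a) / 2.0

def spectral(z):
    # spectral values and vectors of z with respect to K^l
    z1, z2 = z[0], z[1:]
    n2 = np.linalg.norm(z2)
    w = z2 / n2 if n2 > 0 else np.eye(len(z2))[0]
    e1 = 0.5 * np.concatenate(([1.0], -w))
    e2 = 0.5 * np.concatenate(([1.0], w))
    return z1 - n2, z1 + n2, e1, e2

def P_mu(z, mu):                       # smoothed projection (5)
    l1, l2, e1, e2 = spectral(z)
    return mu * ghat(l1 / mu) * e1 + mu * ghat(l2 / mu) * e2

def P_K(z):                            # exact metric projection
    l1, l2, e1, e2 = spectral(z)
    return max(0.0, l1) * e1 + max(0.0, l2) * e2

x, y = np.array([1.0, -2.0, 0.5]), np.array([0.3, 1.1, -0.7])
gaps = [np.linalg.norm((x - P_mu(x - y, mu)) - (x - P_K(x - y)))
        for mu in (1.0, 0.1, 0.01)]
print(gaps)                            # gap -> 0 as mu -> 0
```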
Theorem 3.1.
(a) Let $P_{\mu}$ be defined by (5). Then $\nabla P_{\mu}(z)$ and $I - \nabla P_{\mu}(z)$ are positive definite for any $\mu > 0$ and $z \in \mathbb{R}^l$.
(b) Let $\Psi_{\mu}$ be defined as in (6). Then the smoothed merit function $\Psi_{\mu}$ is continuously differentiable everywhere with $\nabla \Psi_{\mu}(u) = \nabla H(u)\, H(u)$, where
\[
\nabla H(u) = \begin{bmatrix}
1 & 0 & 0 & -\Big(\frac{\partial P_{\mu}(z - g(x))}{\partial \mu}\Big)^T \\
0 & A^T & \nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x) & -\nabla_x P_{\mu}(z - g(x)) \\
0 & 0 & -A & 0 \\
0 & 0 & -\nabla g(x)^T & I - \nabla_z P_{\mu}(z - g(x))
\end{bmatrix}. \tag{7}
\]
Proof. From the proof of [16, Proposition 3.1], it is clear that $\nabla P_{\mu}(z)$ and $I - \nabla P_{\mu}(z)$ are positive definite for any $\mu > 0$ and $z \in \mathbb{R}^l$. With the help of the definition of the smoothed merit function $\Psi_{\mu}$, part (b) easily follows from the chain rule. □

In light of the main ideas for constructing artificial neural networks (see [6] for details), we establish a specific first-order ordinary differential equation, i.e., an artificial neural network. More specifically, based on the gradient of the objective function $\Psi_{\mu}$ in the minimization problem (6), we propose the following neural network for solving the KKT system (3) of the nonlinear SOCP (1):
\[
\frac{du(t)}{dt} = -\rho\, \nabla \Psi_{\mu}(u), \quad u(t_0) = u_0, \tag{8}
\]
where $\rho > 0$ is a time scaling factor. In fact, if $\tau = \rho t$, then $\frac{du(t)}{dt} = \rho \frac{du(\tau)}{d\tau}$. Hence, it follows from (8) that $\frac{du(\tau)}{d\tau} = -\nabla \Psi_{\mu}(u)$. In view of this, for simplicity and convenience, we set $\rho = 1$ in this paper. Indeed, the dynamic system (8) can be realized by an architecture with the cone projection function shown in Fig. 1. Moreover, the architecture of this artificial neural network is categorized as a "recurrent" neural network according to the classification of artificial neural networks in [6, Chapter 2.3.1]. The circuit for (8) requires $n + m + l + 1$ integrators, $n$ processors for $\nabla f(x)$, $l$ processors for $g(x)$, $ln$ processors for $\nabla g(x)$, $(l+1)n^2$ processors for $\nabla^2 f(x) + \sum_{i=1}^{l} \nabla^2 g_i(x)$, one processor for $\Phi_{\mu}$, one processor for $\frac{\partial P_{\mu}}{\partial \mu}$, $n$ processors for $\nabla_x P_{\mu}$, $l$ processors for $\nabla_z P_{\mu}$, $n^2 + 4mn + 3ln + l^2 + l$ connection weights, and some summers.
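The dynamics (8) are plain gradient flow on $\Psi_{\mu}$. As a minimal illustration (not the paper's circuit or Matlab code), the sketch below integrates $du/dt = -\nabla \Psi(u)$ by forward Euler for a toy affine map $H(u) = Bu - c$, so that $\Psi(u) = \frac{1}{2}\|Bu - c\|^2$ and the equilibrium is the root of $H$:

```python
import numpy as np

# Gradient flow du/dt = -grad Psi(u), Psi(u) = 0.5 * ||B u - c||^2,
# discretized by forward Euler; the equilibrium solves H(u) = B u - c = 0.
B = np.array([[2.0, 0.0],
              [1.0, 3.0]])
c = np.array([2.0, 5.0])
grad_psi = lambda u: B.T @ (B @ u - c)    # chain rule: grad Psi = B^T (B u - c)

u, h = np.zeros(2), 1e-2
for _ in range(5000):
    u = u - h * grad_psi(u)               # Euler discretization of (8), rho = 1
print(u, np.linalg.norm(B @ u - c))       # u -> (1, 4/3), residual -> 0
```

In the actual network (8), $H$ is the smoothed KKT map built above and $\mu$ is one of the state variables, but the integration principle is the same (the simulations in Section 5 use a Runge–Kutta integrator instead of Euler).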
4. Stability analysis
In this section, in order to study the stability of the proposed neural network (8) for solving problem (1), we first state an assumption that will be required in our subsequent analysis.

Assumption 4.1.
(a) Problem (1) satisfies Slater's condition.
(b) The matrix $\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x)$ is positive definite for each $x$.

Here we say a few words about Assumption 4.1(a) and (b). Slater's condition is a standard condition that is widely used in the optimization field. Assumption 4.1(b) seems stringent at first glance. In fact, since $f$ and the $g_i$'s are twice continuously differentiable and convex functions on $\mathbb{R}^n$, if at least one of these functions is strictly convex, then Assumption 4.1(b) is guaranteed.
Lemma 4.1.
(a) For any $u$, we have
\[
\|H(u) - H(u^*) - V(u - u^*)\| = o(\|u - u^*\|) \quad \text{for } u \to u^* \text{ and } V \in \partial H(u^*),
\]
where $\partial H(u)$ denotes the Clarke generalized Jacobian at $u$.
(b) Under Assumption 4.1, $\nabla H(u)^T$ is nonsingular for any $u = (\mu, x, y, z) \in \mathbb{R}_{++} \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$, where $\mathbb{R}_{++}$ denotes the set $\{\mu \mid \mu > 0\}$.
(c) Under Assumption 4.1 and with $V \in \partial P_0(w)$ being a positive definite matrix, where $\partial P_0(w)$ denotes the Clarke generalized Jacobian of the projection function $P$ at $w$, every
\[
T \in \partial H(u) = \left\{ \begin{bmatrix}
1 & 0 & 0 & -\Big(\frac{\partial P_{\mu}(z - g(x))}{\partial \mu}\Big)^T\Big|_{\mu = 0} \\
0 & A^T & \nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x) & \nabla g(x) V^T \\
0 & 0 & -A & 0 \\
0 & 0 & -\nabla g(x)^T & I - V
\end{bmatrix} \;\middle|\; V \in \partial P_0(w) \right\}
\]
is nonsingular for any $u = (0, x, y, z) \in \{0\} \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$.
(d) $\Psi_{\mu}(u(t))$ is nonincreasing with respect to $t$.
Proof. (a) This result follows directly from the definition of semismoothness of $H$; see [26] for more details.
(b) From the expression of $\nabla H(u)$ in Theorem 3.1, it follows that $\nabla H(u)^T$ is nonsingular if and only if the matrix
\[
M := \begin{bmatrix}
A & 0 & 0 \\
\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x) & -A^T & -\nabla g(x) \\
-\nabla_x P_{\mu}(z - g(x))^T & 0 & \big(I - \nabla_z P_{\mu}(z - g(x))\big)^T
\end{bmatrix}
\]
is nonsingular. Suppose $v = (x, y, z) \in \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$. To show the nonsingularity of $M$, it is enough to prove that
\[
Mv = 0 \implies x = 0,\; y = 0 \;\text{and}\; z = 0.
\]
Because $-\nabla_x P_{\mu}(z - g(x))^T = \nabla P_{\mu}(w)^T \nabla g(x)^T$, where $w = z - g(x) \in \mathbb{R}^l$, from $Mv = 0$ we have
\[
Ax = 0, \qquad \big(\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x)\big) x - A^T y - \nabla g(x) z = 0 \tag{9}
\]
and
\[
\nabla P_{\mu}(w)^T \nabla g(x)^T x + \big(I - \nabla P_{\mu}(w)\big)^T z = 0. \tag{10}
\]
From (9), it follows that
\[
x^T \big(\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x)\big) x - \big(\nabla g(x)^T x\big)^T z = 0. \tag{11}
\]
Moreover, Eq. (10) and Theorem 3.1 yield
\[
\nabla g(x)^T x = -\big(\nabla P_{\mu}(w)^T\big)^{-1} \big(I - \nabla P_{\mu}(w)\big)^T z. \tag{12}
\]
Combining (11) and (12) with Theorem 3.1, under Assumption 4.1, it is not hard to obtain that $x = 0$ and $z = 0$. By looking at Eq. (9) again, since $A$ has full row rank, we have $y = 0$. Therefore, $\nabla H(u)^T$ is nonsingular.
(c) The proof of part (c) is similar to that of part (b); the only change is to replace $\nabla P_{\mu}(w)$ with $V \in \partial P_0(w)$.
(d) According to the definition of $\Psi_{\mu}(u(t))$ and Eq. (8), it is clear that
\[
\frac{d\Psi_{\mu}(u(t))}{dt} = \nabla \Psi_{\mu}(u(t))^T \frac{du(t)}{dt} = -\rho\, \|\nabla \Psi_{\mu}(u(t))\|^2 \le 0.
\]
Consequently, $\Psi_{\mu}(u(t))$ is nonincreasing with respect to $t$. □

[Fig. 1. Block diagram of the proposed neural network with the smoothed NR function.]
Proposition 4.1. Assume that $\nabla H(u)$ is nonsingular for any $u \in \mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$. Then,
(a) $(x^*, y^*, z^*)$ satisfies the KKT conditions (3) if and only if $(0, x^*, y^*, z^*)$ is an equilibrium point of the neural network (8);
(b) under Slater's condition, $x^*$ is a solution to problem (1) if and only if $(0, x^*, y^*, z^*)$ is an equilibrium point of the neural network (8).

Proof. (a) Because $\Phi_0 = \Phi_{\mathrm{NR}}$ when $\mu = 0$, it follows that $(x^*, y^*, z^*)$ satisfies the KKT conditions (3) if and only if $H(u^*) = 0$, where $u^* = (0, x^*, y^*, z^*)^T$. Since $\nabla H(u)$ is nonsingular, we have that $H(u^*) = 0$ if and only if $\nabla \Psi_{\mu}(u^*) = \nabla H(u^*)\, H(u^*) = 0$. Thus, the desired result follows.
(b) Under Slater's condition, it is well known that $x^*$ is a solution of problem (1) if and only if there exist $y^*$ and $z^*$ such that $(x^*, y^*, z^*)$ satisfies the KKT conditions (3). Hence, according to part (a), this holds if and only if $(0, x^*, y^*, z^*)$ is an equilibrium point of the neural network (8). □
The next result addresses the existence and uniqueness of the solution trajectory of the neural network (8).

Theorem 4.1.
(a) For any initial point $u_0 = u(t_0)$, there exists a unique continuous maximal solution $u(t)$, $t \in [t_0, \tau)$, for the neural network (8), where $[t_0, \tau)$ is the maximal interval of existence.
(b) If the level set $\mathcal{L}(u_0) := \{u \mid \Psi_{\mu}(u) \le \Psi_{\mu}(u_0)\}$ is bounded, then $\tau$ can be extended to $+\infty$.

Proof. The proof is exactly the same as that of [27, Proposition 3.4] and is therefore omitted here. □
Theorem 4.2. Assume that $\nabla H(u)$ is nonsingular and that $u^*$ is an isolated equilibrium point of the neural network (8). Then the solution of the neural network (8) with any initial point $u_0$ is Lyapunov stable.

Proof. From Lemma 2.3, we only need to argue that there exists a Lyapunov function over some neighborhood $\Omega$ of $u^*$. Now, we consider the smoothed merit function
\[
\Psi_{\mu}(u) = \frac{1}{2}\|H(u)\|^2.
\]
Since $u^*$ is an isolated equilibrium point of (8), there is a neighborhood $\Omega$ of $u^*$ such that
\[
\nabla \Psi_{\mu}(u^*) = 0 \quad \text{and} \quad \nabla \Psi_{\mu}(u(t)) \ne 0, \quad \forall u(t) \in \Omega \setminus \{u^*\}.
\]
By the nonsingularity of $\nabla H(u)$ and the definition of $\Psi_{\mu}$, it is easy to obtain that $\Psi_{\mu}(u^*) = 0$. From the definition of $\Psi_{\mu}$, we claim that $\Psi_{\mu}(u(t)) > 0$ for any $u(t) \in \Omega \setminus \{u^*\}$. Suppose not, namely, $\Psi_{\mu}(u(t)) = 0$. It follows that $H(u(t)) = 0$, and hence $\nabla \Psi_{\mu}(u(t)) = 0$, which contradicts the assumption that $u^*$ is an isolated equilibrium point of (8). Thus, $\Psi_{\mu}(u(t)) > 0$ for any $u(t) \in \Omega \setminus \{u^*\}$. Furthermore, by the proof of Lemma 4.1(d), we know that for any $u(t) \in \Omega$,
\[
\frac{d\Psi_{\mu}(u(t))}{dt} = \nabla \Psi_{\mu}(u(t))^T \frac{du(t)}{dt} = -\rho\, \|\nabla \Psi_{\mu}(u(t))\|^2 \le 0. \tag{13}
\]
Consequently, the function $\Psi_{\mu}$ is a Lyapunov function over $\Omega$. This implies that $u^*$ is Lyapunov stable for the neural network (8). □
Theorem 4.3. Assume that $\nabla H(u)$ is nonsingular and that $u^*$ is an isolated equilibrium point of the neural network (8). Then $u^*$ is asymptotically stable for the neural network (8).

Proof. From the proof of Theorem 4.2, we consider again the Lyapunov function $\Psi_{\mu}$. By Lemma 2.3 again, we only need to verify that the Lyapunov function $\Psi_{\mu}$ over some neighborhood $\Omega$ of $u^*$ satisfies
\[
\frac{d\Psi_{\mu}(u(t))}{dt} < 0, \quad \forall u(t) \in \Omega \setminus \{u^*\}. \tag{14}
\]
In fact, by using (13) and the definition of an isolated equilibrium point, it is not hard to check that (14) is true. Hence, $u^*$ is asymptotically stable. □
Theorem 4.4. Assume that $u^*$ is an isolated equilibrium point of the neural network (8). If $\nabla H(u)^T$ is nonsingular for any $u = (\mu, x, y, z) \in \mathbb{R}_+ \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^l$, then $u^*$ is exponentially stable for the neural network (8).

Proof. From the definition of $H(u)$, we know that $H$ is semismooth. Hence, by Lemma 4.1, we have
\[
H(u) = H(u^*) + \nabla H(u(t))^T (u - u^*) + o(\|u - u^*\|), \quad \forall u \in \Omega \setminus \{u^*\}, \tag{15}
\]
where $\nabla H(u(t))^T \in \partial H(u(t))$ and $\Omega$ is a neighborhood of $u^*$. Now, we let
\[
g(u(t)) = \|u(t) - u^*\|^2, \quad t \in [t_0, \infty).
\]
Then, we have
\[
\frac{dg(u(t))}{dt} = 2\big(u(t) - u^*\big)^T \frac{du(t)}{dt} = -2\rho\, \big(u(t) - u^*\big)^T \nabla \Psi_{\mu}(u(t)) = -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u(t))\, H(u(t)). \tag{16}
\]
Substituting Eq. (15) into Eq. (16) and using $H(u^*) = 0$ yields
\[
\frac{dg(u(t))}{dt} = -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u(t)) \Big( \nabla H(u(t))^T \big(u(t) - u^*\big) + o(\|u(t) - u^*\|) \Big)
= -2\rho\, \big(u(t) - u^*\big)^T \nabla H(u(t)) \nabla H(u(t))^T \big(u(t) - u^*\big) + o(\|u(t) - u^*\|^2).
\]
Because $\nabla H(u)$ and $\nabla H(u)^T$ are nonsingular, we claim that there exists a $\kappa > 0$ such that
\[
\big(u(t) - u^*\big)^T \nabla H(u(t)) \nabla H(u(t))^T \big(u(t) - u^*\big) \ge \kappa\, \|u(t) - u^*\|^2. \tag{17}
\]
Otherwise, if $\big(u(t) - u^*\big)^T \nabla H(u(t)) \nabla H(u(t))^T \big(u(t) - u^*\big) = 0$, it implies that $\nabla H(u(t))^T \big(u(t) - u^*\big) = 0$. Then, from the nonsingularity of $\nabla H(u(t))^T$, we have $u(t) - u^* = 0$, i.e., $u(t) = u^*$, which contradicts the assumption that $u^*$ is an isolated equilibrium point. Consequently, there exists a $\kappa > 0$ such that (17) holds. Moreover, for the term $o(\|u(t) - u^*\|^2)$, there is an $\varepsilon > 0$ such that $|o(\|u(t) - u^*\|^2)| \le \varepsilon\, \|u(t) - u^*\|^2$. Hence,
\[
\frac{dg(u(t))}{dt} \le (-2\rho\kappa + \varepsilon)\, \|u(t) - u^*\|^2 = (-2\rho\kappa + \varepsilon)\, g(u(t)).
\]
This implies
\[
g(u(t)) \le e^{(-2\rho\kappa + \varepsilon)t}\, g(u(t_0)),
\]
which means
\[
\|u(t) - u^*\| \le e^{(-\rho\kappa + \varepsilon/2)t}\, \|u(t_0) - u^*\|.
\]
Thus, $u^*$ is exponentially stable for the neural network (8). □
To show the contribution of this paper, we present in Table 1 the stability comparisons of the neural networks considered in the current paper and in [20,27]. More convergence comparisons will be presented in the next section. Generally speaking, we establish three kinds of stability for the proposed neural network, whereas not all three are guaranteed for the similar neural networks studied in [20,27]. Why do we choose to investigate the proposed neural network? Indeed, in [20], two neural networks based on the NR function and the FB function are considered, and neither reaches exponential stability. Our target optimization problem is a wider class than the one studied in [20]. In contrast, the smoothed FB function has good performance, as shown in [27], but not all three stabilities are established there, even though exponential stability is good enough. In light of these observations, we decided to look into the smoothed NR function for our problem, which turns out to yield better theoretical results. We summarize the differences in problem format, dynamical model, and stability in Table 1.
5. Numerical examples
In order to demonstrate the effectiveness of the proposed neural network, in this section we test several examples with our neural network (8). The numerical implementation is coded in Matlab 7.0, and the ordinary differential equation solver adopted here is ode23, which uses the Runge–Kutta (2,3) formula. As mentioned earlier, the parameter $\rho$ is set to 1. How is $\mu$ chosen initially? From Theorem 4.2 in the previous section, we know the solution converges from any initial point, so we set the initial $\mu = 1$ in the codes (and of course $\mu \to 0$, as seen in the trajectory behavior).

Example 5.1. Consider the following nonlinear convex programming problem:
\[
\begin{array}{ll}
\min & e^{(x_1-3)^2 + x_2^2 + (x_3-1)^2 + (x_4-2)^2 + (x_5+1)^2} \\
\text{s.t.} & x \in \mathcal{K}^5.
\end{array}
\]
Here, we denote $f(x) := e^{(x_1-3)^2 + x_2^2 + (x_3-1)^2 + (x_4-2)^2 + (x_5+1)^2}$ and $g(x) = x$. Then, we compute
\[
L(x, z) = \nabla f(x) - \nabla g(x) z = 2 f(x) \begin{bmatrix} x_1 - 3 \\ x_2 \\ x_3 - 1 \\ x_4 - 2 \\ x_5 + 1 \end{bmatrix} - \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \end{bmatrix}. \tag{18}
\]
Moreover, let $x := (x_1, \bar{x}) \in \mathbb{R} \times \mathbb{R}^4$ and $z := (z_1, \bar{z}) \in \mathbb{R} \times \mathbb{R}^4$. Then the element $z - x$ can be expressed as
\[
z - x := \lambda_1 e_1 + \lambda_2 e_2,
\]
where $\lambda_i = z_1 - x_1 + (-1)^i \|\bar{z} - \bar{x}\|$ and $e_i = \frac{1}{2}\big(1, (-1)^i \frac{\bar{z} - \bar{x}}{\|\bar{z} - \bar{x}\|}\big)$ ($i = 1, 2$) if $\bar{z} - \bar{x} \ne 0$; otherwise $e_i = \frac{1}{2}(1, (-1)^i w)$ with $w$ being any vector in $\mathbb{R}^4$ satisfying $\|w\| = 1$. This implies that
\[
\Phi_{\mu}(z, g(x)) = z - P_{\mu}(z - g(x)) = z - \Big( \mu\, \hat{g}\Big(\frac{\lambda_1}{\mu}\Big) e_1 + \mu\, \hat{g}\Big(\frac{\lambda_2}{\mu}\Big) e_2 \Big) \tag{19}
\]
with $\hat{g}(\alpha) = \frac{\sqrt{\alpha^2+4}+\alpha}{2}$ or $\hat{g}(\alpha) = \ln(e^{\alpha}+1)$. Therefore, by Eqs. (18) and (19), we obtain the expression of $H(u)$ as follows:
\[
H(u) = \begin{bmatrix} \mu \\ L(x, z) \\ \Phi_{\mu}(z, g(x)) \end{bmatrix}.
\]
Table 1
Stability comparisons of the neural networks considered in the current paper and in [20,27].

|           | Current paper | [20] | [27] |
| Problem   | $\min f(x)$ s.t. $Ax = b$, $g(x) \in \mathcal{K}$ | $\min f(x)$ s.t. $Ax = b$, $x \in \mathcal{K}$ | $\langle F(x), y - x \rangle \ge 0$, $\forall y \in C$, $C = \{x \mid h(x) = 0,\; g(x) \in \mathcal{K}\}$ |
| ODE       | Based on the smoothed NR function | Based on the NR function and FB function | Based on the NR function and smoothed FB function |
| Stability | Lyapunov, asymptotic, exponential (smoothed NR) | Lyapunov (NR); Lyapunov, asymptotic (FB) | Lyapunov, asymptotic (NR); exponential (smoothed FB) |
This problem has an optimal solution $x^* = (3, 0, 1, 2, -1)^T$. We use the proposed neural network to solve the above problem; the trajectories are depicted in Fig. 2. All simulation results show that the state trajectories from any initial point always converge to the optimal solution $x^*$.
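Since $x^* = (3, 0, 1, 2, -1)^T$ lies in the interior of $\mathcal{K}^5$ and is the unconstrained minimizer of $f$, the reported optimum can be cross-checked with a generic constrained solver, independently of the neural network. The sketch below (illustrative, not the paper's code) uses SciPy's SLSQP method:

```python
import numpy as np
from scipy.optimize import minimize

# Example 5.1: minimize exp((x1-3)^2 + x2^2 + (x3-1)^2 + (x4-2)^2 + (x5+1)^2)
# subject to x in K^5, i.e. x1 >= ||(x2, x3, x4, x5)||.
f = lambda x: np.exp((x[0] - 3)**2 + x[1]**2 + (x[2] - 1)**2
                     + (x[3] - 2)**2 + (x[4] + 1)**2)
soc = {"type": "ineq", "fun": lambda x: x[0] - np.linalg.norm(x[1:])}

res = minimize(f, x0=np.ones(5), constraints=[soc], method="SLSQP")
print(res.x)   # close to the reported optimum x* = (3, 0, 1, 2, -1)
```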
Example 5.2. Consider the following nonlinear second-order cone programming problem:
\[
\begin{array}{ll}
\min & f(x) = x_1^2 + 2x_2^2 + 2x_1 x_2 - 10x_1 - 12x_2 \\
\text{s.t.} & g(x) = \begin{bmatrix} 8 - x_1 + 3x_2 \\ 3 - x_1^2 - 2x_1 + 2x_2 - x_2^2 \end{bmatrix} \in \mathcal{K}^2.
\end{array}
\]
For this example, we compute
\[
L(x, z) = \nabla f(x) - \nabla g(x) z = \begin{bmatrix} 2x_1 + 2x_2 - 10 + z_1 + 2(x_1 + 1) z_2 \\ 2x_1 + 4x_2 - 12 - 3 z_1 + 2(x_2 - 1) z_2 \end{bmatrix}. \tag{20}
\]
Since
\[
z - g(x) = \begin{bmatrix} z_1 - 8 + x_1 - 3x_2 \\ z_2 - 3 + x_1^2 + 2x_1 - 2x_2 + x_2^2 \end{bmatrix},
\]
the vector $z - g(x)$ can be expressed as
\[
z - g(x) := \lambda_1 e_1 + \lambda_2 e_2,
\]
where $\lambda_i = z_1 - 8 + x_1 - 3x_2 + (-1)^i\, |z_2 - 3 + x_1^2 + 2x_1 - 2x_2 + x_2^2|$ and
\[
e_i = \frac{1}{2}\Big(1, (-1)^i \frac{z_2 - 3 + x_1^2 + 2x_1 - 2x_2 + x_2^2}{|z_2 - 3 + x_1^2 + 2x_1 - 2x_2 + x_2^2|}\Big) \quad (i = 1, 2), \quad \text{if } z_2 - 3 + x_1^2 + 2x_1 - 2x_2 + x_2^2 \ne 0;
\]
otherwise, $e_i = \frac{1}{2}(1, (-1)^i w)$ with $w$ being any element in $\mathbb{R}$ satisfying $|w| = 1$. This implies that
\[
\Phi_{\mu}(z, g(x)) = z - P_{\mu}(z - g(x)) = z - \Big( \mu\, \hat{g}\Big(\frac{\lambda_1}{\mu}\Big) e_1 + \mu\, \hat{g}\Big(\frac{\lambda_2}{\mu}\Big) e_2 \Big) \tag{21}
\]
with $\hat{g}(\alpha) = \frac{\sqrt{\alpha^2+4}+\alpha}{2}$ or $\hat{g}(\alpha) = \ln(e^{\alpha}+1)$. Therefore, by (20) and (21), we obtain the expression of $H(u)$ as follows:
\[
H(u) = \begin{bmatrix} \mu \\ L(x, z) \\ \Phi_{\mu}(z, g(x)) \end{bmatrix}.
\]
This problem has an approximate solution $x^* = (2.8308, 1.6375)^T$. Note that the objective function is convex and the Hessian matrix $\nabla^2 f(x)$ is positive definite. Using the proposed neural network, we can easily obtain the approximate solution $x^*$ of the above problem; see Fig. 3.
[Fig. 2. Transient behavior of the neural network with the smoothed NR function in Example 5.1: trajectories of x(t) versus time (ms).]
Example 5.3. Consider the following nonlinear convex program with second-order cone constraints [18]:
\[
\begin{array}{ll}
\min & e^{x_1 - x_3} + 3(2x_1 - x_2)^4 + \sqrt{1 + (3x_2 + 5x_3)^2} \\
\text{s.t.} & g(x) = Ax + b \in \mathcal{K}^2, \quad x \in \mathcal{K}^3,
\end{array}
\]
where
\[
A := \begin{bmatrix} 4 & 6 & 3 \\ -1 & 7 & -5 \end{bmatrix}, \qquad b := \begin{bmatrix} -1 \\ 2 \end{bmatrix}.
\]
For this example, $f(x) := e^{x_1 - x_3} + 3(2x_1 - x_2)^4 + \sqrt{1 + (3x_2 + 5x_3)^2}$, from which we have
\[
L(x, y, z) = \nabla f(x) - \nabla g(x) y - z =
\begin{bmatrix}
e^{x_1 - x_3} + 24(2x_1 - x_2)^3 \\[2pt]
-12(2x_1 - x_2)^3 + \frac{3(3x_2 + 5x_3)}{\sqrt{1 + (3x_2 + 5x_3)^2}} \\[2pt]
-e^{x_1 - x_3} + \frac{5(3x_2 + 5x_3)}{\sqrt{1 + (3x_2 + 5x_3)^2}}
\end{bmatrix}
- \begin{bmatrix} 4y_1 - y_2 \\ 6y_1 + 7y_2 \\ 3y_1 - 5y_2 \end{bmatrix}
- \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}. \tag{22}
\]
Since
\[
\begin{bmatrix} y - g(x) \\ z - x \end{bmatrix}
= \begin{bmatrix}
y_1 - 4x_1 - 6x_2 - 3x_3 + 1 \\
y_2 + x_1 - 7x_2 + 5x_3 - 2 \\
z_1 - x_1 \\
z_2 - x_2 \\
z_3 - x_3
\end{bmatrix},
\]
$y - g(x)$ and $z - x$ can be expressed as follows, respectively:
\[
y - g(x) := \lambda_1 e_1 + \lambda_2 e_2
\quad \text{and} \quad
z - x := \kappa_1 f_1 + \kappa_2 f_2,
\]
where $\lambda_i = y_1 - 4x_1 - 6x_2 - 3x_3 + 1 + (-1)^i\, |y_2 + x_1 - 7x_2 + 5x_3 - 2|$ and
\[
e_i = \frac{1}{2}\Big(1, (-1)^i \frac{y_2 + x_1 - 7x_2 + 5x_3 - 2}{|y_2 + x_1 - 7x_2 + 5x_3 - 2|}\Big) \quad (i = 1, 2) \quad \text{if } y_2 + x_1 - 7x_2 + 5x_3 - 2 \ne 0;
\]
otherwise, $e_i = \frac{1}{2}(1, (-1)^i w)$ with $w$ being any element in $\mathbb{R}$ satisfying $|w| = 1$. Moreover, let $x := (x_1, \bar{x}) \in \mathbb{R} \times \mathbb{R}^2$ and $z := (z_1, \bar{z}) \in \mathbb{R} \times \mathbb{R}^2$. Then, we obtain that $\kappa_i = z_1 - x_1 + (-1)^i \|\bar{z} - \bar{x}\|$ and $f_i = \frac{1}{2}\big(1, (-1)^i \frac{\bar{z} - \bar{x}}{\|\bar{z} - \bar{x}\|}\big)$ ($i = 1, 2$) if $\bar{z} - \bar{x} \ne 0$; otherwise $f_i = \frac{1}{2}(1, (-1)^i t)$ with $t$ being any vector in $\mathbb{R}^2$ satisfying $\|t\| = 1$. This implies that
\[
\Phi_{\mu}(\cdot) = \begin{bmatrix} y - P_{\mu}(y - g(x)) \\ z - P_{\mu}(z - x) \end{bmatrix}
= \begin{bmatrix}
y - \Big( \mu\, \hat{g}\big(\frac{\lambda_1}{\mu}\big) e_1 + \mu\, \hat{g}\big(\frac{\lambda_2}{\mu}\big) e_2 \Big) \\[2pt]
z - \Big( \mu\, \hat{g}\big(\frac{\kappa_1}{\mu}\big) f_1 + \mu\, \hat{g}\big(\frac{\kappa_2}{\mu}\big) f_2 \Big)
\end{bmatrix} \tag{23}
\]
with $\hat{g}(\alpha) = \frac{\sqrt{\alpha^2+4}+\alpha}{2}$ or $\hat{g}(\alpha) = \ln(e^{\alpha}+1)$. Therefore, by (22) and (23), we obtain the expression of $H(u)$ as below:
\[
H(u) = \begin{bmatrix} \mu \\ L(x, y, z) \\ \Phi_{\mu}(\cdot) \end{bmatrix}.
\]

[Fig. 3. Transient behavior of the neural network with the smoothed NR function in Example 5.2: trajectories of x(t) versus time (ms).]
The approximate solution to this problem is $x^* = (0.2324, -0.07309, 0.2206)^T$. The trajectories are depicted in Fig. 4. We want to point out one thing: Assumption 4.1(a) and (b) are not both satisfied in this example. More specifically, Assumption 4.1(a) is satisfied, which is obvious. However, $\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x)$ is not positive definite for each $x$. To see this, we compute
\[
\nabla^2 f(x) + \nabla^2 g_1(x) + \cdots + \nabla^2 g_l(x) = \nabla^2 f(x) =
\begin{bmatrix}
e^{x_1 - x_3} + 144(2x_1 - x_2)^2 & -72(2x_1 - x_2)^2 & -e^{x_1 - x_3} \\[2pt]
-72(2x_1 - x_2)^2 & 36(2x_1 - x_2)^2 + \frac{9}{(1 + (3x_2 + 5x_3)^2)^{3/2}} & \frac{15}{(1 + (3x_2 + 5x_3)^2)^{3/2}} \\[2pt]
-e^{x_1 - x_3} & \frac{15}{(1 + (3x_2 + 5x_3)^2)^{3/2}} & e^{x_1 - x_3} + \frac{25}{(1 + (3x_2 + 5x_3)^2)^{3/2}}
\end{bmatrix},
\]
which is not positive definite when $2x_1 - x_2 = 0$ (because the determinant then equals zero). Hence, $\nabla H(u)$ is not guaranteed to be nonsingular, and the theorems in Section 4 do not apply to this example. Nonetheless, the solution trajectory does converge, as depicted in Fig. 4. This phenomenon also occurs when the problem is solved by the second neural network studied in [27] (the stability is not guaranteed theoretically, but the solution trajectory does converge).
In addition, for Example 5.3, we also compare three neural networks, based on the FB function (considered in [20]), the smoothed NR function (considered in this paper), and the smoothed FB function (considered in [27]), respectively. Although Example 5.3 can be solved by all three neural networks, the neural network based on the FB function does not behave as well as the other two; see Fig. 5.
Example 5.4. Consider the following nonlinear second-order cone programming problem:
$$
\begin{aligned}
\min\ \ & f(x) = e^{x_1 x_3} + 3(x_1 + x_2)^2 - \sqrt{1 + (2x_2 - x_3)^2} + \tfrac{1}{2}x_4^2 + \tfrac{1}{2}x_5^2 \\
\text{s.t.}\ \ & h(x) = 24.51 x_1 + 58 x_2 - 16.67 x_3 - x_4 - 3 x_5 + 11 = 0, \\
& g_1(x) = \begin{pmatrix} 3x_1^3 + 2x_2 - x_3 + 5x_3^2 \\ -5x_1^3 + 4x_2 - 2x_3 + 10x_3^3 \\ x_3 \end{pmatrix} \in \mathcal{K}^3, \\
& g_2(x) = \begin{pmatrix} x_4 \\ 3x_5 \end{pmatrix} \in \mathcal{K}^2.
\end{aligned}
$$
Fig. 4. Transient behavior of the neural network with the smoothed NR function in Example 5.3.
For this example, we compute
$$
L(x,y,z) = \nabla f(x) - \nabla g_1(x)y - \nabla g_2(x)z =
\begin{bmatrix}
x_3 e^{x_1 x_3} + 6(x_1 + x_2) \\[4pt]
6(x_1 + x_2) - \dfrac{2(2x_2 - x_3)}{\sqrt{1 + (2x_2 - x_3)^2}} \\[4pt]
x_1 e^{x_1 x_3} + \dfrac{2x_2 - x_3}{\sqrt{1 + (2x_2 - x_3)^2}} \\[4pt]
x_4 \\
x_5
\end{bmatrix}
-
\begin{bmatrix}
9x_1^2 y_1 - 15x_1^2 y_2 \\[4pt]
2y_1 + 4y_2 \\[4pt]
(10x_3 - 1)y_1 + (30x_3^2 - 2)y_2 + y_3 \\[4pt]
z_1 \\
3z_2
\end{bmatrix}. \qquad (24)
$$
Moreover, we know
$$
\begin{bmatrix} y - g_1(x) \\ z - g_2(x) \end{bmatrix} =
\begin{bmatrix}
y_1 - 3x_1^3 - 2x_2 + x_3 - 5x_3^2 \\
y_2 + 5x_1^3 - 4x_2 + 2x_3 - 10x_3^3 \\
y_3 - x_3 \\
z_1 - x_4 \\
z_2 - 3x_5
\end{bmatrix}.
$$
Let $y - g_1(x) := (\bar{u}, u) \in \mathbb{R} \times \mathbb{R}^2$ and $z - g_2(x) := (\bar{v}, v) \in \mathbb{R} \times \mathbb{R}$, where $\bar{u} = y_1 - 3x_1^3 - 2x_2 + x_3 - 5x_3^2$,
$$
u = \begin{bmatrix} y_2 + 5x_1^3 - 4x_2 + 2x_3 - 10x_3^3 \\ y_3 - x_3 \end{bmatrix},
$$
$\bar{v} = z_1 - x_4$, and $v = z_2 - 3x_5$. Then $y - g_1(x)$ and $z - g_2(x)$ can be expressed as follows:
$$
y - g_1(x) := \lambda_1 e_1 + \lambda_2 e_2 \quad \text{and} \quad z - g_2(x) := \kappa_1 f_1 + \kappa_2 f_2,
$$
where $\lambda_i = \bar{u} + (-1)^i \|u\|$, $e_i = \tfrac{1}{2}\left(1, (-1)^i \tfrac{u}{\|u\|}\right)$, $\kappa_i = \bar{v} + (-1)^i |v|$, and $f_i = \tfrac{1}{2}\left(1, (-1)^i \tfrac{v}{|v|}\right)$ ($i = 1, 2$) if $u \neq 0$ and $v \neq 0$; otherwise, $e_i = \tfrac{1}{2}(1, (-1)^i w)$ with $w$ being any element in $\mathbb{R}^2$ satisfying $\|w\| = 1$, and $f_i = \tfrac{1}{2}(1, (-1)^i t)$ with $t$ being any element in $\mathbb{R}$ satisfying $|t| = 1$. This implies that
$$
\Phi_\mu(\cdot) = \begin{bmatrix} y - P_\mu(y - g_1(x)) \\ z - P_\mu(z - g_2(x)) \end{bmatrix}
= \begin{bmatrix} y - \mu\left(\hat{g}\!\left(\tfrac{\lambda_1}{\mu}\right)e_1 + \hat{g}\!\left(\tfrac{\lambda_2}{\mu}\right)e_2\right) \\[4pt] z - \mu\left(\hat{g}\!\left(\tfrac{\kappa_1}{\mu}\right)f_1 + \hat{g}\!\left(\tfrac{\kappa_2}{\mu}\right)f_2\right) \end{bmatrix} \qquad (25)
$$
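For completeness, the spectral factorization used here for $y - g_1(x)$ and $z - g_2(x)$ can be sketched as follows (Python/NumPy; the function name is ours):

```python
import numpy as np

def spectral_decomposition(z):
    """Spectral factorization z = lam1*e1 + lam2*e2 with respect to the
    second-order cone, as used above for y - g1(x) and z - g2(x)."""
    z1, z2 = z[0], z[1:]
    nz = np.linalg.norm(z2)
    if nz > 0:
        w = z2 / nz
    else:
        w = np.zeros_like(z2)
        if w.size:
            w[0] = 1.0  # any unit vector may be used when z2 = 0
    lams = np.array([z1 - nz, z1 + nz])
    e1 = 0.5 * np.concatenate(([1.0], -w))
    e2 = 0.5 * np.concatenate(([1.0], w))
    return lams, e1, e2

z = np.array([1.0, 3.0, -4.0])
lams, e1, e2 = spectral_decomposition(z)
# The factors recompose z; z lies in K^3 iff both spectral values are >= 0.
assert np.allclose(lams[0] * e1 + lams[1] * e2, z)
print(lams)  # prints "[-4.  6.]" -> this z is outside K^3
```

The two spectral vectors $e_1, e_2$ always sum to $(1, 0)$ and are orthogonal, which is what makes the componentwise application of $\hat{g}$ in (25) well defined.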
Fig. 5. Comparisons of the three neural networks based on the FB function, smoothed NR function, and smoothed FB function in Example 5.3.
with $\hat{g}(a) = \frac{\sqrt{a^2+4}+a}{2}$ or $\hat{g}(a) = \ln(e^a + 1)$. Consequently, by Eqs. (24) and (25), we obtain the expression of $H(u)$ as follows:
$$
H(u) = \begin{bmatrix} \mu \\ L(x,y,z) \\ \Phi_\mu(\cdot) \end{bmatrix}.
$$
This problem has an approximate solution $x^* = (0.0903, 0.0449, 0.6366, 0.0001, 0)^T$, and Fig. 6 displays the trajectories obtained by using the proposed neural network. All simulation results show that the state trajectory from any initial point always converges to the solution $x^*$. As observed, the neural network with the smoothed NR function has a fast convergence rate.
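Although the network (8) itself is defined earlier in the paper, the qualitative behavior reported here can be illustrated with a generic gradient-flow sketch: integrate $\dot{u} = -\rho \nabla \Psi(u)$ with merit function $\Psi(u) = \tfrac{1}{2}\|H(u)\|^2$ by the explicit Euler method. The map `H` below is a small hypothetical example, not one of the paper's test problems:

```python
import numpy as np

def H(u):
    # Hypothetical smooth map with root u* = (0.5, 0.5); for illustration only.
    return np.array([u[0] + u[1] - 1.0, u[0] - u[1]])

def grad_Psi(u, h=1e-6):
    # Forward-difference gradient of the merit function Psi(u) = 0.5*||H(u)||^2.
    base = 0.5 * np.dot(H(u), H(u))
    g = np.zeros_like(u)
    for i in range(u.size):
        up = u.copy()
        up[i] += h
        g[i] = (0.5 * np.dot(H(up), H(up)) - base) / h
    return g

u = np.array([2.0, -1.0])
rho, dt = 1.0, 0.05
for _ in range(500):    # explicit Euler integration of du/dt = -rho * grad Psi(u)
    u = u - dt * rho * grad_Psi(u)
print(np.round(u, 3))   # -> approximately [0.5 0.5]
```

The trajectory decays toward the root of $H$ at a rate governed by $\rho$ and the curvature of $\Psi$, which mirrors the exponential-stability behavior observed in the examples.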
Furthermore, we also compare the two neural networks based on the smoothed NR function (considered in this paper) and the smoothed FB function (considered in [27]) for Example 5.4. Note that Example 5.4 cannot be solved by the neural networks studied in [20]. Both neural networks possess exponential stability, as shown in Table 1, which means the solution trajectories have the same order of convergence. This phenomenon is reflected in Fig. 7.
6. Conclusion
In this paper, we have studied a neural network approach for solving general nonlinear convex programs with second-order cone constraints. The proposed neural network is based on the gradient of the merit function derived from the smoothed
Fig. 6. Transient behavior of the neural network with the smoothed NR function in Example 5.4.
Fig. 7. Comparisons of the two neural networks based on the smoothed NR function and smoothed FB function in Example 5.4.
NR complementarity function. In particular, from Definition 2.1 and Lemma 2.3, we know that there exists a stable equilibrium point $u^*$ as long as there exists a Lyapunov function over some neighborhood of $u^*$, and this stable equilibrium point $u^*$ is exactly the solution of the problem under consideration. In addition to studying the existence and convergence of the solution trajectory of the neural network, this paper shows that the merit function is a Lyapunov function. Furthermore, the equilibrium point of the neural network (8) is stable, including stability in the sense of Lyapunov, asymptotic stability, and exponential stability, under suitable conditions.
Indeed, this paper can be viewed as a follow-up of [20,27] because we establish three types of stability for the proposed neural network, whereas not all three are guaranteed for the similar neural networks studied in [20,27]. The numerical experiments presented in this study demonstrate the efficiency of the proposed neural network.
Acknowledgment
The authors are grateful to the reviewers for their valuable suggestions, which have considerably improved the paper.
Appendix A
For the function $P_\mu(z)$ defined as in (5), the following lemma provides the gradient matrix of $P_\mu(z)$, which will be used in numerical computation and coding.
Lemma 6.1. For any $z = (z_1, z_2^T)^T \in \mathbb{R}^n$, the gradient matrix of $P_\mu(z)$ is written as
$$
\nabla P_\mu(z) =
\begin{cases}
\hat{g}'\!\left(\dfrac{z_1}{\mu}\right) I & \text{if } z_2 = 0, \\[8pt]
\begin{bmatrix} b_\mu & c_\mu \dfrac{z_2^T}{\|z_2\|} \\[6pt] c_\mu \dfrac{z_2}{\|z_2\|} & a_\mu I + (b_\mu - a_\mu)\dfrac{z_2 z_2^T}{\|z_2\|^2} \end{bmatrix} & \text{if } z_2 \neq 0,
\end{cases}
$$
where
$$
a_\mu = \frac{\hat{g}\!\left(\tfrac{\lambda_2}{\mu}\right) - \hat{g}\!\left(\tfrac{\lambda_1}{\mu}\right)}{\tfrac{\lambda_2 - \lambda_1}{\mu}}, \qquad
b_\mu = \frac{1}{2}\left(\hat{g}'\!\left(\tfrac{\lambda_2}{\mu}\right) + \hat{g}'\!\left(\tfrac{\lambda_1}{\mu}\right)\right), \qquad
c_\mu = \frac{1}{2}\left(\hat{g}'\!\left(\tfrac{\lambda_2}{\mu}\right) - \hat{g}'\!\left(\tfrac{\lambda_1}{\mu}\right)\right),
$$
$\lambda_i = z_1 + (-1)^i \|z_2\|$ ($i = 1, 2$) are the spectral values of $z$, and $I$ denotes the identity matrix.
Proof. See Proposition 5.2 in [11]. □
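Since the lemma is intended for numerical computation and coding, a direct implementation can be sketched as follows (Python/NumPy, with $\hat{g}(a) = \frac{\sqrt{a^2+4}+a}{2}$; the function names are ours). The analytic gradient is checked against central finite differences of $P_\mu$:

```python
import numpy as np

def g_hat(a):
    return (np.sqrt(a**2 + 4.0) + a) / 2.0

def g_hat_prime(a):
    # Derivative of g_hat(a) = (sqrt(a^2 + 4) + a) / 2.
    return 0.5 * (a / np.sqrt(a**2 + 4.0) + 1.0)

def P_mu(z, mu):
    # Smoothed projection onto the SOC via the spectral decomposition of z.
    z1, z2 = z[0], z[1:]
    nz = np.linalg.norm(z2)
    w = z2 / nz if nz > 0 else np.zeros_like(z2)
    lam1, lam2 = z1 - nz, z1 + nz
    e1 = 0.5 * np.concatenate(([1.0], -w))
    e2 = 0.5 * np.concatenate(([1.0], w))
    return mu * (g_hat(lam1 / mu) * e1 + g_hat(lam2 / mu) * e2)

def grad_P_mu(z, mu):
    # Gradient matrix of P_mu(z) as stated in Lemma 6.1.
    z1, z2 = z[0], z[1:]
    n = z.size
    nz = np.linalg.norm(z2)
    if nz == 0.0:
        return g_hat_prime(z1 / mu) * np.eye(n)
    lam1, lam2 = z1 - nz, z1 + nz
    a = (g_hat(lam2 / mu) - g_hat(lam1 / mu)) / ((lam2 - lam1) / mu)
    b = 0.5 * (g_hat_prime(lam2 / mu) + g_hat_prime(lam1 / mu))
    c = 0.5 * (g_hat_prime(lam2 / mu) - g_hat_prime(lam1 / mu))
    w = z2 / nz
    M = np.empty((n, n))
    M[0, 0] = b
    M[0, 1:] = c * w
    M[1:, 0] = c * w
    M[1:, 1:] = a * np.eye(n - 1) + (b - a) * np.outer(w, w)
    return M

# Check the analytic gradient against central finite differences of P_mu.
z, mu, h = np.array([0.3, -1.2, 0.8]), 0.5, 1e-6
J = np.empty((3, 3))
for j in range(3):
    zp, zm = z.copy(), z.copy()
    zp[j] += h
    zm[j] -= h
    J[:, j] = (P_mu(zp, mu) - P_mu(zm, mu)) / (2.0 * h)
print(np.abs(J - grad_P_mu(z, mu)).max() < 1e-5)  # prints "True"
```

Note that the gradient matrix is symmetric, as expected for a Löwner-type operator, so it does not matter whether the lemma's formula is read as the Jacobian or its transpose.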
References
[1]F. Alizadeh, D. Goldfarb, Second-order cone programming, Mathematical Programming 95 (2003) 3–52.
[2]D.P. Bertsekas, Nonlinear Programming, Athena Scientific, 1995.
[3]Y.-H. Chen, S.-C. Fang, Solving convex programming problems with equality constraints by neural networks, Computers and Mathematics with Applications 36 (1998) 41–68.
[4]J.-S. Chen, C.-H. Ko, S.-H. Pan, A neural network based on the generalized Fischer–Burmeister function for nonlinear complementarity problems, Information Sciences 180 (2010) 697–711.
[5]J.-S. Chen, P. Tseng, An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Mathematical Programming 104 (2005) 293–327.
[6]A. Cichocki, R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley, New York, 1993.
[7]C. Dang, Y. Leung, X. Gao, K. Chen, Neural networks for nonlinear and mixed complementarity problems and their applications, Neural Networks 17 (2004) 271–283.
[8]S. Effati, A. Ghomashi, A.R. Nazemi, Application of projection neural network in solving convex programming problems, Applied Mathematics and Computation 188 (2007) 1103–1114.
[9]S. Effati, A.R. Nazemi, Neural network and its application for solving linear and quadratic programming problems, Applied Mathematics and Computation 172 (2006) 305–331.
[10] F. Facchinei, J. Pang, Finite-dimensional Variational Inequalities and Complementarity Problems, Springer, New York, 2003.
[11]M. Fukushima, Z.-Q. Luo, P. Tseng, Smoothing functions for second-order-cone complementarity problems, SIAM Journal on Optimization 12 (2002) 436–460.
[12]Q. Han, L.-Z. Liao, H. Qi, L. Qi, Stability analysis of gradient-based neural networks for optimization problems, Journal of Global Optimization 19 (2001) 363–381.
[13]J.J. Hopfield, D.W. Tank, Neural computation of decision in optimization problems, Biological Cybernetics 52 (1985) 141–152.