A linearly convergent derivative-free descent method for the second-order cone complementarity problem


Vol. 59, No. 8, November 2010, 1173–1197


Shaohua Pan^a and Jein-Shan Chen^b*

^a School of Mathematical Sciences, South China University of Technology, Guangzhou 510640, China; ^b Department of Mathematics, National Taiwan Normal University, Taipei 11677, Taiwan

(Received 25 August 2008; final version received 1 May 2009)

We consider a class of derivative-free descent methods for solving the second-order cone complementarity problem (SOCCP). The algorithm is based on the Fischer–Burmeister (FB) unconstrained minimization reformulation of the SOCCP, and uses a convex combination of the negative partial gradients of the FB merit function $\psi_{FB}$ as the search direction. We establish global convergence of the algorithm under monotonicity and the uniform Jordan P-property, and show that under strong monotonicity the sequence of merit function values converges to zero at a linear rate. Notably, the rate of convergence depends on the structure of the second-order cones. Numerical comparisons are also made with the limited BFGS method used by Chen and Tseng (An unconstrained smooth minimization reformulation of the second-order cone complementarity problem, Math. Program. 104 (2005), pp. 293–327), which confirm the theoretical results and the effectiveness of the algorithm.

Keywords: second-order cone complementarity problem; Fischer–Burmeister function; descent algorithms; derivative-free methods; linear convergence

1. Introduction

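We consider the conic complementarity problem of finding a vector $\zeta \in \mathbb{R}^n$ such that

\[ \zeta \in \mathcal{K}, \quad F(\zeta) \in \mathcal{K}, \quad \langle \zeta, F(\zeta) \rangle = 0, \tag{1} \]

where $F: \mathbb{R}^n \to \mathbb{R}^n$ is a mapping assumed to be continuously differentiable throughout this article, and $\mathcal{K}$ is the Cartesian product of second-order cones (SOCs). In other words,

\[ \mathcal{K} = \mathcal{K}^{n_1} \times \mathcal{K}^{n_2} \times \cdots \times \mathcal{K}^{n_m}, \tag{2} \]

where $m, n_1, \ldots, n_m \ge 1$, $n_1 + \cdots + n_m = n$, and

\[ \mathcal{K}^{n_i} := \left\{ (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n_i - 1} \mid x_1 \ge \|x_2\| \right\}, \tag{3} \]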

*Corresponding author. Email: jschen@math.ntnu.edu.tw.

ISSN 0233–1934 print/ISSN 1029–4945 online ß2010 Taylor & Francis

DOI: 10.1080/02331930903085359 http://www.informaworld.com


with $\|\cdot\|$ denoting the Euclidean norm and $\mathcal{K}^1$ denoting the set of non-negative reals $\mathbb{R}_+$. We will refer to (1)–(2) as the second-order cone complementarity problem (SOCCP).
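As a small illustration of definitions (2) and (3), the following Python sketch tests membership in a single second-order cone and in a Cartesian product of such cones. The function names `in_soc` and `in_product_cone` and the tolerance `tol` are assumptions made for this example; they are not part of the article.

```python
import math

def in_soc(x, tol=1e-12):
    """Return True if x = (x1, x2) satisfies x1 >= ||x2|| (up to tol).

    For a vector of length 1 this reduces to x1 >= 0, i.e. the non-negative reals.
    """
    return x[0] >= math.sqrt(sum(t * t for t in x[1:])) - tol

def in_product_cone(zeta, dims, tol=1e-12):
    """Split zeta into blocks of sizes dims = (n_1, ..., n_m) and test each block."""
    assert sum(dims) == len(zeta), "block sizes must add up to len(zeta)"
    start = 0
    for n_i in dims:
        if not in_soc(zeta[start:start + n_i], tol):
            return False
        start += n_i
    return True
```

For example, `in_product_cone([2, 1, 1, 0.5, 3, -3], (3, 1, 2))` checks the three blocks `(2, 1, 1)`, `(0.5,)` and `(3, -3)` separately.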

As a direct extension of the non-linear complementarity problem (NCP), the SOCCP includes as a special case the Karush–Kuhn–Tucker (KKT) system of SOC programming, which has a wide range of applications in engineering design, control, finance, robust optimization and combinatorial optimization; see [1,18] and the references therein. Various methods have been proposed for solving the SOCCP, including the merit function method [5], the smoothing Newton methods [6,10,12], the semismooth Newton methods [16,20], and the interior-point method [25]. The last three kinds of methods require the solution of a linear system of equations in each iteration, which makes them unsuitable for large-scale SOCCPs. By contrast, the merit function method [5], based on the Fischer–Burmeister (FB) unconstrained minimization reformulation of the SOCCP, requires much less computational work in each iteration and consequently has some potential for solving large-scale SOCCPs.

The FB merit function associated with the cone $\mathcal{K}^n$ is given by

\[ \psi_{FB}(x,y) := \tfrac{1}{2}\|\phi_{FB}(x,y)\|^2, \tag{4} \]

where $\phi_{FB}: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is the FB function associated with $\mathcal{K}^n$, defined by

\[ \phi_{FB}(x,y) := (x^2+y^2)^{1/2} - (x+y), \tag{5} \]

with $x^2 = x \circ x$ denoting the Jordan product of $x$ with itself, $x^{1/2}$ being a vector such that $(x^{1/2})^2 = x$, and $x+y$ meaning the componentwise addition of vectors. The functions $\psi_{FB}$ and $\phi_{FB}$ were studied in [2,5,10,21], where $\psi_{FB}$ was shown in [10] to satisfy

\[ \psi_{FB}(x,y) = 0 \iff x \in \mathcal{K}^n, \ y \in \mathcal{K}^n, \ \langle x,y \rangle = 0, \tag{6} \]

its continuous differentiability was established by Chen and Tseng [5], and $\phi_{FB}$ was proved to be strongly semismooth in [21] and [2] by different arguments. By the equivalence (6), the SOCCP can be reformulated as the unconstrained minimization problem

\[ \min_{\zeta \in \mathbb{R}^n} \Psi_{FB}(\zeta) := \sum_{i=1}^{m} \psi_{FB}(\zeta_i, F_i(\zeta)), \tag{7} \]

where $\zeta = (\zeta_1, \ldots, \zeta_m)$ and $F(\zeta) = (F_1(\zeta), \ldots, F_m(\zeta))$ with $\zeta_i \in \mathbb{R}^{n_i}$ and $F_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$. The merit function method in [5] was developed by applying the limited BFGS method directly to the minimization reformulation (7). In this article, we propose another merit function method based on the same reformulation, which can be viewed as an extension of the method in [23] for the NCP. Unlike the limited BFGS method adopted by Chen and Tseng [5], our method does not use the derivative of the mapping $F$; instead, it uses a convex combination of the negative partial gradients of $\psi_{FB}$, i.e. a vector of the form $-\gamma \nabla_x \psi_{FB} - (1-\gamma) \nabla_y \psi_{FB}$ with $\gamma \in (0,1)$, as the search direction. Since the computation of the search direction and the step size does not involve the Jacobian of $F$, our derivative-free algorithm requires less computational work and less memory in each iteration than the existing methods mentioned above. We show that the algorithm is globally convergent under monotonicity and the uniform Jordan P-property of $F$, and in particular that the generated sequence of merit function values $\{\Psi_{FB}(\zeta^k)\}$ converges to zero at a linear rate if $F$ is strongly monotone. Unlike the NCP case, however, the rate of convergence depends on the structure of $\mathcal{K}$ (Remark 5.1 (a)).

The literature on derivative-free methods for solving the NCP is vast; see, for example, [11,15,17,19,24,23]. Nevertheless, to the best of our knowledge, no paper studies derivative-free methods for the SOCCP except [3], where a different unconstrained reformulation and a different descent direction were employed, and no rate of convergence result was established. The main difficulty is to extend the growth relation between the FB function and the natural residual function established in [22] to the SOCCP case. In addition, numerical results were not reported for the above derivative-free methods, so their practical performance cannot be judged. In this article, we obtain the rate of convergence result for the proposed derivative-free descent algorithm by using the favourable properties of the gradients of the function $\psi_{FB}$ (Propositions 3.1 and 3.2). We also compare the performance of the algorithm with that of the limited BFGS method in [5]; the comparison indicates that our method is comparable to the limited BFGS method on some test problems.

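Throughout this article, $\mathbb{R}^n$ denotes the space of $n$-dimensional real column vectors, and $\mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_m}$ is identified with $\mathbb{R}^{n_1 + \cdots + n_m}$. Thus, $(x_1, \ldots, x_m) \in \mathbb{R}^{n_1} \times \cdots \times \mathbb{R}^{n_m}$ is viewed as a column vector in $\mathbb{R}^{n_1 + \cdots + n_m}$. The notation $I$ means an identity matrix of suitable dimension, and $\mathrm{int}(\mathcal{K}^n)$ denotes the interior of $\mathcal{K}^n$. For any $x, y$ in $\mathbb{R}^n$, we write $x \succeq_{\mathcal{K}^n} y$ if $x - y \in \mathcal{K}^n$, and $x \succ_{\mathcal{K}^n} y$ if $x - y \in \mathrm{int}(\mathcal{K}^n)$. For a differentiable mapping $F: \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x) \in \mathbb{R}^{n \times m}$ denotes the transposed Jacobian of $F$ at $x$. For a symmetric matrix $A$, we write $A \succeq O$ (respectively, $A \succ O$) to mean that $A$ is positive semidefinite (respectively, positive definite). In addition, we use $\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ to denote a diagonal matrix with $\lambda_1, \ldots, \lambda_n$ as the diagonal elements.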

2. Preliminaries

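This section recalls some background material that will be used in the subsequent sections. It is known that $\mathcal{K}^n$ is a closed convex self-dual cone with non-empty interior

\[ \mathrm{int}(\mathcal{K}^n) := \left\{ x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1} \mid x_1 > \|x_2\| \right\}. \]

For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we define their Jordan product [8] by

\[ x \circ y := \left( \langle x, y \rangle, \ y_1 x_2 + x_1 y_2 \right). \tag{8} \]

The Jordan product, unlike scalar or matrix multiplication, is not associative, which is a main source of complication in the analysis of the SOCCP. The identity element under this product is $e := (1, 0, \ldots, 0)^T \in \mathbb{R}^n$. Given a vector $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, let

\[ L_x := \begin{bmatrix} x_1 & x_2^T \\ x_2 & x_1 I \end{bmatrix}, \]

which can be viewed as a linear mapping from $\mathbb{R}^n$ to $\mathbb{R}^n$ with $L_x y = x \circ y$ for any $y \in \mathbb{R}^n$. It is easy to verify that $L_x$ for $x \in \mathrm{int}(\mathcal{K}^n)$ is invertible with the inverse $L_x^{-1}$ given by

\[ L_x^{-1} = \frac{1}{\det(x)} \begin{bmatrix} x_1 & -x_2^T \\ -x_2 & \dfrac{\det(x)}{x_1} I + \dfrac{1}{x_1} x_2 x_2^T \end{bmatrix}, \tag{9} \]

where $\det(x) := x_1^2 - \|x_2\|^2$ denotes the determinant of $x$.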
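To make the Jordan product (8) and the inverse formula (9) concrete, here is a minimal Python sketch. The names `jordan` and `arrow_inverse_apply` are hypothetical helpers introduced for this example. The second function applies $L_x^{-1}$ to a vector without forming the matrix, and assumes $\det(x) \neq 0$ and $x_1 \neq 0$ (true in particular for $x$ in the interior of the cone).

```python
def jordan(x, y):
    """Jordan product (8): x o y = (<x, y>, y1*x2 + x1*y2)."""
    head = sum(a * b for a, b in zip(x, y))
    tail = [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]
    return [head] + tail

def arrow_inverse_apply(x, v):
    """Apply the inverse of L_x to v using formula (9).

    Assumes det(x) = x1^2 - ||x2||^2 != 0 and x1 != 0.
    """
    x1, x2 = x[0], x[1:]
    v1, v2 = v[0], v[1:]
    det = x1 * x1 - sum(t * t for t in x2)
    x2v2 = sum(a * b for a, b in zip(x2, v2))
    head = (x1 * v1 - x2v2) / det
    tail = [(-a * v1 + (det / x1) * b + (x2v2 / x1) * a) / det
            for a, b in zip(x2, v2)]
    return [head] + tail
```

Since $L_x y = x \circ y$, applying `arrow_inverse_apply(x, jordan(x, y))` should recover `y` up to rounding whenever `x` lies in the interior of the cone.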

We recall from [8,10] that each $x = (x_1, x_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ admits a spectral factorization associated with $\mathcal{K}^n$ of the form $x = \lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}$, where $\lambda_i(x)$ and $u_x^{(i)}$ for $i = 1, 2$ are the spectral values of $x$ and the corresponding spectral vectors, defined by

\[ \lambda_i(x) := x_1 + (-1)^i \|x_2\|, \qquad u_x^{(i)} := \frac{1}{2} \left( 1, \ (-1)^i \bar{x}_2 \right), \tag{10} \]

with $\bar{x}_2 = x_2 / \|x_2\|$ if $x_2 \neq 0$, and otherwise $\bar{x}_2$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar{x}_2\| = 1$. If $x_2 \neq 0$, the factorization is unique. The spectral factorization of $x$ and the matrix $L_x$ have various interesting properties; see [10]. We list several that will be used later.
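The spectral factorization (10) can be sketched in a few lines of Python. The function name `spectral` is an assumption made for this example, and the code assumes $n \ge 2$; when $x_2 = 0$ it picks the first unit coordinate vector as the arbitrary unit vector $\bar{x}_2$.

```python
import math

def spectral(x):
    """Return ((lam1, lam2), (u1, u2)) for x = (x1, x2) following (10).

    Assumes len(x) >= 2. If x2 = 0, an arbitrary unit vector is used for xbar.
    """
    x1, x2 = x[0], x[1:]
    nx2 = math.sqrt(sum(t * t for t in x2))
    if nx2 > 0:
        xbar = [t / nx2 for t in x2]
    else:
        xbar = [1.0] + [0.0] * (len(x2) - 1)
    lam = (x1 - nx2, x1 + nx2)
    u1 = [0.5] + [-0.5 * t for t in xbar]
    u2 = [0.5] + [0.5 * t for t in xbar]
    return lam, (u1, u2)
```

The product of the two spectral values equals $\det(x) = x_1^2 - \|x_2\|^2$, and $\lambda_1(x) u_x^{(1)} + \lambda_2(x) u_x^{(2)}$ reproduces $x$.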

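LEMMA 2.1

(a) For any $x \in \mathbb{R}^n$, $x^2 = (\lambda_1(x))^2 u_x^{(1)} + (\lambda_2(x))^2 u_x^{(2)} \in \mathcal{K}^n$.

(b) For any $x \in \mathcal{K}^n$, $x^{1/2} = \sqrt{\lambda_1(x)}\, u_x^{(1)} + \sqrt{\lambda_2(x)}\, u_x^{(2)} \in \mathcal{K}^n$.

(c) $x \succeq_{\mathcal{K}^n} 0 \iff \lambda_1(x) \ge 0 \iff L_x \succeq O$, and $x \succ_{\mathcal{K}^n} 0 \iff \lambda_1(x) > 0 \iff L_x \succ O$.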
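Combining the Jordan product with the square-root formula of Lemma 2.1 (b), the FB function (5) and the merit function (4) can be evaluated directly. The following Python sketch is illustrative only: the names `jordan`, `soc_sqrt`, `phi_fb` and `psi_fb` are assumptions, and the smaller spectral value is clamped at zero to absorb rounding error.

```python
import math

def jordan(x, y):
    """Jordan product: x o y = (<x, y>, y1*x2 + x1*y2)."""
    return [sum(a * b for a, b in zip(x, y))] + \
           [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def soc_sqrt(w):
    """Square root of w in the cone via its spectral values (Lemma 2.1 (b))."""
    nw2 = math.sqrt(sum(t * t for t in w[1:]))
    r1 = math.sqrt(max(w[0] - nw2, 0.0))   # clamp against rounding error
    r2 = math.sqrt(w[0] + nw2)
    wbar = [t / nw2 for t in w[1:]] if nw2 > 0 else [0.0] * (len(w) - 1)
    return [(r1 + r2) / 2] + [(r2 - r1) / 2 * t for t in wbar]

def phi_fb(x, y):
    """FB function (5): (x^2 + y^2)^{1/2} - (x + y)."""
    w = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    z = soc_sqrt(w)
    return [c - a - b for c, a, b in zip(z, x, y)]

def psi_fb(x, y):
    """FB merit function (4): 0.5 * ||phi_fb(x, y)||^2."""
    return 0.5 * sum(t * t for t in phi_fb(x, y))
```

For the complementary pair `x = [1, 1, 0]`, `y = [1, -1, 0]` (both on the boundary of the cone, with zero inner product) the merit function vanishes, in line with the equivalence (6); for `x = y = [1, 0, 0]` it is strictly positive because the inner product is not zero.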

The following lemma is a restatement of Problem 7 in [13, p. 468] for the real symmetric matrix case. In view of its importance, we include its proof here.

LEMMA 2.2 Let $B, C \in \mathbb{R}^{n \times n}$ be symmetric matrices with $B \succ O$. Then $B + C \succ O$ if and only if every eigenvalue of $CB^{-1}$ is greater than $-1$.

Proof By Corollary 7.6.5 of [13], there exists a non-singular matrix $D \in \mathbb{R}^{n \times n}$ such that $D^T C D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $D^T B D = I$. Consequently,

\[ CB^{-1} = (D^T)^{-1} \mathrm{diag}(\lambda_1, \ldots, \lambda_n) D^{-1} \left( (D^T)^{-1} D^{-1} \right)^{-1} = (D^T)^{-1} \mathrm{diag}(\lambda_1, \ldots, \lambda_n) D^T. \]

This implies that $CB^{-1}$ is similar to the diagonal matrix $\mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, and therefore $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $CB^{-1}$, counted with multiplicity. On the other hand,

\[ B + C = (D^T)^{-1} \mathrm{diag}(1 + \lambda_1, \ldots, 1 + \lambda_n) D^{-1}, \]

which means that $B + C \succ O$ if and only if $\lambda_i > -1$ for all $i = 1, 2, \ldots, n$. Combining the two facts, we obtain the desired result. □

Next, we review the definitions of the monotonicity and the P-property of a mapping.

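Definition 2.1 The mapping $F = (F_1, \ldots, F_m)$ with $F_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$ is said to

(a) be monotone if, for every $\zeta, \xi \in \mathbb{R}^n$, $\langle \zeta - \xi, F(\zeta) - F(\xi) \rangle \ge 0$;

(b) be strongly monotone if there exists a $\mu > 0$ such that, for every $\zeta, \xi \in \mathbb{R}^n$,

\[ \langle \zeta - \xi, F(\zeta) - F(\xi) \rangle \ge \mu \|\zeta - \xi\|^2; \]

(c) have the uniform Jordan P-property if there exists a $\rho > 0$ such that, for every $\zeta = (\zeta_1, \ldots, \zeta_m)$, $\xi = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^n$, there exists $\nu \in \{1, 2, \ldots, m\}$ such that

\[ \lambda_2\left[ (\zeta_\nu - \xi_\nu) \circ (F_\nu(\zeta) - F_\nu(\xi)) \right] \ge \rho \|\zeta - \xi\|^2; \]

(d) have the uniform Cartesian P-property if there exists a $\rho > 0$ such that, for every $\zeta = (\zeta_1, \ldots, \zeta_m)$, $\xi = (\xi_1, \ldots, \xi_m) \in \mathbb{R}^n$, there exists $\nu \in \{1, 2, \ldots, m\}$ such that

\[ \langle \zeta_\nu - \xi_\nu, F_\nu(\zeta) - F_\nu(\xi) \rangle \ge \rho \|\zeta - \xi\|^2. \]

From Definition 2.1, the uniform Cartesian P-property clearly implies the uniform Jordan P-property, and if $F$ is strongly monotone with modulus $\mu > 0$, then $F$ has the uniform Jordan P-property and the uniform Cartesian P-property with modulus $\mu/m$. Also, when $F$ is continuously differentiable, $F$ is strongly monotone with modulus $\mu > 0$ if and only if $\nabla F(\zeta)$ is uniformly positive definite with modulus $\mu > 0$, i.e.

\[ d^T \nabla F(\zeta) d \ge \mu \|d\|^2 \quad \text{for all } \zeta, d \in \mathbb{R}^n. \]

In addition, we see that the uniform Jordan P-property does not imply monotonicity.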
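As a simple illustration of strong monotonicity in Definition 2.1 (b), consider an affine map $F(\zeta) = M\zeta + q$. The Python sketch below uses a hypothetical $2 \times 2$ matrix whose symmetric part is $2I$, so that $\langle d, Md \rangle = 2\|d\|^2$ and the modulus is $\mu = 2$. The matrix, the vector `q` and the function names are assumptions chosen for this example only.

```python
def affine_map(zeta, M, q):
    """F(zeta) = M zeta + q for a square matrix M given as a list of rows."""
    n = len(zeta)
    return [sum(M[i][j] * zeta[j] for j in range(n)) + q[i] for i in range(n)]

def monotonicity_gap(zeta, xi, M, q, mu):
    """<zeta - xi, F(zeta) - F(xi)> - mu * ||zeta - xi||^2.

    A non-negative value for all pairs means F is strongly monotone with modulus mu.
    """
    d = [a - b for a, b in zip(zeta, xi)]
    Fd = [a - b for a, b in zip(affine_map(zeta, M, q), affine_map(xi, M, q))]
    return sum(a * b for a, b in zip(d, Fd)) - mu * sum(a * a for a in d)

# Hypothetical data: the skew part of M cancels in <d, M d>, leaving 2 * ||d||^2.
M_EXAMPLE = [[2.0, 1.0], [-1.0, 2.0]]
Q_EXAMPLE = [1.0, -3.0]
```

With this data the gap is zero (up to rounding) for $\mu = 2$ and negative for any larger $\mu$, so $2$ is the best modulus.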

Unless otherwise stated, in the subsequent three sections we assume $\mathcal{K} = \mathcal{K}^n$; all of the analysis carries over to the case where $\mathcal{K}$ has the Cartesian structure (2).

3. Some properties of $\psi_{FB}$ and $\Psi_{FB}$

In this section, we present some important properties of the gradient of $\psi_{FB}$, which play a crucial role in analysing the convergence of the descent algorithm proposed in the next section. In addition, we establish the coerciveness of $\Psi_{FB}$ under two mild conditions. Throughout this section, for any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$, we write

\[ w = (w_1, w_2) := x^2 + y^2 \quad \text{and} \quad z := (z_1, z_2) = (x^2 + y^2)^{1/2}. \tag{11} \]

First, from Propositions 1 and 2 of [5], we know that the function $\psi_{FB}$ is continuously differentiable everywhere and that its gradient is given as in the following lemma.

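LEMMA 3.1 The function $\psi_{FB}$ in (4) is continuously differentiable everywhere. Moreover, $\nabla_x \psi_{FB}(0,0) = \nabla_y \psi_{FB}(0,0) = 0$. If $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$, then

\[ \nabla_x \psi_{FB}(x,y) = \left( L_x L_z^{-1} - I \right) \phi_{FB}(x,y), \qquad \nabla_y \psi_{FB}(x,y) = \left( L_y L_z^{-1} - I \right) \phi_{FB}(x,y). \]

If $x^2 + y^2 \notin \mathrm{int}(\mathcal{K}^n)$ and $(x,y) \neq (0,0)$, then $x_1^2 + y_1^2 \neq 0$ and

\[ \nabla_x \psi_{FB}(x,y) = \left( \frac{x_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x,y), \qquad \nabla_y \psi_{FB}(x,y) = \left( \frac{y_1}{\sqrt{x_1^2 + y_1^2}} - 1 \right) \phi_{FB}(x,y). \]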
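The interior-case gradient formulas of Lemma 3.1 can be checked numerically against central finite differences. The sketch below is an illustration under stated assumptions: all function names are hypothetical, only the case $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$ is handled, and $L_z^{-1}$ is applied through the closed-form inverse (9) rather than by forming a matrix.

```python
import math

def jordan(x, y):
    """Jordan product: x o y = (<x, y>, y1*x2 + x1*y2)."""
    return [sum(a * b for a, b in zip(x, y))] + \
           [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def phi_and_z(x, y):
    """Return phi_FB(x, y) and z = (x^2 + y^2)^{1/2}."""
    w = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    nw2 = math.sqrt(sum(t * t for t in w[1:]))
    r1, r2 = math.sqrt(max(w[0] - nw2, 0.0)), math.sqrt(w[0] + nw2)
    wbar = [t / nw2 for t in w[1:]] if nw2 > 0 else [0.0] * (len(w) - 1)
    z = [(r1 + r2) / 2] + [(r2 - r1) / 2 * t for t in wbar]
    return [c - a - b for c, a, b in zip(z, x, y)], z

def psi(x, y):
    """FB merit function: 0.5 * ||phi_FB(x, y)||^2."""
    phi, _ = phi_and_z(x, y)
    return 0.5 * sum(t * t for t in phi)

def arrow_inverse_apply(z, v):
    """Apply the inverse of L_z to v (closed form; needs z in the interior)."""
    det = z[0] * z[0] - sum(t * t for t in z[1:])
    zv = sum(a * b for a, b in zip(z[1:], v[1:]))
    return [(z[0] * v[0] - zv) / det] + \
           [(-a * v[0] + (det / z[0]) * b + (zv / z[0]) * a) / det
            for a, b in zip(z[1:], v[1:])]

def grad_psi_interior(x, y):
    """Partial gradients (L_x L_z^{-1} - I) phi and (L_y L_z^{-1} - I) phi."""
    phi, z = phi_and_z(x, y)
    g = arrow_inverse_apply(z, phi)        # g = L_z^{-1} phi
    gx = [a - b for a, b in zip(jordan(x, g), phi)]
    gy = [a - b for a, b in zip(jordan(y, g), phi)]
    return gx, gy

def fd_grad_x(x, y, h=1e-6):
    """Central finite-difference approximation of the gradient of psi in x."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((psi(xp, y) - psi(xm, y)) / (2 * h))
    return out
```

For a test point such as `x = [1.0, 0.5, -0.2]`, `y = [-0.3, 0.4, 0.7]`, where $x^2 + y^2$ lies in the interior of the cone, the analytic and finite-difference gradients should agree to several digits.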

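For the partial gradients $\nabla_x \psi_{FB}$ and $\nabla_y \psi_{FB}$, from [5, Lemma 9] and [4, Theorem 3.1], we readily obtain the following favourable properties, whose proofs are omitted.

PROPOSITION 3.1 The gradients $\nabla_x \psi_{FB}$ and $\nabla_y \psi_{FB}$ of $\psi_{FB}$ have the following properties:

(a) $\langle \nabla_x \psi_{FB}(x,y), \nabla_y \psi_{FB}(x,y) \rangle \ge 0$ for all $x, y \in \mathbb{R}^n$, and furthermore, equality holds if and only if $\psi_{FB}(x,y) = 0$.

(b) For all $x, y \in \mathbb{R}^n$, $\nabla_x \psi_{FB}(x,y) = \nabla_y \psi_{FB}(x,y) = 0$ if and only if $\psi_{FB}(x,y) = 0$.

(c) $\nabla \psi_{FB}$ is globally Lipschitz continuous, i.e. there exists a constant $L > 0$ such that

\[ \|\nabla_x \psi_{FB}(x,y) - \nabla_x \psi_{FB}(\hat{x},\hat{y})\| \le L \|(x,y) - (\hat{x},\hat{y})\|, \qquad \|\nabla_y \psi_{FB}(x,y) - \nabla_y \psi_{FB}(\hat{x},\hat{y})\| \le L \|(x,y) - (\hat{x},\hat{y})\| \]

for all $(x,y), (\hat{x},\hat{y}) \in \mathbb{R}^n \times \mathbb{R}^n$, where $L$ depends on the dimension $n$.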

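Next we establish three further properties of the gradients $\nabla_x \psi_{FB}$ and $\nabla_y \psi_{FB}$ (Proposition 3.2), which are crucial to the convergence analysis in Sections 4 and 5. To this end, we need the following technical lemmas. The first one is an extension of [5, Lemma 3], which will be used to give a tighter upper bound for $\|L_{x+y} L_z^{-1}\|$.

LEMMA 3.2 For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ such that $w_2 \neq 0$, we have

\[ \left[ (x_1 + y_1) + (-1)^i (x_2 + y_2)^T \bar{w}_2 \right]^2 \le \left\| (x_2 + y_2) + (-1)^i (x_1 + y_1) \bar{w}_2 \right\|^2 \le 2 \lambda_i(w) \tag{12} \]

for $i = 1, 2$, where $\bar{w}_2 = w_2 / \|w_2\|$.

Proof The first inequality is obtained by expanding the squares on both sides and using the Cauchy–Schwarz inequality. We next show that the second inequality holds when $i = 1$, which is equivalent to proving the following inequality:

\[ \left\| (x_2 + y_2) \|w_2\| - (x_1 + y_1) w_2 \right\|^2 \le 2 \lambda_1(w) \|w_2\|^2. \tag{13} \]

Let $L$ and $R$ denote the left-hand side and the right-hand side of (13), respectively. Then, by plugging in $w_2 = 2(x_1 x_2 + y_1 y_2)$, it is easy to compute that

\[ \begin{aligned} L = {} & \|x_2 + y_2\|^2 \|w_2\|^2 + (x_1 + y_1)^2 \|w_2\|^2 \\ & - 4 \left( x_1^2 \|x_2\|^2 + x_1 y_1 x_2^T y_2 + x_1^2 x_2^T y_2 + x_1 y_1 \|x_2\|^2 \right) \|w_2\| \\ & - 4 \left( y_1^2 \|y_2\|^2 + x_1 y_1 x_2^T y_2 + y_1^2 x_2^T y_2 + x_1 y_1 \|y_2\|^2 \right) \|w_2\|, \\ R = {} & 2 (x_1^2 + y_1^2) \|w_2\|^2 + 2 (\|x_2\|^2 + \|y_2\|^2) \|w_2\|^2 \\ & - 4 \left( 2 x_1^2 \|x_2\|^2 + 2 y_1^2 \|y_2\|^2 + 4 x_1 y_1 x_2^T y_2 \right) \|w_2\|. \end{aligned} \]

Using the last two equalities, it then follows that

\[ \begin{aligned} R - L = {} & (x_1 - y_1)^2 \|w_2\|^2 + \|x_2 - y_2\|^2 \|w_2\|^2 \\ & - 4 \left( x_1^2 \|x_2\|^2 + y_1^2 \|y_2\|^2 + 2 x_1 y_1 x_2^T y_2 \right) \|w_2\| \\ & + 4 \left( x_1^2 x_2^T y_2 + x_1 y_1 \|y_2\|^2 + y_1^2 x_2^T y_2 + x_1 y_1 \|x_2\|^2 \right) \|w_2\| \\ = {} & (x_1 - y_1)^2 \|w_2\|^2 + \|x_2 - y_2\|^2 \|w_2\|^2 - 2 (x_1 - y_1)(x_2 - y_2)^T w_2 \|w_2\| \\ = {} & \left\| (x_1 - y_1) w_2 - (x_2 - y_2) \|w_2\| \right\|^2 \ge 0. \end{aligned} \]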

This implies (13), and consequently the inequality (12) holds for $i = 1$. Using similar arguments, we can prove that the inequality (12) holds for $i = 2$. □

LEMMA 3.3 For any $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R} \times \mathbb{R}^{n-1}$ such that $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$,

\[ \left\| L_{x+y} L_z^{-1} \right\|_2 \le 2 \left( \sqrt{n-1} + 2\sqrt{2} \right), \]

where $\|A\|_2$ denotes the Frobenius norm (Euclidean norm) of the matrix $A \in \mathbb{R}^{n \times n}$.

Proof Let $\lambda_1, \lambda_2$ be the spectral values of $w$. Then, by the definition of $z$, we have

\[ z_1 = \frac{\sqrt{\lambda_2} + \sqrt{\lambda_1}}{2}, \qquad z_2 = \frac{\sqrt{\lambda_2} - \sqrt{\lambda_1}}{2} \bar{w}_2, \tag{14} \]

with $\bar{w}_2 = w_2 / \|w_2\|$ if $w_2 \neq 0$, and otherwise $\bar{w}_2$ being any vector in $\mathbb{R}^{n-1}$ satisfying $\|\bar{w}_2\| = 1$.

If $w_2 = 0$, then $\lambda_1 = \lambda_2 = w_1 = \|x\|^2 + \|y\|^2$. From formula (9), it follows that

\[ L_{x+y} L_z^{-1} = \frac{1}{\sqrt{w_1}} L_{x+y} = \frac{1}{\sqrt{\|x\|^2 + \|y\|^2}} L_{x+y}. \]

Consequently,

\[ \left\| L_{x+y} L_z^{-1} \right\|_2^2 = \frac{n (x_1 + y_1)^2 + 2 \|x_2 + y_2\|^2}{\|x\|^2 + \|y\|^2} \le 2n, \]

which immediately implies the desired result.

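If $w_2 \neq 0$, then by applying formula (9) and noting that $\det(z) = \sqrt{\lambda_1}\sqrt{\lambda_2}$, it is not difficult to compute that

\[ L_{x+y} L_z^{-1} = \begin{bmatrix} \dfrac{(x_1+y_1) z_1 - (x_2+y_2)^T z_2}{\sqrt{\lambda_1}\sqrt{\lambda_2}} & -\dfrac{(x_1+y_1) z_2^T}{\sqrt{\lambda_1}\sqrt{\lambda_2}} + \dfrac{(x_2+y_2)^T}{z_1} + \dfrac{(x_2+y_2)^T z_2 z_2^T}{z_1 \sqrt{\lambda_1}\sqrt{\lambda_2}} \\[2ex] \dfrac{(x_2+y_2) z_1 - (x_1+y_1) z_2}{\sqrt{\lambda_1}\sqrt{\lambda_2}} & -\dfrac{(x_2+y_2) z_2^T}{\sqrt{\lambda_1}\sqrt{\lambda_2}} + \dfrac{(x_1+y_1) I}{z_1} + \dfrac{(x_1+y_1) z_2 z_2^T}{z_1 \sqrt{\lambda_1}\sqrt{\lambda_2}} \end{bmatrix} := \begin{bmatrix} b_1(x,y) & b_2(x,y)^T \\ c_2(x,y) & B_2(x,y) \end{bmatrix}. \]

Substituting the expressions of $z_1, z_2$ in (14) into the entries of the above matrix, we get

\[ b_1(x,y) = \frac{(x_1+y_1) + (x_2+y_2)^T \bar{w}_2}{2\sqrt{\lambda_2}} + \frac{(x_1+y_1) - (x_2+y_2)^T \bar{w}_2}{2\sqrt{\lambda_1}}, \]

\[ c_2(x,y) = \frac{(x_2+y_2) + (x_1+y_1) \bar{w}_2}{2\sqrt{\lambda_2}} + \frac{(x_2+y_2) - (x_1+y_1) \bar{w}_2}{2\sqrt{\lambda_1}}, \]

\[ b_2(x,y) = \frac{\lambda_1 \left[ (x_1+y_1) + (x_2+y_2)^T \bar{w}_2 \right] \bar{w}_2}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} - \frac{\lambda_2 \left[ (x_1+y_1) - (x_2+y_2)^T \bar{w}_2 \right] \bar{w}_2}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} + \frac{2(x_2+y_2) - (x_2+y_2)^T \bar{w}_2 \, \bar{w}_2}{\sqrt{\lambda_1}+\sqrt{\lambda_2}}, \]

\[ B_2(x,y) = \frac{\lambda_1 \left[ (x_2+y_2) + (x_1+y_1) \bar{w}_2 \right] \bar{w}_2^T}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} - \frac{\lambda_2 \left[ (x_2+y_2) - (x_1+y_1) \bar{w}_2 \right] \bar{w}_2^T}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} + \frac{x_1+y_1}{\sqrt{\lambda_1}+\sqrt{\lambda_2}} \left( 2I - \bar{w}_2 \bar{w}_2^T \right). \]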

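Now, using Lemma 3.2, we can verify that the following inequalities hold:

\[ \frac{\left| (x_1+y_1) + (x_2+y_2)^T \bar{w}_2 \right|}{2\sqrt{\lambda_2}} \le \frac{\left\| (x_2+y_2) + (x_1+y_1) \bar{w}_2 \right\|}{2\sqrt{\lambda_2}} \le \frac{1}{\sqrt{2}}, \]

\[ \frac{\left| (x_1+y_1) - (x_2+y_2)^T \bar{w}_2 \right|}{2\sqrt{\lambda_1}} \le \frac{\left\| (x_2+y_2) - (x_1+y_1) \bar{w}_2 \right\|}{2\sqrt{\lambda_1}} \le \frac{1}{\sqrt{2}}, \]

and

\[ \left\| \frac{\lambda_1 \left[ (x_1+y_1) + (x_2+y_2)^T \bar{w}_2 \right] \bar{w}_2}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} - \frac{\lambda_2 \left[ (x_1+y_1) - (x_2+y_2)^T \bar{w}_2 \right] \bar{w}_2}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} \right\| \le \sqrt{2}, \]

\[ \left\| \frac{\lambda_1 \left[ (x_2+y_2) + (x_1+y_1) \bar{w}_2 \right] \bar{w}_2^T}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} - \frac{\lambda_2 \left[ (x_2+y_2) - (x_1+y_1) \bar{w}_2 \right] \bar{w}_2^T}{2\sqrt{\lambda_1}\sqrt{\lambda_2}(\sqrt{\lambda_1}+\sqrt{\lambda_2})} \right\|_2 \le \sqrt{2}. \]

This together with $|x_1 + y_1| \le \sqrt{\lambda_1} + \sqrt{\lambda_2}$ and $\|x_2 + y_2\| \le \sqrt{\lambda_1} + \sqrt{\lambda_2}$ (both follow from $(x_1+y_1)^2 + \|x_2+y_2\|^2 \le 2 w_1 = \lambda_1 + \lambda_2$) implies that

\[ |b_1(x,y)| \le \sqrt{2}, \quad \|c_2(x,y)\| \le \sqrt{2}, \quad \|b_2(x,y)\| \le \sqrt{2} + 3, \quad \|B_2(x,y)\|_2 \le 2\sqrt{n-1} + 1 + \sqrt{2}. \]

Consequently, $\|L_{x+y} L_z^{-1}\|_2 \le 2\sqrt{n-1} + 4\sqrt{2}$. The proof is thus completed. □

It should be pointed out that, using Lemmas 3–4 of [5], we may also obtain an upper bound for $\|L_{x+y} L_z^{-1}\|_2$, but such an upper bound is not tighter than the one given here.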

By using Lemma 3.3, we can further obtain the following result. Its proof is simple; however, as will be shown below, this result is key to establishing Proposition 3.2 (b).

LEMMA 3.4 For any given $x, y \in \mathbb{R}^n$ such that $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$, let $A := L_{2z-(x+y)} L_z^{-1}$ and let $p_A(t) = t^n + a_1(x,y) t^{n-1} + \cdots + a_{n-1}(x,y) t + a_n(x,y)$ be its characteristic polynomial. Then, there exists a constant $c_1(n) > 1$ dependent on $n$ such that

\[ \left\| A^{n-1} + a_1(x,y) A^{n-2} + \cdots + a_{n-1}(x,y) I \right\|_2 \le c_1(n). \tag{15} \]

Proof For any given $x, y \in \mathbb{R}^n$ such that $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$, since $A = 2I - L_{x+y} L_z^{-1}$, applying Lemma 3.3 yields

\[ \|A\|_2 \le 2 \left( \sqrt{n} + \sqrt{n-1} + 2\sqrt{2} \right). \tag{16} \]

Let $c_2(n) := 2(\sqrt{n} + \sqrt{n-1} + 2\sqrt{2})$. Then, from the inequality (3.1.11) of [14], we have

\[ |\lambda_i(A)| \le c_2(n), \quad i = 1, 2, \ldots, n, \]

where $\lambda_1(A), \ldots, \lambda_n(A)$ are the eigenvalues of $A$, counted with multiplicity. Since, up to sign, $a_k(x,y)$ is the sum of all $\binom{n}{k}$ $k$-fold products of distinct items from $\lambda_1(A), \ldots, \lambda_n(A)$, i.e.

\[ a_k(x,y) = (-1)^k \sum_{1 \le i_1 < \cdots < i_k \le n} \ \prod_{j=1}^{k} \lambda_{i_j}(A), \quad k = 1, 2, \ldots, n, \]

there exists a positive constant $c_3(n)$ dependent only on the dimension $n$ such that

\[ |a_k(x,y)| \le c_3(n), \quad k = 1, 2, \ldots, n. \tag{17} \]

Combining (16) and (17) with $\|I\|_2 = \sqrt{n}$, we immediately obtain (15) with

\[ c_1(n) := \max\left\{ 1, \ c_2(n)^{n-1} + c_3(n) c_2(n)^{n-2} + \cdots + c_3(n) c_2(n) + c_3(n) \sqrt{n} \right\}, \]

and consequently the desired result follows. □

Now we are in a position to present the three crucial properties of $\nabla_x \psi_{FB}$ and $\nabla_y \psi_{FB}$.

PROPOSITION 3.2 The gradients $\nabla_x \psi_{FB}$ and $\nabla_y \psi_{FB}$ of $\psi_{FB}$ have the following properties:

(a) $\|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| \le 2(\sqrt{n} + \sqrt{n-1} + 2\sqrt{2}) \|\phi_{FB}(x,y)\|$ for all $x, y \in \mathbb{R}^n$;

(b) $\|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| \ge \dfrac{(3 - 2\sqrt{2})^n}{2^n c_1(n)} \|\phi_{FB}(x,y)\|$ for all $x, y \in \mathbb{R}^n$, where $c_1(n)$ is the constant from Lemma 3.4;

(c) $\|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| = 0$ if and only if $x \in \mathcal{K}$, $y \in \mathcal{K}$, $\langle x, y \rangle = 0$.

Proof (a) We prove the result by considering the following three cases.

Case 1 $(x,y) = (0,0)$. In this case, the result is clear by Lemma 3.1 and $\phi_{FB}(0,0) = 0$.

Case 2 $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$. Using Lemmas 3.1 and 3.3, it follows that

\[ \|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| = \left\| (2I - L_{x+y} L_z^{-1}) \phi_{FB}(x,y) \right\| \le \left\| 2I - L_{x+y} L_z^{-1} \right\|_2 \|\phi_{FB}(x,y)\| \le 2 \left( \sqrt{n} + \sqrt{n-1} + 2\sqrt{2} \right) \|\phi_{FB}(x,y)\|. \tag{18} \]

Case 3 $x^2 + y^2 \notin \mathrm{int}(\mathcal{K}^n)$ and $(x,y) \neq (0,0)$. From Lemma 3.1 we have that

\[ \|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| = \left| \frac{x_1 + y_1}{\sqrt{x_1^2 + y_1^2}} - 2 \right| \|\phi_{FB}(x,y)\| = \left( 2 - \frac{x_1 + y_1}{\sqrt{x_1^2 + y_1^2}} \right) \|\phi_{FB}(x,y)\| \le \|\phi_{FB}(x,y)\|, \tag{19} \]

where the second equality is due to $(x_1 + y_1)^2 \le 2(x_1^2 + y_1^2)$, and the inequality holds since $\sqrt{x_1^2 + y_1^2} \le x_1 + y_1$ by the non-negativity of $x_1, y_1$.

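(b) As in part (a), we proceed by considering three cases.

Case 1 $(x,y) = (0,0)$. The result is clear by Lemma 3.1 and $\phi_{FB}(0,0) = 0$.

Case 2 $x^2 + y^2 \in \mathrm{int}(\mathcal{K}^n)$. In this case, from Lemma 3.1 it follows that

\[ \nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y) = -L_{2z-(x+y)} L_z^{-1} \phi_{FB}(x,y). \]

Notice that $z \succ_{\mathcal{K}^n} 0$ and $4z^2 - (x+y)^2 = 2z^2 + (x-y)^2 \succ_{\mathcal{K}^n} 0$. From [10, Proposition 3.4] we have $2z - (x+y) \succ_{\mathcal{K}^n} 0$, which by Lemma 2.1 (c) implies $L_{2z-(x+y)} \succ O$. Consequently,

\[ \|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| \ge \|\phi_{FB}(x,y)\| \left\| \left( L_{2z-(x+y)} L_z^{-1} \right)^{-1} \right\|_2^{-1} = \|\phi_{FB}(x,y)\| \left\| L_z L_{2z-(x+y)}^{-1} \right\|_2^{-1}. \tag{20} \]

We next prove that all eigenvalues of $L_z L_{2z-(x+y)}^{-1}$ are bounded. Since $L_{2z-(x+y)} \succ O$ and $L_{2z-(x+y)} + L_z \succ O$, setting $B = L_{2z-(x+y)}$, $C = L_z$ and applying Lemma 2.2 yields that every eigenvalue of $CB^{-1}$ is greater than $-1$, i.e.

\[ \lambda_i\left( L_z L_{2z-(x+y)}^{-1} \right) > -1, \quad i = 1, 2, \ldots, n. \tag{21} \]

On the other hand, since $z \succ_{\mathcal{K}^n} 0$ and $2z^2 - (x+y)^2 = (x-y)^2 \succeq_{\mathcal{K}^n} 0$, we have from [10, Proposition 3.4] that $\sqrt{2} z - (x+y) \succeq_{\mathcal{K}^n} \sqrt{2} z - |x+y| \succeq_{\mathcal{K}^n} 0$. Consequently,

\[ [2z - (x+y)] - (3/2 - \sqrt{2}) z = (1/2) z + \sqrt{2} z - (x+y) \succ_{\mathcal{K}^n} 0. \]

This in turn implies $L_{2z-(x+y)} - L_{(3/2-\sqrt{2})z} \succ O$. Setting $B = L_{2z-(x+y)}$, $C = -L_{(3/2-\sqrt{2})z}$ and applying Lemma 2.2 again, we have

\[ \lambda_i\left( -L_{(3/2-\sqrt{2})z} L_{2z-(x+y)}^{-1} \right) > -1, \quad i = 1, 2, \ldots, n, \]

and therefore,

\[ \lambda_i\left( L_z L_{2z-(x+y)}^{-1} \right) < \frac{2}{3 - 2\sqrt{2}}, \quad i = 1, 2, \ldots, n. \tag{22} \]

Combining (21) and (22) shows that all eigenvalues of $L_z L_{2z-(x+y)}^{-1}$ are bounded and

\[ -1 < \lambda_i\left( L_z L_{2z-(x+y)}^{-1} \right) < \frac{2}{3 - 2\sqrt{2}}, \quad i = 1, 2, \ldots, n. \tag{23} \]

Now let $A = L_{2z-(x+y)} L_z^{-1}$ and let $p_A(t)$ be the characteristic polynomial of $A$ defined as in Lemma 3.4. Then, using the fact that $p_A(A) = 0$, we obtain

\[ A^{n-1} + a_1(x,y) A^{n-2} + \cdots + a_{n-1}(x,y) I + a_n(x,y) A^{-1} = 0, \]

which in turn implies that

\[ \begin{aligned} A^{-1} &= -\frac{1}{a_n(x,y)} \left( A^{n-1} + a_1(x,y) A^{n-2} + \cdots + a_{n-1}(x,y) I \right) \\ &= \frac{(-1)^{n+1}}{\lambda_1(A) \cdots \lambda_n(A)} \left( A^{n-1} + a_1(x,y) A^{n-2} + \cdots + a_{n-1}(x,y) I \right) \\ &= (-1)^{n+1} \lambda_1(A^{-1}) \cdots \lambda_n(A^{-1}) \left( A^{n-1} + a_1(x,y) A^{n-2} + \cdots + a_{n-1}(x,y) I \right). \end{aligned} \tag{24} \]

Note that $A^{-1}$ is precisely $L_z L_{2z-(x+y)}^{-1}$. Hence, from (23), (24) and Lemma 3.4, we have

\[ \left\| L_z L_{2z-(x+y)}^{-1} \right\|_2 = \|A^{-1}\|_2 \le \left( \frac{2}{3 - 2\sqrt{2}} \right)^n c_1(n). \]

This together with (20) yields the desired result.

Case 3 $x^2 + y^2 \notin \mathrm{int}(\mathcal{K}^n)$ and $(x,y) \neq (0,0)$. Using (19) and $|x_1 + y_1| \le \sqrt{2} \sqrt{x_1^2 + y_1^2}$,

\[ \|\nabla_x \psi_{FB}(x,y) + \nabla_y \psi_{FB}(x,y)\| \ge (2 - \sqrt{2}) \|\phi_{FB}(x,y)\|. \]

Noting that $2 - \sqrt{2} \ge \dfrac{(3 - 2\sqrt{2})^n}{2^n c_1(n)}$ since $c_1(n) > 1$, the desired result follows.

(c) This follows directly from parts (a)–(b) and the equivalence (6). □

In what follows, we establish the coerciveness of the function $\Psi_{FB}$ under some mild assumptions on $F$. For this purpose, we assume that $\mathcal{K}$ is given by (2), and corresponding to the Cartesian structure of $\mathcal{K}$, write $\zeta = (\zeta_1, \ldots, \zeta_m)$ and $F = (F_1, \ldots, F_m)$ with $\zeta_i \in \mathbb{R}^{n_i}$ and $F_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$. The following lemma and assumptions will be needed.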

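LEMMA 3.5 [20, Lemma 5.2] Let $\phi_{FB}$ be defined by (5). For any sequence $\{(x^k, y^k)\} \subseteq \mathbb{R}^n \times \mathbb{R}^n$, let $\lambda_1^k \le \lambda_2^k$ and $\mu_1^k \le \mu_2^k$ denote the spectral values of $x^k$ and $y^k$, respectively.

(a) If $\{\lambda_1^k\} \to -\infty$ or $\{\mu_1^k\} \to -\infty$, then $\{\|\phi_{FB}(x^k, y^k)\|\} \to \infty$.

(b) If $\{\lambda_1^k\}$ and $\{\mu_1^k\}$ are bounded below, but $\{\lambda_2^k\}, \{\mu_2^k\} \to +\infty$ and $\left\{ \dfrac{x^k}{\|x^k\|} \circ \dfrac{y^k}{\|y^k\|} \right\} \not\to 0$, then $\{\|\phi_{FB}(x^k, y^k)\|\} \to \infty$.

ASSUMPTION 3.1 For any sequence $\{\zeta^k\} \subseteq \mathbb{R}^n$ satisfying $\lim_{k \to \infty} \|\zeta^k\| = \infty$, if there exists $\nu \in \{1, \ldots, m\}$ such that the sequences $\{\lambda_1(\zeta_\nu^k)\}$, $\{\lambda_1(F_\nu(\zeta^k))\}$ are bounded below, but $\{\lambda_2(\zeta_\nu^k)\}, \{\lambda_2(F_\nu(\zeta^k))\} \to \infty$, then there holds that

\[ \frac{\zeta_\nu^k}{\|\zeta_\nu^k\|} \circ \frac{F_\nu(\zeta^k)}{\|F_\nu(\zeta^k)\|} \not\to 0 \quad \text{as } k \to \infty. \tag{25} \]

ASSUMPTION 3.2 There exist $\kappa > 0$ and $r \in (0,1]$ such that the mapping $F$ satisfies

\[ \|F(\zeta)\| \le \|F(0)\| + \kappa \|\zeta\|^r \quad \text{for any } \zeta \in \mathbb{R}^n. \]

PROPOSITION 3.3 Let $\Psi_{FB}$ be given by (7) and $F = (F_1, \ldots, F_m)$ with $F_i: \mathbb{R}^n \to \mathbb{R}^{n_i}$. Then, the function $\Psi_{FB}$ is coercive under one of the following conditions:

(a) $F$ has the uniform Jordan P-property and Assumption 3.1 holds;

(b) $F$ has the uniform Jordan P-property and Assumption 3.2 holds.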

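Proof The proof is by contradiction. Assume that a sequence $\{\zeta^k\}$ exists such that $\lim_{k \to \infty} \|\zeta^k\| = \infty$ and the sequence $\{\Psi_{FB}(\zeta^k)\}$ is bounded. Corresponding to the structure of $\mathcal{K}$, for each $k$ we write $\zeta^k = (\zeta_1^k, \ldots, \zeta_m^k)$ with $\zeta_i^k \in \mathbb{R}^{n_i}$. Define the index set

\[ J := \left\{ i \in \{1, 2, \ldots, m\} \mid \{\zeta_i^k\} \text{ is unbounded} \right\}. \]

Clearly, $J \neq \emptyset$ since $\{\zeta^k\}$ is unbounded. Let $\{\xi^k\}$ be a bounded sequence with $\xi^k = (\xi_1^k, \ldots, \xi_m^k)$ and $\xi_i^k \in \mathbb{R}^{n_i}$ for $i = 1, 2, \ldots, m$, where $\xi_i^k$ for each $k$ is defined as follows:

\[ \xi_i^k = \begin{cases} 0 & \text{if } i \in J, \\ \zeta_i^k & \text{otherwise.} \end{cases} \]

(a) From the uniform Jordan P-property of $F$, there exists $\rho > 0$ such that

\[ \begin{aligned} \rho \|\zeta^k - \xi^k\|^2 &\le \max_{i=1,\ldots,m} \lambda_2\left[ (\zeta_i^k - \xi_i^k) \circ (F_i(\zeta^k) - F_i(\xi^k)) \right] \\ &= \max_{i \in J} \lambda_2\left[ \zeta_i^k \circ (F_i(\zeta^k) - F_i(\xi^k)) \right] \\ &= \lambda_2\left[ \zeta_\nu^k \circ (F_\nu(\zeta^k) - F_\nu(\xi^k)) \right] \\ &\le \sqrt{2} \left\| \zeta_\nu^k \circ (F_\nu(\zeta^k) - F_\nu(\xi^k)) \right\| \\ &\le 2 \|\zeta_\nu^k\| \|F_\nu(\zeta^k) - F_\nu(\xi^k)\|, \end{aligned} \tag{26} \]

where $\nu$ is one of the indices for which the maximum is attained and which we have, without loss of generality, assumed to be independent of $k$; the second-to-last inequality uses $\lambda_2(u) = u_1 + \|u_2\| \le \sqrt{2} \|u\|$, and the last inequality is easily shown by (8). Since $\nu \in J$, we assume without loss of generality that $\{\|\zeta_\nu^k\|\} \to \infty$. Since $\|\zeta^k - \xi^k\|^2 \ge \|\zeta_\nu^k - \xi_\nu^k\|^2 = \|\zeta_\nu^k\|^2$, dividing both sides of (26) by $\|\zeta_\nu^k\|$ then yields

\[ \rho \|\zeta_\nu^k\| \le 2 \|F_\nu(\zeta^k) - F_\nu(\xi^k)\| \le 2 \left( \|F_\nu(\zeta^k)\| + \|F_\nu(\xi^k)\| \right). \]

This, together with the boundedness of $\{F_\nu(\xi^k)\}$, implies $\{\|F_\nu(\zeta^k)\|\} \to \infty$. Thus,

\[ \{\|\zeta_\nu^k\|\} \to \infty \quad \text{and} \quad \{\|F_\nu(\zeta^k)\|\} \to \infty. \tag{27} \]

Now if $\{\lambda_1(\zeta_\nu^k)\} \to -\infty$ or $\{\lambda_1(F_\nu(\zeta^k))\} \to -\infty$, then using Lemma 3.5 (a) readily yields $\{\|\phi_{FB}(\zeta_\nu^k, F_\nu(\zeta^k))\|\} \to \infty$ and hence $\{\Psi_{FB}(\zeta^k)\} \to \infty$, which contradicts the boundedness of $\{\Psi_{FB}(\zeta^k)\}$. Otherwise, from (27) we have $\{\lambda_2(\zeta_\nu^k)\} \to \infty$ and $\{\lambda_2(F_\nu(\zeta^k))\} \to \infty$. By the given assumption, condition (25) holds. Then, $\{(\zeta_\nu^k, F_\nu(\zeta^k))\}$ satisfies the conditions of Lemma 3.5 (b), which in turn implies $\{\Psi_{FB}(\zeta^k)\} \to \infty$. This is clearly impossible.

(b) From the above discussion, (26)–(27) still hold in this case. If $\{\lambda_1(\zeta_\nu^k)\} \to -\infty$ or $\{\lambda_1(F_\nu(\zeta^k))\} \to -\infty$, then the argument of part (a) shows that this is impossible. Otherwise, from (27) we have $\{\lambda_2(\zeta_\nu^k)\} \to \infty$ and $\{\lambda_2(F_\nu(\zeta^k))\} \to \infty$. We next show that $\dfrac{\zeta_\nu^k}{\|\zeta_\nu^k\|} \circ \dfrac{F_\nu(\zeta^k)}{\|F_\nu(\zeta^k)\|} \not\to 0$ as $k \to \infty$. If not, by the continuity of $\lambda_2(\cdot)$ and (27),

\[ \lim_{k \to \infty} \frac{\lambda_2\left[ \zeta_\nu^k \circ (F_\nu(\zeta^k) - F_\nu(\xi^k)) \right]}{\|\zeta_\nu^k\| \|F_\nu(\zeta^k)\|} \le \lim_{k \to \infty} \lambda_2\left[ \frac{\zeta_\nu^k}{\|\zeta_\nu^k\|} \circ \frac{F_\nu(\zeta^k)}{\|F_\nu(\zeta^k)\|} \right] + \lim_{k \to \infty} \lambda_2\left[ \frac{-\zeta_\nu^k \circ F_\nu(\xi^k)}{\|\zeta_\nu^k\| \|F_\nu(\zeta^k)\|} \right] = 0, \tag{28} \]

where the inequality is easily shown by (8) and the equality is due to the boundedness of $\{F_\nu(\xi^k)\}$. On the other hand, from Assumption 3.2, there exist $\kappa > 0$ and $r \in (0,1]$ such that $\|F_\nu(\zeta^k)\| \le \|F(\zeta^k)\| \le \|F(0)\| + \kappa \|\zeta^k\|^r$ for each $k$, and hence,

\[ \lim_{k \to \infty} \frac{\|\zeta^k - \xi^k\|^2}{\|\zeta_\nu^k\| \|F_\nu(\zeta^k)\|} \ge \lim_{k \to \infty} \frac{\|\zeta^k - \xi^k\|^2}{\|\zeta_\nu^k\| \left( \|F(0)\| + \kappa \|\zeta^k\|^r \right)} > 0. \]

This together with the first inequality of (26) contradicts (28). Thus, the sequences $\{\zeta_\nu^k\}$ and $\{F_\nu(\zeta^k)\}$ satisfy the conditions of Lemma 3.5 (b). Consequently, we have $\{\Psi_{FB}(\zeta^k)\} \to \infty$. This is clearly impossible. □

Since the uniform Cartesian P-property implies the uniform Jordan P-property, the condition of Proposition 3.3 (a) is weaker than that of Proposition 5.2 in [20]. We also see that Assumption 3.2 is weaker than the Lipschitz continuity of $F$. When $\mathcal{K}$ reduces to the non-negative orthant $\mathbb{R}^n_+$ and the Jordan product '$\circ$' becomes the componentwise product of vectors, Assumption 3.1 holds automatically and the uniform Jordan P-property of $F$ is equivalent to saying that $F$ is a uniform P-function, so we readily recover the result of [7, Theorem 4.2] from Proposition 3.3 (a).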

4. A descent method and global convergence

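In this section, we propose a derivative-free descent algorithm based on the minimization reformulation (7). The algorithm uses the vector

\[ d(\zeta, \gamma) := -\gamma \nabla_x \psi_{FB}(\zeta, F(\zeta)) - (1-\gamma) \nabla_y \psi_{FB}(\zeta, F(\zeta)) \tag{29} \]

as the search direction, where $\gamma \in [0,1)$ is a parameter. Note that $d(\zeta, \gamma)$ for an arbitrary $\gamma \in [0,1)$ need not be a descent direction of $\Psi_{FB}$ at $\zeta$. However, the following lemma states that, when $F$ is monotone, there always exists $\bar{\gamma}(\zeta) \in (0,1]$ such that $d(\zeta, \gamma)$ is a descent direction for every $\gamma \in [0, \bar{\gamma}(\zeta))$. The idea for constructing such a direction is borrowed from [23].

LEMMA 4.1 Suppose that $F$ is monotone. If $\zeta$ is not a solution of the SOCCP, then there exists $\bar{\gamma}(\zeta) \in (0,1]$ such that $\nabla \Psi_{FB}(\zeta)^T d(\zeta, \gamma) < 0$ for all $\gamma \in [0, \bar{\gamma}(\zeta))$.

Proof Since $F$ is continuously differentiable, the function $\Psi_{FB}$ is also continuously differentiable by Lemma 3.1. Using the chain rule, the gradient of $\Psi_{FB}$ at $\zeta$ is

\[ \nabla \Psi_{FB}(\zeta) = \nabla_x \psi_{FB}(\zeta, F(\zeta)) + \nabla F(\zeta) \nabla_y \psi_{FB}(\zeta, F(\zeta)). \tag{30} \]

This together with the definition of $d(\zeta, \gamma)$ yields that

\[ \begin{aligned} \nabla \Psi_{FB}(\zeta)^T d(\zeta, \gamma) = {} & -\gamma \|\nabla_x \psi_{FB}(\zeta, F(\zeta))\|^2 - \gamma \left\langle \nabla_x \psi_{FB}(\zeta, F(\zeta)), \nabla F(\zeta) \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle \\ & - (1-\gamma) \left\langle \nabla_x \psi_{FB}(\zeta, F(\zeta)), \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle \\ & - (1-\gamma) \left\langle \nabla_y \psi_{FB}(\zeta, F(\zeta)), \nabla F(\zeta) \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle. \end{aligned} \tag{31} \]

Let

\[ q(\zeta) := -\|\nabla_x \psi_{FB}(\zeta, F(\zeta))\|^2 - \left\langle \nabla_x \psi_{FB}(\zeta, F(\zeta)), \nabla F(\zeta) \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle \]

and

\[ p(\zeta) := -\left\langle \nabla_x \psi_{FB}(\zeta, F(\zeta)), \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle - \left\langle \nabla_y \psi_{FB}(\zeta, F(\zeta)), \nabla F(\zeta) \nabla_y \psi_{FB}(\zeta, F(\zeta)) \right\rangle. \]

Then, (31) can be rewritten as

\[ \nabla \Psi_{FB}(\zeta)^T d(\zeta, \gamma) = (1-\gamma) p(\zeta) + \gamma q(\zeta). \]

Note that the first term of $p(\zeta)$ is negative by Proposition 3.1 (a) since $\zeta$ is not a solution of the SOCCP, whereas the second term is non-positive since $F$ is monotone. Therefore, we have $p(\zeta) < 0$. Let $\bar{\gamma}(\zeta)$ be defined as follows:

\[ \bar{\gamma}(\zeta) := \begin{cases} \dfrac{-p(\zeta)}{q(\zeta) - p(\zeta)} & \text{if } q(\zeta) > p(\zeta) \text{ and } \dfrac{-p(\zeta)}{q(\zeta) - p(\zeta)} \le 1; \\[1.5ex] 1 & \text{otherwise.} \end{cases} \]

We see that for all $\gamma \in [0, \bar{\gamma}(\zeta))$, the search direction $d(\zeta, \gamma)$ defined by (29) satisfies the descent condition $\nabla \Psi_{FB}(\zeta)^T d(\zeta, \gamma) < 0$. The proof is thus completed. □

Lemma 4.1 motivates us to propose the following descent algorithm with $d(\zeta, \gamma)$.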

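Algorithm 4.1

Step 0. Choose $\zeta^0 \in \mathbb{R}^n$, $\varepsilon \ge 0$, $\sigma \in (0, 1/2)$ and $\beta, \gamma \in (0,1)$ with $\beta > \gamma$. Set $k := 0$.

Step 1. If $\Psi_{FB}(\zeta^k) \le \varepsilon$, then stop; $\zeta^k$ is an approximate solution of the SOCCP.

Step 2. Let $l_k$ be the smallest non-negative integer $l$ satisfying

\[ \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) \le \Psi_{FB}(\zeta^k) - \sigma \beta^{2l} \left\| \nabla_x \psi_{FB}(\zeta^k, F(\zeta^k)) + \nabla_y \psi_{FB}(\zeta^k, F(\zeta^k)) \right\|^2, \tag{32} \]

where $d(\cdot, \cdot)$ is defined as in (29), and set

\[ d^k(\gamma^{l_k}) := d(\zeta^k, \gamma^{l_k}) \quad \text{and} \quad \zeta^{k+1} := \zeta^k + \beta^{l_k} d^k(\gamma^{l_k}). \]

Step 3. Let $k := k + 1$, and then go to Step 1.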

Algorithm 4.1 is similar to the one proposed in [23] for the NCP with a regularized FB merit function. Since there is no need to compute the gradient of $\Psi_{FB}$ or the Jacobian of $F$, Algorithm 4.1 is suitable for large-scale problems, as well as for applications where the Jacobian of $F$ is not available or is costly to compute. In addition, the step size and the search direction are adjusted simultaneously during the Armijo-type backtracking search, which may be regarded as a kind of curvilinear search.
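The following Python sketch illustrates one way the derivative-free iteration could be coded for a single cone $\mathcal{K} = \mathcal{K}^n$. It is a sketch under stated assumptions, not the authors' implementation: the step length is taken as `beta**l` and the direction parameter as `gamma**l`, the parameter values, the iteration caps `max_iter` and `max_l`, the boundary tolerance, and the test map $F(\zeta) = \zeta - a$ are all choices made for this example. The partial gradients follow the three cases of Lemma 3.1, and no Jacobian of $F$ is used.

```python
import math

def jordan(x, y):
    """Jordan product: x o y = (<x, y>, y1*x2 + x1*y2)."""
    return [sum(a * b for a, b in zip(x, y))] + \
           [y[0] * a + x[0] * b for a, b in zip(x[1:], y[1:])]

def psi_and_grads(x, y):
    """Return psi_FB(x, y) and its partial gradients (three cases of Lemma 3.1)."""
    n = len(x)
    w = [a + b for a, b in zip(jordan(x, x), jordan(y, y))]
    nw2 = math.sqrt(sum(t * t for t in w[1:]))
    lam1, lam2 = max(w[0] - nw2, 0.0), w[0] + nw2
    r1, r2 = math.sqrt(lam1), math.sqrt(lam2)
    wbar = [t / nw2 for t in w[1:]] if nw2 > 0 else [0.0] * (n - 1)
    z = [(r1 + r2) / 2] + [(r2 - r1) / 2 * t for t in wbar]
    phi = [c - a - b for c, a, b in zip(z, x, y)]
    val = 0.5 * sum(t * t for t in phi)
    if lam2 == 0.0:                      # (x, y) = (0, 0)
        return val, [0.0] * n, [0.0] * n
    if lam1 <= 1e-12 * lam2:             # x^2 + y^2 treated as a boundary point
        s = math.sqrt(x[0] ** 2 + y[0] ** 2)
        return (val, [(x[0] / s - 1) * t for t in phi],
                [(y[0] / s - 1) * t for t in phi])
    det = r1 * r2                        # det(z) for z in the interior
    zphi = sum(a * b for a, b in zip(z[1:], phi[1:]))
    g = [(z[0] * phi[0] - zphi) / det] + \
        [(-a * phi[0] + (det / z[0]) * b + (zphi / z[0]) * a) / det
         for a, b in zip(z[1:], phi[1:])]   # g = L_z^{-1} phi
    gx = [a - b for a, b in zip(jordan(x, g), phi)]
    gy = [a - b for a, b in zip(jordan(y, g), phi)]
    return val, gx, gy

def descent(F, zeta0, beta=0.5, gamma=0.25, sigma=1e-4, eps=1e-12,
            max_iter=200, max_l=60):
    """Derivative-free descent sketch; returns the last iterate and merit values."""
    zeta, history = list(zeta0), []
    for _ in range(max_iter):
        val, gx, gy = psi_and_grads(zeta, F(zeta))
        history.append(val)
        if val <= eps:
            break
        rhs = sum((a + b) ** 2 for a, b in zip(gx, gy))
        for l in range(max_l + 1):
            t, g = beta ** l, gamma ** l
            # trial point zeta + t * d(zeta, g) with d = -g*gx - (1-g)*gy
            trial = [zi - t * (g * a + (1 - g) * b)
                     for zi, a, b in zip(zeta, gx, gy)]
            if psi_and_grads(trial, F(trial))[0] <= val - sigma * t * t * rhs:
                zeta = trial
                break
        else:
            break                        # no acceptable step within max_l trials
    return zeta, history

def example_map(zeta, a=(1.0, 2.0, 0.0)):
    """Hypothetical strongly monotone test map F(zeta) = zeta - a."""
    return [zi - ai for zi, ai in zip(zeta, a)]
```

Calling `descent(example_map, [0.0, 0.0, 0.0])` starts from a merit value of 9; by construction of the acceptance test the recorded merit values never increase from one iteration to the next.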

In what follows, we analyse the global convergence of Algorithm 4.1. Without loss of generality, we assume that $\varepsilon = 0$. We first show that, under the monotonicity of $F$, every accumulation point of the sequence $\{\zeta^k\}$ is a solution of the SOCCP.

THEOREM 4.1 Suppose that $F$ is monotone. Then, Algorithm 4.1 is well-defined for any initial point $\zeta^0$. Furthermore, if $\zeta^*$ is an accumulation point of the sequence $\{\zeta^k\}$ generated by Algorithm 4.1, then $\zeta^*$ is a solution of the SOCCP.

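Proof The proof is similar to that of [23, Theorem 4.1]. We first show that, whenever $\zeta^k$ is not a solution, there exists a non-negative integer $l_k$ in Step 2 of Algorithm 4.1 such that (32) holds. Suppose not. Then for any positive integer $l$, we have

\[ \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k) > -\sigma \beta^{2l} \left\| \nabla_x \psi_{FB}(\zeta^k, F(\zeta^k)) + \nabla_y \psi_{FB}(\zeta^k, F(\zeta^k)) \right\|^2. \]

Dividing the above inequality by $\beta^l$ and passing to the limit $l \to \infty$, we get

\[ \lim_{l \to \infty} \frac{\Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k)}{\beta^l} \ge 0. \tag{33} \]

On the other hand, using the mean-value theorem, it follows that

\[ \begin{aligned} & \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, 0)) \\ & \quad = \beta^l \nabla \Psi_{FB}\left( \zeta^k + \beta^l d(\zeta^k, 0) + t \beta^l (d(\zeta^k, \gamma^l) - d(\zeta^k, 0)) \right)^T \left( d(\zeta^k, \gamma^l) - d(\zeta^k, 0) \right) \\ & \quad = \beta^l \gamma^l \nabla \Psi_{FB}\left( \zeta^k + \beta^l d(\zeta^k, 0) + t \beta^l \gamma^l h(\zeta^k) \right)^T h(\zeta^k), \end{aligned} \]

where $t$ is a constant such that $t \in (0,1)$ and $h(\zeta^k) := \nabla_y \psi_{FB}(\zeta^k, F(\zeta^k)) - \nabla_x \psi_{FB}(\zeta^k, F(\zeta^k))$. From this and the continuity of $\nabla \Psi_{FB}$, we immediately obtain

\[ \lim_{l \to \infty} \frac{\Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, 0))}{\beta^l} = 0. \]

Consequently,

\[ \begin{aligned} \lim_{l \to \infty} \frac{\Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k)}{\beta^l} = {} & \lim_{l \to \infty} \frac{\Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, \gamma^l)) - \Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, 0))}{\beta^l} \\ & + \lim_{l \to \infty} \frac{\Psi_{FB}(\zeta^k + \beta^l d(\zeta^k, 0)) - \Psi_{FB}(\zeta^k)}{\beta^l} \\ = {} & \nabla \Psi_{FB}(\zeta^k)^T d(\zeta^k, 0). \end{aligned} \tag{34} \]

Combining (34) with (33) then yields $\nabla \Psi_{FB}(\zeta^k)^T d(\zeta^k, 0) \ge 0$. This gives a contradiction, since, by Lemma 4.1, $d(\zeta^k, 0)$ must be a descent direction of $\Psi_{FB}$ at $\zeta^k$ if $\zeta^k$ is not a solution of the SOCCP. Thus, Algorithm 4.1 is well defined.

Next, we prove that any accumulation point $\zeta^*$ of $\{\zeta^k\}$ is a solution of the SOCCP. Let $\{\zeta^k\}_{k \in K}$ be a subsequence converging to $\zeta^*$. From the definition of $d(\zeta, \gamma)$, we see that $d(\zeta, \gamma)$ is continuous, which implies that $d^k(\gamma^{l_k}) = d(\zeta^k, \gamma^{l_k}) \to d^*$ as $k\ (\in K) \to \infty$. Since $\{\Psi_{FB}(\zeta^k)\}$ decreases at each iteration and is bounded below by zero, the term $\sigma \beta^{2 l_k} \|\nabla_x \psi_{FB}(\zeta^k, F(\zeta^k)) + \nabla_y \psi_{FB}(\zeta^k, F(\zeta^k))\|^2$ on the right-hand side of (32) tends to 0. We next consider two cases: $\{l_k\}_{k \in K}$ is bounded, and $\{l_k\}_{k \in K}$ is unbounded.

Case 1 $\{l_k\}_{k \in K}$ is bounded. In this case, $\{\beta^{l_k}\}_{k \in K}$ does not approach 0. Consequently,

\[ \left\| \nabla_x \psi_{FB}(\zeta^*, F(\zeta^*)) + \nabla_y \psi_{FB}(\zeta^*, F(\zeta^*)) \right\|^2 = 0. \]

From Proposition 3.2 (c), it then follows that $\zeta^*$ is a solution of the SOCCP.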