A logarithm barrier method for semi-definite programming


DOI:10.1051/ro:2008005 www.rairo-ro.org


Jean-Pierre Crouzeix¹ and Bachir Merikhi²

Abstract. This paper presents a logarithmic barrier method for solving a semi-definite linear program. The descent direction is the classical Newton direction. We propose alternative ways to determine the step-size along this direction which are more efficient than classical line-searches.

Keywords. Linear semi-deﬁnite programming, barrier methods, line-search.

Mathematics Subject Classification. 90C22, 90C05, 90C51.

1. Introduction

In this paper we present an algorithm for solving the optimization problem

$$m_d = \inf_{y} \Big\{ b^t y \;:\; \sum_{i=1}^m y_i A_i - C \in K,\ y \in \mathbb{R}^m \Big\}, \tag{D}$$

where $K$ denotes the cone of $n \times n$ symmetric positive semi-definite matrices, and the vector $b \in \mathbb{R}^m$ and the $n \times n$ symmetric matrices $C$ and $A_i$, $i = 1, \dots, m$, are given.

The dual problem of (D) is

$$m_p = \max_{X} \big\{ \langle C, X \rangle \;:\; X \in K,\ \langle A_i, X \rangle = b_i \ \ \forall i = 1, \dots, m \big\}, \tag{P}$$

Received September 01, 2006. Accepted September 01, 2006.

¹ LIMOS, Université Blaise Pascal, Campus des Cézaux, 63174 Aubière, France.

² Laboratoire d'optimisation, Université Ferhat Abbas, Algérie; the research of this author has been made possible thanks to a PROFAS grant and the hospitality of Université Blaise Pascal.

Article published by EDP Sciences. © EDP Sciences, ROADEF, SMAI 2008


where $\langle C, X \rangle$ denotes the trace of the matrix $C^t X$. Recall that $\langle \cdot, \cdot \rangle$ is an inner product on the space of $n \times n$ matrices.

These problems are linear. Because their feasible sets involve the cone of positive semi-definite matrices, a non-polyhedral convex cone, they are called linear semi-definite programs. Such problems have received particular attention since the papers by Alizadeh [1,2], from both the theoretical and the algorithmic points of view; see for instance the references [1–4,6,7].

Under suitable conditions, solving (D) is equivalent to solving (P): the optimal solutions of one problem are easily obtained once an optimal solution of the other problem is known. In this paper, the problem (D) is approximated by the problem $(D_r)$, $r > 0$,

$$m(r) = \inf\, [\, f_r(y) \;:\; y \in \mathbb{R}^m \,], \tag{$D_r$}$$

where the barrier function $f_r : \mathbb{R}^m \to (-\infty, +\infty]$ is defined by

$$f_r(y) = \begin{cases} b^t y + n r \ln r - r \ln\Big[\det\Big(\sum_{i=1}^m y_i A_i - C\Big)\Big] & \text{if } y \in \widehat{Y}, \\ +\infty & \text{if not,} \end{cases}$$

with

$$\widehat{Y} = \Big\{ y \in \mathbb{R}^m \;:\; \text{the matrix } \sum_{i=1}^m y_i A_i - C \in \widehat{K} \Big\},$$

and $\widehat{K} = \operatorname{int}(K)$ is the cone of $n \times n$ symmetric positive definite matrices. This problem is solved via a classical Newton descent method. The difficulty lies in the line-search: the presence of a determinant in the definition of $f_r$ induces high computational costs in classical exact or approximate line-searches. Here, instead of minimizing $f_r$ along the descent direction $d$ at the current point $x$, we minimize a function $\hat\theta$ such that

$$\frac{1}{r}\,[f_r(x + t d) - f_r(x)] = \theta(t) \le \hat\theta(t) \ \ \forall t > 0, \qquad \hat\theta(0) = \theta(0), \qquad \hat\theta'(0) = \theta'(0) < 0.$$

This function $\hat\theta$ must be chosen so that its minimizer $\bar t$ is easily obtained, and it must be close enough to $\theta$ to give a significant decrease of $f_r$ in the iteration step. We propose in this paper functions $\hat\theta$ for which the minimizer $\bar t$ is obtained explicitly; a good quality of the approximation of $\theta$ by $\hat\theta$ is ensured by the condition $\hat\theta''(0) = \theta''(0)$.

In the next section, we briefly recall some results in linear semi-definite programming. Section 3 studies the problem $(D_r)$, in particular the behavior of its optimal value and its optimal solutions when $r \to 0$. Section 4 shows how to compute the Newton descent direction. Section 6 is devoted to the determination of efficient approximations $\hat\theta$; these approximations are deduced from inequalities shown in Section 5. The algorithm is summarized in Section 7, and the numerical experiments presented in Section 8 show the efficiency of the approximations when compared with classical line-searches.


2. A brief background in linear semi-definite programming

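Throughout the paper, we use the following notation:

$$Y = \Big\{ y \in \mathbb{R}^m : \sum_{i=1}^m y_i A_i - C \in K \Big\}, \qquad F = \{ X \in K : \langle A_i, X \rangle = b_i \ \forall i \},$$

$$\widehat{Y} = \Big\{ y \in \mathbb{R}^m : \sum_{i=1}^m y_i A_i - C \in \widehat{K} \Big\}, \qquad \widehat{F} = \{ X \in F : X \in \widehat{K} \},$$

where $\widehat{K} = \operatorname{int}(K)$ is the cone of symmetric positive definite matrices.

It is easily seen that $-\infty \le m_p \le m_d \le +\infty$ (weak duality). In this paper we assume that the following two assumptions hold:

(H1) The system of equations $\langle A_i, X \rangle = b_i$, $i = 1, \dots, m$, is of rank $m$.

(H2) The sets $\widehat{Y}$ and $\widehat{F}$ are non empty.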

Then it is known that (see for instance [1,3]):

(a) −∞< m_p =m_d <+∞.

(b) The sets of optimal solutions of (P) and (D) are non empty convex compact sets.

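(c) If $\bar X$ is an optimal solution of (P), then $\bar y$ is an optimal solution of (D) if and only if

$$\bar y \in Y \quad \text{and} \quad \Big( \sum_{i=1}^m \bar y_i A_i - C \Big) \bar X = 0.$$

(d) If $\bar y$ is an optimal solution of (D), then $\bar X$ is an optimal solution of (P) if and only if

$$\bar X \in F \quad \text{and} \quad \Big( \sum_{i=1}^m \bar y_i A_i - C \Big) \bar X = 0.$$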

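3. The problem (D_r): theoretical aspects

Recall that $(D_r)$, $r > 0$, is the problem

$$m(r) = \inf\, [\, f_r(y) \;:\; y \in \mathbb{R}^m \,], \tag{$D_r$}$$

with $f_r : \mathbb{R}^m \to (-\infty, +\infty]$ defined by

$$f_r(y) = \begin{cases} b^t y + n r \ln r - r \ln\Big[\det\Big(\sum_{i=1}^m y_i A_i - C\Big)\Big] & \text{if } y \in \widehat{Y}, \\ +\infty & \text{if not.} \end{cases}$$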

We start with the study of this function.


3.1. f_r is a twice differentiable strictly convex function

The following notation will be used in the expressions of the gradient and the Hessian of $f_r$: given $y \in \widehat{Y}$, we introduce the $n \times n$ symmetric positive definite matrix $B(y)$ and the lower triangular $n \times n$ matrix $L(y)$ such that

$$B(y) = \sum_{i=1}^m y_i A_i - C = L(y) L^t(y).$$

Next, for $i, j = 1, 2, \dots, m$, we define

$$A_i(y) = [L(y)]^{-1} A_i [L^t(y)]^{-1}, \qquad b_i(y) = \operatorname{trace}(A_i(y)) = \operatorname{trace}(A_i B^{-1}(y)),$$

$$\Delta_{ij}(y) = \operatorname{trace}(B^{-1}(y) A_i B^{-1}(y) A_j) = \operatorname{trace}(A_i(y) A_j(y)).$$

Thus $b(y)$ is a vector of $\mathbb{R}^m$ and $\Delta(y)$ is a symmetric $m \times m$ matrix.

Theorem 1. The function $f_r$ is twice continuously differentiable on $\widehat{Y}$. Actually, for all $y \in \widehat{Y}$ we have:

(a) $\nabla f_r(y) = b - r\, b(y)$;

(b) $\nabla^2 f_r(y) = r\, \Delta(y)$;

(c) the matrix $\Delta(y)$ is positive definite.

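Proof. (a) Denote by $(e_1, e_2, \dots, e_m)$ the canonical basis of $\mathbb{R}^m$. Let $i \in \{1, \dots, m\}$ and $z_i \in \mathbb{R}$, $z_i \ne 0$. Then,

$$\begin{aligned}
\frac{f_r(y + z_i e_i) - f_r(y)}{z_i}
&= b_i - \frac{r}{z_i} \big[ \ln\det(B(y + z_i e_i)) - \ln\det(B(y)) \big], \\
&= b_i - \frac{r}{z_i} \big[ \ln\det\big(L(y)[I + z_i A_i(y)] L^t(y)\big) - \ln\det(B(y)) \big], \\
&= b_i - \frac{r}{z_i} \ln\det(I + z_i A_i(y)), \\
&= b_i - \frac{r}{z_i} \ln\big[ 1 + z_i \operatorname{trace}(A_i(y)) + z_i\, \varepsilon(z_i) \big],
\end{aligned}$$

where the function $\varepsilon$ is such that $\varepsilon(z) \to 0$ when $z \to 0$. Pass to the limit when $z_i \to 0$.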

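(b) In the same manner, given $i, j \in \{1, \dots, m\}$, let us consider

$$\frac{b_i(y + z_j e_j) - b_i(y)}{z_j} = \frac{1}{z_j} \operatorname{trace}\big( A_i [ B^{-1}(y + z_j e_j) - B^{-1}(y) ] \big).$$

But,

$$\begin{aligned}
B^{-1}(y + z_j e_j) - B^{-1}(y) &= [B(y) + z_j A_j]^{-1} - B^{-1}(y), \\
&= [B(y)(I + z_j B^{-1}(y) A_j)]^{-1} - B^{-1}(y), \\
&= [(I + z_j B^{-1}(y) A_j)^{-1} - I]\, B^{-1}(y).
\end{aligned}$$

Since $(I + z_j B^{-1}(y) A_j)^{-1} = I - z_j B^{-1}(y) A_j + O(z_j^2)$, neglecting the second order terms in $z_j$ we obtain

$$\frac{b_i(y + z_j e_j) - b_i(y)}{z_j} \sim -\operatorname{trace}\big( A_i B^{-1}(y) A_j B^{-1}(y) \big).$$

Pass to the limit when $z_j \to 0$: the partial derivative of $b_i$ with respect to $y_j$ is $-\Delta_{ij}(y)$, so that the second partial derivatives of $f_r$ are $-r\,(-\Delta_{ij}(y)) = r \Delta_{ij}(y)$. On the other hand the equality $\operatorname{trace}(B^{-1}(y) A_i B^{-1}(y) A_j) = \operatorname{trace}(A_i(y) A_j(y))$ is immediate.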

(c) Let $d \ne 0$. Next, let $M = \sum_{i=1}^m d_i A_i(y)$. Then (H1) implies $M \ne 0$. On the other hand,

$$\langle \nabla^2 f_r(y) d, d \rangle = r \operatorname{trace}\Big( \sum_{i,j} d_i d_j A_i(y) A_j(y) \Big) = r \operatorname{trace}(M^2) > 0,$$

from which we deduce that the matrix $\nabla^2 f_r(y)$ is positive definite.

Since $f_r$ is strictly convex, $(D_r)$ has at most one optimal solution.
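The formulas of Theorem 1 can be checked numerically on a small instance. The sketch below assumes, purely for illustration, that every $A_i$ and $C$ is diagonal, so that determinants and inverses reduce to scalar operations; the function names and the data are hypothetical and do not come from the paper.

```python
import math

def diag_of_B(y, a, c):
    """Diagonal of B(y) = sum_i y_i A_i - C when A_i = diag(a[i]) and C = diag(c)."""
    return [sum(y[i] * a[i][k] for i in range(len(y))) - c[k] for k in range(len(c))]

def f_r(y, a, c, b, r):
    """Barrier function f_r(y); +inf outside the interior of the feasible set."""
    diag = diag_of_B(y, a, c)
    if min(diag) <= 0.0:
        return math.inf
    n = len(c)
    return (sum(bi * yi for bi, yi in zip(b, y)) + n * r * math.log(r)
            - r * sum(math.log(v) for v in diag))

def gradient_and_hessian(y, a, c, b, r):
    """Return (b - r*b(y), r*Delta(y)) as in Theorem 1, for diagonal data only."""
    m, n = len(y), len(c)
    diag = diag_of_B(y, a, c)
    b_y = [sum(a[i][k] / diag[k] for k in range(n)) for i in range(m)]
    delta = [[sum(a[i][k] * a[j][k] / diag[k] ** 2 for k in range(n))
              for j in range(m)] for i in range(m)]
    grad = [b[i] - r * b_y[i] for i in range(m)]
    hess = [[r * delta[i][j] for j in range(m)] for i in range(m)]
    return grad, hess

def finite_difference_gradient(y, a, c, b, r, h=1e-6):
    """Central differences of f_r, used only to cross-check the closed form."""
    out = []
    for i in range(len(y)):
        up = list(y)
        up[i] += h
        down = list(y)
        down[i] -= h
        out.append((f_r(up, a, c, b, r) - f_r(down, a, c, b, r)) / (2 * h))
    return out

# Hypothetical data with m = 2 and n = 3.
a = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
c = [0.0, 0.0, 0.0]
b = [1.0, 1.0]
y, r = [1.0, 2.0], 0.5
grad, hess = gradient_and_hessian(y, a, c, b, r)
fd = finite_difference_gradient(y, a, c, b, r)
```

Comparing `grad` with `fd` gives an independent check of part (a); the positivity of the leading minors of `hess` illustrates part (c) on this instance.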

3.2. (D_r) has one unique optimal solution

Because the convex function $f_r$ takes the value $+\infty$ on the boundary of its domain and is differentiable on the interior, it is lower semi-continuous. In order to prove that $(D_r)$ has an optimal solution, it suffices to prove that the recession cone of $f_r$ is reduced to the origin. Before that, we show the following result:

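Proposition 1. $d = 0$ whenever $b^t d \le 0$ and $\sum_{i=1}^m d_i A_i \in K$.

Proof. Assume that $d \ne 0$, $b^t d \le 0$ and $\tilde C = \sum_{i=1}^m d_i A_i \in K$. Then (H1) implies $\tilde C \ne 0$. Take some $X \in \widehat{F} \subset \widehat{K}$; such an $X$ exists in view of assumption (H2). Then,

$$0 < \langle \tilde C, X \rangle = \sum_{i=1}^m d_i \langle A_i, X \rangle = b^t d,$$

which contradicts $b^t d \le 0$. The proposition is proved.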

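Theorem 2. $d = 0$ if $(f_r)_\infty(d) \le 0$.

Proof. Fix some $y \in \widehat{Y}$; such a $y$ exists in view of assumption (H2). The recession function $(f_r)_\infty$ of $f_r$ is defined as

$$(f_r)_\infty(d) = \lim_{t \to +\infty} \xi(t), \qquad \xi(t) = \frac{f_r(y + t d) - f_r(y)}{t}.$$

Let $B = B(y) = \sum_{i=1}^m y_i A_i - C$. Since $B$ is a positive definite symmetric matrix, there exists a non singular lower triangular matrix $L$ such that $B = L L^t$. Given $d$, set $H(d) = \sum_{i=1}^m d_i A_i$. Then, for any $t$ such that the matrix $B + t H(d)$ is positive definite,

$$\begin{aligned}
\xi(t) &= b^t d - r t^{-1} \big[ \ln\det(B + t H(d)) - \ln\det(B) \big], \\
&= b^t d - r t^{-1} \ln\det(I + t E(d)),
\end{aligned}$$

where $E(d) = L^{-1} H(d) (L^{-1})^t$. We deduce that

$$\xi(t) = \begin{cases} b^t d - r t^{-1} \ln\det(I + t E(d)) & \text{if } I + t E(d) \in \widehat{K}, \\ +\infty & \text{otherwise.} \end{cases}$$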

The condition $(f_r)_\infty(d) \le 0$ is therefore equivalent to saying that $H(d)$ is positive semi-definite (hence $E(d)$ is also positive semi-definite) and

$$b^t d \le r \lim_{t \to \infty} \frac{1}{t} \ln\det(I + t E(d)) = r \lim_{t \to \infty} \sum_{i=1}^n \frac{1}{t} \ln(1 + t \lambda_i(d)) = 0,$$

where $\lambda_i(d)$ denote the eigenvalues of $E(d)$. Hence $b^t d \le 0$ and $H(d) \in K$, and Proposition 1 gives $d = 0$.

We denote by $y(r)$ or $y_r$ the unique optimal solution of $(D_r)$.

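3.3. When r → 0

Next, we turn to the behavior of the optimal value $m(r)$ and of the optimal solution $y(r)$ of $(D_r)$ when $r \to 0$. For that, let us introduce the function $h : \mathbb{R}^m \times \mathbb{R} \to (-\infty, +\infty]$ defined by

$$h(y, t) = \begin{cases} b^t y - \ln\det\Big( \sum_{i=1}^m y_i A_i - t C \Big) & \text{if } \sum_{i=1}^m y_i A_i - t C \in \widehat{K}, \\ +\infty & \text{otherwise.} \end{cases}$$

It is easily shown that $h$ is convex and lower semi-continuous. Next, consider the function $\varphi : \mathbb{R}^m \times \mathbb{R} \times \mathbb{R} \to (-\infty, +\infty]$ defined by

$$\varphi(y, t, r) = \begin{cases} r\, h(r^{-1} y, r^{-1} t) & \text{if } r > 0, \\ h_\infty(y, t) & \text{if } r = 0, \\ +\infty & \text{if } r < 0. \end{cases}$$

Then, $\varphi$ is also lower semi-continuous and convex, see for instance Rockafellar [8].

Next, define $f : \mathbb{R}^m \times \mathbb{R} \to (-\infty, +\infty]$ by

$$f(y, r) = \varphi(y, 1, r).$$

$f$ is also convex and lower semi-continuous. By construction,

$$f(y, r) = \begin{cases} f_r(y) & \text{if } r > 0, \\ b^t y & \text{if } r = 0,\ y \in Y, \\ +\infty & \text{otherwise.} \end{cases} \tag{2}$$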

Define $m : \mathbb{R} \to (-\infty, +\infty]$ by $m(r) = \inf_y [\, f(y, r) : y \in \mathbb{R}^m \,]$. This function is convex. Furthermore $m(0) = m_d$ and $m(r)$ is the optimal value of $(D_r)$ when $r > 0$. It is clear that for $r > 0$

$$m(r) = f_r(y(r)) = f(y(r), r)$$

and

$$0 = \nabla f_r(y(r)) = \nabla_y f(y(r), r) = b - r\, b(y_r).$$

Theorem 3. The functions $m$ and $y$ are continuously differentiable on $(0, +\infty)$. We have, for all $r > 0$,

$$r \Delta(y_r)\, y'(r) - b(y_r) = 0, \qquad m'(r) = n + n \ln(r) - \ln(\det(B(y_r))).$$

Moreover,

$$m_d = m(0) \le b^t y(r) \le m_d + n r. \tag{3}$$

Proof. Let $\bar r > 0$. Then $\nabla_y f(y(\bar r), \bar r) = 0$ because $y(\bar r)$ is an optimal solution of $(D_{\bar r})$. The function $f$ is twice continuously differentiable on $\widehat{Y} \times (0, +\infty)$ and the matrix $\nabla^2_{yy} f(y(\bar r), \bar r)$ is positive definite. Applying the implicit function theorem to the equation $0 = T(y, r) = \nabla_y f(y, r)$ at the point $(y(\bar r), \bar r)$, we deduce that in a neighborhood of $\bar r$ the function $y$ is continuously differentiable and

$$\nabla^2_{yy} f(y_r, r)\, y'(r) - b(y_r) = 0.$$

Since $m(r) = f(y(r), r)$ and $y$ is continuously differentiable on $(0, \infty)$, $m$ is also continuously differentiable and

$$\begin{aligned}
m'(r) &= f'_y(y(r), r)\, y'(r) + f'_r(y(r), r) \\
&= n + n \ln(r) - \ln[\det(B(y_r))]
\end{aligned}$$

because $f'_y(y(r), r) = [\nabla_y f(y(r), r)]^t = 0$. Next, because the function $m$ is convex,

$$m(0) \ge m(r) + (0 - r)\, m'(r),$$

from which we obtain

$$+\infty > m_d = m(0) \ge b^t y(r) - n r > -\infty.$$

On the other hand $y_r \in \widehat{Y} \subset Y$ and therefore $b^t y(r) \ge m_d$.
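Inequality (3) can be observed on a toy instance. The sketch below assumes $m = 1$, $b = 1$, $A_1 = I$ and $C = \operatorname{diag}(c)$ (a choice made here for illustration, not taken from the paper), so that (D) reads: minimize $y$ subject to $y \ge c_k$ for all $k$, with $m_d = \max_k c_k$. The stationarity condition $b - r\, b(y) = 0$ becomes $\sum_k r / (y - c_k) = 1$, which is solved by bisection.

```python
def central_point(c, r, iters=200):
    """y(r) for the toy problem: root of sum_k r / (y - c_k) = 1 with y > max(c)."""
    n = len(c)
    m_d = max(c)

    def g(y):
        return sum(r / (y - ck) for ck in c)

    # g is decreasing on (m_d, +inf), g -> +inf at m_d, and g < 1 at the right end.
    lo, hi = m_d, m_d + n * r + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:   # interval exhausted in floating point
            break
        if g(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

c = [0.0, 0.5, 1.0]          # hypothetical data, so m_d = 1 and n = 3
points = {r: central_point(c, r) for r in (1.0, 0.1, 0.01)}
```

The bracket deliberately extends beyond $m_d + n r$, so the fact that each computed `points[r]` lands inside $[m_d, m_d + n r]$ is an observation about the instance rather than a consequence of the search interval.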


Let us denote by $S_d$ the set of optimal solutions of (D); we know that this set is closed, convex, bounded and not empty. The distance of a point $y$ to the set $S_d$ is defined as usual by

$$d(y, S_d) = \inf_z\, [\, \| y - z \| \;:\; z \in S_d \,].$$

The following result concerns the behavior of $y_r$ and $m(r)$ when $r \to 0$.

Theorem 4. Assume that $r \to 0$; then $d(y_r, S_d) \to 0$ and $m(r) \to m_d$.

Proof. Let us consider the multivalued map $S$ defined on $\mathbb{R}$ by

$$S(r) = \{ y \in Y : b^t y \le m_d + n r \}.$$

Its graph is closed and $S(r) = \emptyset$ if $r < 0$. If $r > 0$, then $\emptyset \ne S(0) = S_d \subset S(r)$, $y_r \in S(r)$ and $S(r)$ is a closed convex set. The recession cone of $S(r)$, $r > 0$, coincides with the recession cone of $S(0)$. Hence $S(r)$ is compact because $S(0)$ is so. We deduce that the multivalued map $S$ is upper semi-continuous (USC) on $[0, +\infty)$ because it is compact-valued with a closed graph. Since $y_r \in S(r)$, it follows that $d(y_r, S(0)) \to 0$ when $r \to 0$.

It remains to prove that $m(r) \to m_d = m(0)$ when $r \to 0$. Since the function $m$ is convex, it is enough to prove that it is lower semi-continuous at 0. We proceed by contradiction: if not, there exist $\lambda < m_d$ and a sequence $\{r_k\}$ of positive numbers converging to 0 such that $m(r_k) < \lambda$. Let $y_k = y(r_k)$. Then there exist $\bar y \in S_d$ and a sub-sequence $\{y_{k_l}\}$ converging to $\bar y$. Since the function $f$ is lower semi-continuous on $\mathbb{R}^m \times \mathbb{R}$ and $f(\bar y, 0) = m_d > \lambda$, one has for $l$ large enough

$$\lambda > m(r_{k_l}) = f(y_{k_l}, r_{k_l}) > \lambda,$$

which is not possible.

4. The Newton descent direction and the line-search

Due to the presence of the barrier function, the problem (D_r) can be considered as unconstrained. This problem will be solved via a classical descent algorithm.

Because the function $f_r$ takes the value $+\infty$ on the boundary of $\widehat{Y}$, the iterates stay in $\widehat{Y}$. Thus, the method that we propose is an interior point method.

Assume that our current iterate is $y \in \widehat{Y}$. For descent direction $d$ at $y$, we take the solution of the linear system

$$[\nabla^2 f_r(y)]\, d = -\nabla f_r(y).$$

According to Theorem 1, the linear system is equivalent to the system

$$\Delta(y)\, d = b(y) - \frac{1}{r}\, b \tag{1}$$

with $B(y)$, $b(y)$ and $\Delta(y)$ defined as in Section 3.1. The matrix $\Delta(y)$ being positive definite, the linear system (1) can be efficiently solved via a Cholesky decomposition. Of course, we assume $\nabla f_r(y) \ne 0$ (if not, the optimum is reached). It follows that $d \ne 0$.
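A minimal sketch of how system (1) might be solved by a Cholesky factorization followed by two triangular solves is given below. It is written in plain Python for readability; the function names and the numerical data are illustrative assumptions, and a production code would rather call an optimized linear algebra library.

```python
import math

def cholesky(a):
    """Lower triangular L with L L^t = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def solve_spd(a, rhs):
    """Solve a x = rhs through L z = rhs (forward) and L^t x = z (backward)."""
    low = cholesky(a)
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def newton_direction(delta, b_y, b, r):
    """Direction d solving Delta(y) d = b(y) - b / r, i.e. system (1)."""
    rhs = [b_y[i] - b[i] / r for i in range(len(b))]
    return solve_spd(delta, rhs)

# Hypothetical values of Delta(y), b(y), b and r for a problem with m = 2.
d = newton_direction([[4.0, 2.0], [2.0, 3.0]], [3.0, 2.0], [1.0, 1.0], 1.0)
```

With these numbers the right-hand side is $(2, 1)^t$ and the solution is $d = (0.5, 0)^t$, which can be checked by substitution.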

The next step in the algorithm consists in the choice of $\bar t > 0$ giving a significant decrease of the function $f_r$ on the half line $y + t d$, $t > 0$. Then, the next iterate will be taken equal to $y + \bar t d$. To do that, we consider the function

$$\theta(t) = \frac{1}{r}\,[f_r(y + t d) - f_r(y)], \qquad y + t d \in \widehat{Y},$$

that is,

$$\theta(t) = \frac{t}{r}\, b^t d - \ln\det(B(y + t d)) + \ln\det(B(y)).$$

Since $\nabla^2 f_r(y)\, d = -\nabla f_r(y)$ one has

$$d^t \nabla^2 f_r(y)\, d = -d^t \nabla f_r(y) = r\, d^t b(y) - d^t b.$$

In order to simplify the notation, $y$ and $d$ staying fixed in the following, we set

$$B = B(y) = \sum_{i=1}^m y_i A_i - C \quad \text{and} \quad H = \sum_{i=1}^m d_i A_i.$$

Since $B$ is symmetric and positive definite, there exists a lower triangular matrix $L$ such that $B = L L^t$. Next, we set

$$E = L^{-1} H [L^{-1}]^t.$$

Since $d \ne 0$, assumption (H1) implies $H \ne 0$, from which we have $E \ne 0$.

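With this notation, $\operatorname{trace}(E) = d^t b(y)$ and $\operatorname{trace}(E^2) = d^t \Delta(y)\, d$, so that system (1) gives $\frac{1}{r} b^t d = \operatorname{trace}(E) - \operatorname{trace}(E^2)$. Hence, for all $t > 0$ such that $I + t E$ is positive definite,

$$\theta(t) = t\,[\operatorname{trace}(E) - \operatorname{trace}(E^2)] - \ln\det(I + t E). \tag{2}$$

Denote by $\lambda_i$ the eigenvalues of the symmetric matrix $E$; then

$$\theta(t) = \sum_{i=1}^n \big[ t(\lambda_i - \lambda_i^2) - \ln(1 + t \lambda_i) \big], \qquad t \in [0, \hat t), \tag{3}$$

where

$$\hat t = \sup\,[\, t : 1 + t \lambda_i > 0 \text{ for all } i \,] = \sup\,[\, t : y + t d \in \widehat{Y} \,]. \tag{4}$$

Observe that $\hat t = +\infty$ if $E$ is positive semi-definite and $0 < \hat t < +\infty$ if not. It is clear that $\theta$ is convex on $[0, \hat t)$, $\theta(0) = 0$ and

$$0 < \sum_i \lambda_i^2 = \theta''(0) = -\theta'(0).$$

Also $\theta(t) \to +\infty$ when $t \to \hat t$. It follows that there exists a unique $t_{opt}$ such that $\theta'(t_{opt}) = 0$, and $\theta$ reaches its minimum at this point.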
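For intuition, formulas (3) and (4) can be evaluated directly when the eigenvalues are known. The sketch below assumes the $\lambda_i$ are available, which is exactly the expensive step the paper wants to avoid, and locates $t_{opt}$ by bisection on $\theta'$; the eigenvalues used are hypothetical.

```python
import math

def t_hat(lams):
    """Upper end of the domain of theta: sup{t : 1 + t*lam_i > 0 for all i}."""
    negative = [lam for lam in lams if lam < 0.0]
    return math.inf if not negative else min(-1.0 / lam for lam in negative)

def theta(t, lams):
    """Formula (3)."""
    return sum(t * (lam - lam * lam) - math.log(1.0 + t * lam) for lam in lams)

def theta_prime(t, lams):
    """Derivative of formula (3) with respect to t."""
    return sum((lam - lam * lam) - lam / (1.0 + t * lam) for lam in lams)

def t_opt(lams, iters=200):
    """Root of theta' on (0, t_hat), found by bisection (theta' is increasing)."""
    lo, hi = 0.0, t_hat(lams)
    if math.isinf(hi):
        hi = 1.0
        for _ in range(1000):            # expand until theta' changes sign
            if theta_prime(hi, lams) >= 0.0:
                break
            hi *= 2.0
        else:
            raise ValueError("theta' stays negative: theta has no minimizer")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:
            break
        if theta_prime(mid, lams) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lams = [0.5, -0.25, 0.1]     # hypothetical eigenvalues of E
t_star = t_opt(lams)
```

Here $\hat t = 4$ because the only negative eigenvalue is $-0.25$, and $\theta'(0) = -\sum_i \lambda_i^2 = -0.3225$.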

Unfortunately, there is no explicit formula giving $t_{opt}$, and solving the equation $\theta'(t) = 0$ by iterative methods requires successive computations of $\theta$ and its derivatives. These computations have a high numerical cost because the expression of $\theta$ in (2) contains a determinant that is not easily handled, and (3) requires the knowledge of the eigenvalues of $E$, a difficult numerical problem. This leads us to think of alternative approaches.

Once $E$ is computed, it is easy to compute the two following quantities:

$$\operatorname{trace}(E) = \sum_i e_{ii} = \sum_i \lambda_i \quad \text{and} \quad \operatorname{trace}(E^2) = \sum_{i,j} e_{ij}^2 = \sum_i \lambda_i^2.$$

In Section 6, we take advantage of these data to propose lower bounds of $\hat t$ and functions bounded from below by $\theta$ (that is, upper approximations of $\theta$). Before that, we look at some useful inequalities on a sample of numbers when the sum of the numbers and the sum of their squares are known.
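The two traces give the mean, the standard deviation and the Euclidean norm of the eigenvalues without any eigenvalue computation. A small sketch follows, with illustrative function names and a $2 \times 2$ example whose eigenvalues (1 and 3) are known by hand.

```python
import math

def trace(e):
    """Sum of the diagonal entries, equal to the sum of the eigenvalues."""
    return sum(e[i][i] for i in range(len(e)))

def trace_of_square(e):
    """trace(E^2) = sum of squared entries; valid because E is symmetric."""
    return sum(v * v for row in e for v in row)

def eigenvalue_statistics(e):
    """Return (mean, standard deviation, squared Euclidean norm) of the eigenvalues."""
    n = len(e)
    mean = trace(e) / n
    norm_sq = trace_of_square(e)
    variance = max(norm_sq / n - mean * mean, 0.0)   # clip rounding noise
    return mean, math.sqrt(variance), norm_sq

# Symmetric example with eigenvalues 1 and 3.
lam_bar, sigma_lam, norm_sq = eigenvalue_statistics([[2.0, 1.0], [1.0, 2.0]])
```

For this matrix the mean is 2, the standard deviation is 1 and the squared norm is $1^2 + 3^2 = 10$.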

5. Some useful inequalities

As usual in statistics, given a sample of $n$ real numbers $x_1, x_2, \dots, x_n$, we consider their arithmetic mean $\bar x$ and their standard deviation $\sigma_x$. These quantities are defined as follows:

$$\bar x = \frac{1}{n} \sum_i x_i \quad \text{and} \quad \sigma_x^2 = \frac{1}{n} \sum_i x_i^2 - \bar x^2 = \frac{1}{n} \sum_i (x_i - \bar x)^2.$$

The following result is due to Wolkowicz-Styan [10], see also Crouzeix-Seeger [5] for additional results.

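Proposition 2.

$$\bar x - \sigma_x \sqrt{n-1} \le \min_i x_i \le \bar x - \frac{\sigma_x}{\sqrt{n-1}},$$

$$\bar x + \frac{\sigma_x}{\sqrt{n-1}} \le \max_i x_i \le \bar x + \sigma_x \sqrt{n-1}.$$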
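A short numerical illustration of Proposition 2 is sketched below; the sample is arbitrary and the function names are ours.

```python
import math

def mean_and_std(xs):
    """Arithmetic mean and (population) standard deviation of the sample."""
    n = len(xs)
    mean = sum(xs) / n
    variance = max(sum(x * x for x in xs) / n - mean * mean, 0.0)
    return mean, math.sqrt(variance)

def extreme_value_bounds(xs):
    """Intervals containing min(xs) and max(xs), from the mean and deviation only."""
    n = len(xs)
    mean, std = mean_and_std(xs)
    root = math.sqrt(n - 1)                      # requires n >= 2
    min_interval = (mean - std * root, mean - std / root)
    max_interval = (mean + std / root, mean + std * root)
    return min_interval, max_interval

sample = [1.0, 2.0, 4.0, 7.0]
min_interval, max_interval = extreme_value_bounds(sample)
```

For this sample the mean is 3.5 and the variance 5.25, so the minimum 1 lies in roughly $[-0.47, 2.18]$ and the maximum 7 in roughly $[4.82, 7.47]$.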

In the particular case where all $x_i$ are positive, one deduces

$$n \ln(\bar x - \sigma_x \sqrt{n-1}) \le \sum_{i=1}^n \ln(x_i) \le n \ln(\bar x + \sigma_x \sqrt{n-1}),$$

where, by convention, $\ln(t) = -\infty$ if $t \le 0$. The next result is still better.

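Theorem 5. Assume that $x_i > 0$ for $i = 1, 2, \dots, n$. Then

$$n \ln(\bar x - \sigma_x \sqrt{n-1}) \le A \le \sum_{i=1}^n \ln(x_i) \le B \le n \ln(\bar x),$$

with

$$A = (n-1) \ln\Big( \bar x + \frac{\sigma_x}{\sqrt{n-1}} \Big) + \ln(\bar x - \sigma_x \sqrt{n-1}),$$

and

$$B = \ln(\bar x + \sigma_x \sqrt{n-1}) + (n-1) \ln\Big( \bar x - \frac{\sigma_x}{\sqrt{n-1}} \Big).$$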
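The chain of inequalities in Theorem 5 can be checked on a small positive sample. The sketch below uses the convention $\ln(t) = -\infty$ for $t \le 0$; the sample and the function names are illustrative.

```python
import math

def log_or_minus_inf(t):
    """ln(t) with the convention ln(t) = -inf when t <= 0."""
    return math.log(t) if t > 0.0 else -math.inf

def log_sum_bounds(xs):
    """Bounds A <= sum(ln x_i) <= B of Theorem 5, from the mean and deviation only."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(max(sum(x * x for x in xs) / n - mean * mean, 0.0))
    root = math.sqrt(n - 1)                      # requires n >= 2
    lower = (n - 1) * log_or_minus_inf(mean + std / root) \
        + log_or_minus_inf(mean - std * root)
    upper = log_or_minus_inf(mean + std * root) \
        + (n - 1) * log_or_minus_inf(mean - std / root)
    crude_lower = n * log_or_minus_inf(mean - std * root)
    crude_upper = n * log_or_minus_inf(mean)
    return crude_lower, lower, upper, crude_upper

sample = [2.0, 3.0, 3.0, 4.0]
crude_lower, bound_a, bound_b, crude_upper = log_sum_bounds(sample)
log_sum = sum(math.log(x) for x in sample)
```

On this sample the five quantities are approximately 2.30, 4.25, 4.28 (the exact sum), 4.30 and 4.39, which shows how much tighter $A$ and $B$ are than the bounds deduced from Proposition 2.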

Proof. If $\sigma_x = 0$, then $x_i = \bar x$ for all $i$ and the inequalities hold. Assume $\sigma_x > 0$.

Let us consider the two following problems, where $\bar x$ and $\sigma_x$ are fixed:

$$A = \inf_x \Big\{ \sum_{i=1}^n \ln(x_i) \;:\; \sum_{i=1}^n (x_i - \bar x) = 0 \ \text{and} \ \sum_{i=1}^n (x_i - \bar x)^2 = n \sigma_x^2 \Big\},$$

$$B = \sup_x \Big\{ \sum_{i=1}^n \ln(x_i) \;:\; \sum_{i=1}^n (x_i - \bar x) = 0 \ \text{and} \ \sum_{i=1}^n (x_i - \bar x)^2 = n \sigma_x^2 \Big\}.$$

The second problem always has optimal solutions; the first problem has optimal solutions if $\bar x - \sigma_x \sqrt{n-1} > 0$, because of Proposition 2, and in the other case $-\infty = A < \sum_i \ln(x_i)$.

Apply the first order necessary optimality condition: if $x$ is an optimal solution of one problem or the other, there exist $\alpha$ and $\beta$ such that for all $i$

$$(x_i - \bar x)^2 - \alpha (x_i - \bar x) + \beta = 0.$$

Thus each $(x_i - \bar x)$ is a root of the equation $w^2 - \alpha w + \beta = 0$.

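Denote by $a$ and $b$ the two roots of this equation. The quantities $(x_i - \bar x)$ divide into two groups, $p$ of them equal to $a$ and $n - p$ equal to $b$. From $\sigma_x \ne 0$ we deduce that $1 \le p \le n - 1$ and $a \ne b$. Hence,

$$0 = \sum_i (x_i - \bar x) = p a + (n - p) b \quad \text{and} \quad n \sigma_x^2 = \sum_i (x_i - \bar x)^2 = p a^2 + (n - p) b^2.$$

From this we deduce that the two values taken by the $x_i$ are either

$$\bar x + a = \bar x + \sigma_x \sqrt{\frac{n-p}{p}} \quad \text{and} \quad \bar x + b = \bar x - \sigma_x \sqrt{\frac{p}{n-p}},$$

or

$$\bar x + a = \bar x - \sigma_x \sqrt{\frac{n-p}{p}} \quad \text{and} \quad \bar x + b = \bar x + \sigma_x \sqrt{\frac{p}{n-p}}.$$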

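Denote by $h(p)$ and $k(p)$ the following quantities:

$$h(p) = \frac{p}{n} \ln\Big( \bar x + \sigma_x \sqrt{\frac{n-p}{p}} \Big) + \frac{n-p}{n} \ln\Big( \bar x - \sigma_x \sqrt{\frac{p}{n-p}} \Big),$$

$$k(p) = \frac{p}{n} \ln\Big( \bar x - \sigma_x \sqrt{\frac{n-p}{p}} \Big) + \frac{n-p}{n} \ln\Big( \bar x + \sigma_x \sqrt{\frac{p}{n-p}} \Big).$$

Then,

$$\frac{A}{n} = \min_{p = 1, \dots, n-1} [\min[h(p), k(p)]] \quad \text{and} \quad \frac{B}{n} = \max_{p = 1, \dots, n-1} [\max[h(p), k(p)]].$$

But $h(p) = k(n - p)$ for any $p = 1, \dots, n - 1$ and therefore

$$\frac{A}{n} = \min_{p = 1, \dots, n-1} [h(p)] \quad \text{and} \quad \frac{B}{n} = \max_{p = 1, \dots, n-1} [h(p)]. \tag{5}$$

It is interesting to set $t(p) = \sqrt{\dfrac{p}{n-p}}$, so that $h(p) = \gamma(t(p))$, and to consider the function

$$\gamma(t) = \frac{t^2}{t^2 + 1} \ln[\bar x + \sigma_x t^{-1}] + \frac{1}{t^2 + 1} \ln[\bar x - \sigma_x t].$$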

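Then,

$$\gamma'(t) = \frac{2t}{(1 + t^2)^2} \ln\frac{\bar x + \sigma_x t^{-1}}{\bar x - \sigma_x t} - \frac{\sigma_x}{1 + t^2} \Big[ \frac{1}{\bar x + \sigma_x t^{-1}} + \frac{1}{\bar x - \sigma_x t} \Big].$$

The convexity of the function $t \mapsto t^{-1}$ implies that

$$\ln(t + \delta) - \ln(t) < \frac{\delta}{2} \Big[ \frac{1}{t + \delta} + \frac{1}{t} \Big] \qquad \forall t, \delta > 0.$$

Applying this inequality with $\bar x - \sigma_x t$ in place of $t$ and $\delta = \sigma_x (t + t^{-1})$, we deduce that $\gamma'$ is negative wherever $\gamma$ is finite on $(0, \infty)$, and therefore $\gamma$ is decreasing there. It follows that

$$A = n \gamma(\sqrt{n-1}) \le B = n \gamma\Big( \frac{1}{\sqrt{n-1}} \Big) < n \lim_{t \downarrow 0} \gamma(t) = n \ln(\bar x).$$

The remaining inequality is a straightforward consequence of the definition of $A$.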

6. Back to the step-size procedure

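Let us go back to Equations (3) and (4). We denote by $\bar\lambda$ and $\sigma_\lambda$ the arithmetic mean and the standard deviation of the $\lambda_i$, and by $\|\lambda\|$ the Euclidean norm of the vector $\lambda$. Then, $\|\lambda\|^2 = n(\bar\lambda^2 + \sigma_\lambda^2) = \theta''(0) = -\theta'(0)$ and

$$\theta(t) = n t \bar\lambda - t \|\lambda\|^2 - \sum_{i=1}^n \ln(1 + t \lambda_i).$$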

Our problem is to find some $\bar t \in (0, \hat t)$ giving a significant decrease of the convex function $\theta$.

We have said that the most natural choice, $\bar t = t_{opt}$ where $\theta'(t_{opt}) = 0$, presents numerical complications. One could think of a line-search of Armijo-Goldstein-Price type, but such a line-search also needs several computations of the functions $\theta$ and $\theta'$. Nevertheless, if we decide for such a line-search, it is convenient to know a lower bound of $\hat t$, since the upper bound $\hat t$ of the domain of $\theta$ is itself numerically difficult to obtain. Such a bound follows from Proposition 2:

$$\hat t_1 = \sup\,[\, t : 1 + t \beta_1 > 0 \,] \quad \text{with} \quad \beta_1 = \bar\lambda - \sigma_\lambda \sqrt{n-1}.$$

Another bound $\hat t_2$ is due to the fact that $|\lambda_i| \le \|\lambda\|$ for all $i$:

$$\hat t_2 = \sup\,[\, t : 1 + t \beta_2 > 0 \,] \quad \text{with} \quad \beta_2 = -\|\lambda\|.$$

Then, $0 < \hat t_2 \le \hat t_1 \le \hat t \le +\infty$. As already said, the inequality $\hat t \ge \hat t_1$ is a consequence of Proposition 2. To prove that $\hat t_1 \ge \hat t_2$ it is enough to prove that $\|\lambda\|^2 \ge \beta_1^2$. This inequality is equivalent to

$$0 \le (n-1) \bar\lambda^2 + \sigma_\lambda^2 + 2 \sigma_\lambda \bar\lambda \sqrt{n-1} = (\bar\lambda \sqrt{n-1} + \sigma_\lambda)^2.$$
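The two lower bounds of $\hat t$ only need $\bar\lambda$, $\sigma_\lambda$ and $n$. The sketch below computes them and compares them with the exact $\hat t$ on a hypothetical set of eigenvalues (the eigenvalues are used only to verify the ordering; the function name is ours).

```python
import math

def domain_lower_bounds(lam_bar, sigma_lam, n):
    """Return (t_hat_1, t_hat_2), lower bounds of the end point t_hat of the domain."""
    norm = math.sqrt(n * (lam_bar ** 2 + sigma_lam ** 2))
    beta1 = lam_bar - sigma_lam * math.sqrt(n - 1)
    beta2 = -norm
    t1 = math.inf if beta1 >= 0.0 else -1.0 / beta1
    t2 = math.inf if beta2 >= 0.0 else -1.0 / beta2
    return t1, t2

lams = [0.5, -0.25, 0.1]                 # hypothetical eigenvalues of E
n = len(lams)
lam_bar = sum(lams) / n
sigma_lam = math.sqrt(sum(lam * lam for lam in lams) / n - lam_bar ** 2)
t1, t2 = domain_lower_bounds(lam_bar, sigma_lam, n)
t_exact = min(-1.0 / lam for lam in lams if lam < 0.0)
```

For these values $\hat t_2 \approx 1.76$, $\hat t_1 \approx 3.16$ and $\hat t = 4$, in agreement with $0 < \hat t_2 \le \hat t_1 \le \hat t$.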

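Another strategy consists in minimizing an upper approximation $\hat\theta$ of $\theta$. To be efficient, this approximation must be simple and close enough to $\theta$. Here we require

$$0 = \hat\theta(0), \qquad \|\lambda\|^2 = \hat\theta''(0) = -\hat\theta'(0).$$

Theorem 5 provides such an approximation: set $x_i = 1 + t \lambda_i$; then $\bar x = 1 + t \bar\lambda$ and $\sigma_x = t \sigma_\lambda$. Next, define

$$\theta_0(t) = \gamma_0 t - (n-1) \ln(1 + \alpha_0 t) - \ln(1 + \beta_0 t),$$

with

$$\gamma_0 = n \bar\lambda - \|\lambda\|^2, \qquad \alpha_0 = \bar\lambda + \frac{\sigma_\lambda}{\sqrt{n-1}} \quad \text{and} \quad \beta_0 = \beta_1 = \bar\lambda - \sigma_\lambda \sqrt{n-1}.$$

It is clear that $\theta_0$ is convex, its domain is $[0, \hat t_0)$ with $\hat t_0 = \hat t_1$, and

$$\theta(t) \le \theta_0(t) \ \ \forall t \ge 0, \qquad \theta_0(0) = 0 \quad \text{and} \quad \theta_0''(0) = -\theta_0'(0) = \|\lambda\|^2.$$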

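One can also think of simpler functions than $\theta_0$ involving only one logarithm. We consider functions of the following type:

$$\hat\theta(t) = \gamma t - \delta \ln(1 + \beta t), \qquad t \in [0, \tilde t),$$

where, in order to fulfill the requirements,

$$\|\lambda\|^2 = \delta \beta^2 = \delta \beta - \gamma, \qquad \tilde t = \sup\,[\, t : 1 + t \beta > 0 \,].$$

Such functions are convex.

Of course $\tilde t \le \hat t$ is required. In line with the lower bounds $\hat t_1$ and $\hat t_2$, we consider the two functions $\theta_1$ and $\theta_2$ corresponding to $\beta_1$ and $\beta_2$. In the following result, we compare $\theta_0$, $\theta_1$ and $\theta_2$. As in other parts of the paper, $\ln(r) = -\infty$ if $r \le 0$.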

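Proposition 3. $\theta_i$, $i = 0, 1, 2$, is strictly convex on $[0, \hat t_i)$, and $\theta_i(t) \to +\infty$ when $t \to \hat t_i$. Furthermore, $\theta(t) \le \theta_0(t) \le \theta_1(t) \le \theta_2(t) \le +\infty$ for all $t > 0$.

Proof. The first part is immediate. The inequality $\theta(t) \le \theta_0(t)$ is a straightforward consequence of Theorem 5. Set $\nu(t) = \theta_1(t) - \theta_0(t)$. Because $\beta_0 = \beta_1$, $\alpha_0 \ge \beta_0$ and $\delta_1 \beta_1^2 = \|\lambda\|^2 = (n-1) \alpha_0^2 + \beta_0^2$, one has for $t > 0$

$$\nu''(t) = \frac{\delta_1 \beta_1^2 - \beta_0^2}{(1 + \beta_0 t)^2} - \frac{(n-1) \alpha_0^2}{(1 + \alpha_0 t)^2} = \frac{(n-1) \alpha_0^2}{(1 + \beta_0 t)^2} - \frac{(n-1) \alpha_0^2}{(1 + \alpha_0 t)^2} \ge 0.$$

Because $\nu(0) = \nu'(0) = 0$, one deduces that $\nu(t) \ge 0$ for $t > 0$.

Next, set $\mu(t) = \theta_2(t) - \theta_1(t)$. Then, $\mu(0) = \mu'(0) = 0$ and

$$\mu''(t) = \|\lambda\|^2 \Big[ \frac{1}{(1 + \beta_2 t)^2} - \frac{1}{(1 + \beta_1 t)^2} \Big] \ge 0.$$

Here again $\mu(t) \ge 0$ for all $t > 0$.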

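We deduce that the function $\theta_i$ reaches its minimum at a unique value $\bar t_i$, which is the root of the equation $\theta_i'(t) = 0$. For $i = 1, 2$ one has

$$\bar t_i = \frac{\delta_i}{\gamma_i} - \frac{1}{\beta_i} = \frac{1}{1 - \beta_i} \quad \text{and} \quad \theta_i(\bar t_i) = \frac{\|\lambda\|^2}{\beta_i} + \frac{\|\lambda\|^2}{\beta_i^2} \ln(1 - \beta_i).$$

In particular,

$$\bar t_2 = \frac{1}{1 + \|\lambda\|} \quad \text{and} \quad \theta_2(\bar t_2) = -\|\lambda\| + \ln(1 + \|\lambda\|).$$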
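The closed forms for $\bar t_1$ and $\bar t_2$ translate into a few lines of code. The sketch below uses the simplification $\bar t_i = 1/(1 - \beta_i)$, which follows from $\delta_i = \|\lambda\|^2 / \beta_i^2$ and $\gamma_i = \delta_i \beta_i - \|\lambda\|^2$; the handling of $\beta = 0$ by its limit and the test eigenvalues are our own additions.

```python
import math

def one_log_step(beta, norm_sq):
    """Minimizer and minimum of gamma*t - delta*ln(1 + beta*t) with
    delta*beta^2 = delta*beta - gamma = norm_sq. Requires beta < 1."""
    if beta >= 1.0:
        raise ValueError("beta must be smaller than 1")
    t_bar = 1.0 / (1.0 - beta)
    if beta == 0.0:
        value = -0.5 * norm_sq            # limit of the formula when beta -> 0
    else:
        value = norm_sq / beta + norm_sq / beta ** 2 * math.log1p(-beta)
    return t_bar, value

def theta(t, lams):
    """Exact theta from formula (3); needs the eigenvalues, used here for checking."""
    return sum(t * (lam - lam * lam) - math.log(1.0 + t * lam) for lam in lams)

lams = [0.5, -0.25, 0.1]                  # hypothetical eigenvalues of E
n = len(lams)
lam_bar = sum(lams) / n
norm_sq = sum(lam * lam for lam in lams)
sigma_lam = math.sqrt(norm_sq / n - lam_bar ** 2)

t1_bar, value1 = one_log_step(lam_bar - sigma_lam * math.sqrt(n - 1), norm_sq)  # S1
t2_bar, value2 = one_log_step(-math.sqrt(norm_sq), norm_sq)                     # S2
```

On this data $\bar t_2 \approx 0.64$ and $\bar t_1 \approx 0.76$, and the guaranteed decreases $\theta_2(\bar t_2) \approx -0.118$ and $\theta_1(\bar t_1) \approx -0.134$ are both upper bounds of the actual values of $\theta$ at these points.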

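The solution of $\theta_0'(t) = 0$ leads to the equation $t^2 - 2 b t + c = 0$ with

$$b = \frac{1}{2} \Big( \frac{n}{\gamma_0} - \frac{1}{\alpha_0} - \frac{1}{\beta_0} \Big) \quad \text{and} \quad c = -\frac{\|\lambda\|^2}{\alpha_0 \beta_0 \gamma_0},$$

whose two roots are $t = b \pm \sqrt{b^2 - c}$. For $\bar t_0$ we take the root which belongs to the interval $(0, \hat t_0)$ (there is only one).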
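A possible implementation of strategy $S_0$ is sketched below. To avoid dividing by $\alpha_0 \beta_0 \gamma_0$, which may vanish, it solves the same quadratic before normalization, $\gamma_0 \alpha_0 \beta_0\, t^2 + [\gamma_0 (\alpha_0 + \beta_0) - n \alpha_0 \beta_0]\, t - \|\lambda\|^2 = 0$; this rearrangement, the function names and the data are our own choices.

```python
import math

def s0_coefficients(lam_bar, sigma_lam, n):
    """Return (gamma0, alpha0, beta0, ||lambda||^2) as defined for theta_0."""
    root = math.sqrt(n - 1)
    norm_sq = n * (lam_bar ** 2 + sigma_lam ** 2)
    gamma0 = n * lam_bar - norm_sq
    alpha0 = lam_bar + sigma_lam / root
    beta0 = lam_bar - sigma_lam * root
    return gamma0, alpha0, beta0, norm_sq

def theta0_prime(t, gamma0, alpha0, beta0, n):
    """Derivative of theta_0, used to verify the computed step."""
    return gamma0 - (n - 1) * alpha0 / (1.0 + alpha0 * t) - beta0 / (1.0 + beta0 * t)

def step_s0(lam_bar, sigma_lam, n):
    """Root of theta_0' inside (0, t_hat_0)."""
    gamma0, alpha0, beta0, norm_sq = s0_coefficients(lam_bar, sigma_lam, n)
    t_max = math.inf if beta0 >= 0.0 else -1.0 / beta0
    a2 = gamma0 * alpha0 * beta0
    a1 = gamma0 * (alpha0 + beta0) - n * alpha0 * beta0
    a0 = -norm_sq
    if a2 == 0.0:                                   # degenerate case: linear equation
        roots = [-a0 / a1] if a1 != 0.0 else []
    else:
        disc = math.sqrt(max(a1 * a1 - 4.0 * a2 * a0, 0.0))
        roots = [(-a1 + disc) / (2.0 * a2), (-a1 - disc) / (2.0 * a2)]
    inside = [t for t in roots if 0.0 < t < t_max]
    if not inside:
        raise ValueError("no stationary point of theta_0 inside its domain")
    return min(inside)

lams = [0.5, -0.25, 0.1]                            # hypothetical eigenvalues of E
n = len(lams)
lam_bar = sum(lams) / n
sigma_lam = math.sqrt(sum(lam * lam for lam in lams) / n - lam_bar ** 2)
t0_bar = step_s0(lam_bar, sigma_lam, n)
coeffs = s0_coefficients(lam_bar, sigma_lam, n)
```

On this data the two roots are about 1.03 and 108, and only the first lies in $(0, \hat t_0)$ with $\hat t_0 \approx 3.16$.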

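Thus, the three values $\bar t_0$, $\bar t_1$ and $\bar t_2$ are explicitly computed. It is clear that

$$\theta(\bar t_2) \le \theta_2(\bar t_2), \qquad \theta(\bar t_1) \le \theta_1(\bar t_1) \le \theta_1(\bar t_2) \le \theta_2(\bar t_2)$$

and

$$\theta(\bar t_0) \le \theta_0(\bar t_0) \le \theta_0(\bar t_1) \le \theta_1(\bar t_1) \le \theta_2(\bar t_2).$$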

7. Description of the algorithm

Initialization: Decide on a step-size strategy and choose the parameters $\varepsilon > 0$, $r > 0$, $\rho > 0$, $\sigma \in (0, 1)$. Start with some $y \in \widehat{Y}$.

Main step:

(a) Compute $B = B(y)$ and $L$ such that $L L^t = B$.

(b) Compute $g = b - r\, b(y)$ and $H = r \Delta(y)$.

(c) Solve the equation $H d = -g$. Compute $E$, $\operatorname{trace}(E)$ and $\operatorname{trace}(E^2)$.

(d) Compute $\bar\lambda$ and $\sigma_\lambda$.

(e) Obtain $\bar t$ using the step-size strategy. Take $\bar y = y + \bar t d$.

(f) If $|b^t \bar y - b^t y| > \rho n r$, do $y = \bar y$ and go to (a).

(g) If $n r > \varepsilon$, do $y = \bar y$, $r = \sigma r$ and go to (a).

(h) Stop: $\bar y$ is an approximate solution of the problem (D).

As said previously, the optimal solution of problem $(D_r)$ is only an approximate solution of problem (D): the closer $r$ is to 0, the better the approximation. Unfortunately, the closer $r$ is to 0, the worse the conditioning of $(D_r)$. This is the reason why we use a large $r$ in the first iterations of the algorithm instead of dealing directly with a value of $r$ such that $n r < \varepsilon$. The reason for the updating rule for $r$ is the following: if $y(r)$ is the exact solution of problem $(D_r)$, then $b^t y(r) \in [m_d, m_d + n r]$, so it is a waste of time to continue iterations on $(D_r)$ when $|b^t \bar y - b^t y| \le \rho n r$, with $\rho$ near 1. For $\rho$ one can consider for instance the values 0.5, 1, 2, 3. For $\sigma$, we can take for instance the values 0.1, 0.25, 0.5. We now describe four different strategies for the step-size:

• Strategy Ls: A classical line-search of Armijo-Goldstein-Price type.

• Strategy $S_i$, $i = 0, 1, 2$: $\bar t = \bar t_i$ with $\bar t_i$ defined as in the last section.
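The main step can be traced on a toy instance. The sketch below runs steps (a) to (h) with strategy $S_2$ under strong simplifying assumptions that are ours, not the paper's: $m = 1$, $b = 1$, $A_1 = I$ and $C = \operatorname{diag}(c)$, so that (D) reads "minimize $y$ subject to $y \ge c_k$", $B(y)$ is diagonal, and the eigenvalues of $E$ are simply $d / (y - c_k)$. The default parameter values are illustrative.

```python
import math

def solve_toy(c, y, r=0.3, eps=1e-3, rho=1.0, sigma=0.125, max_iter=10000):
    """Barrier algorithm with strategy S2 on: min y  s.t.  y*I - diag(c) PSD."""
    n = len(c)
    for _ in range(max_iter):
        gaps = [y - ck for ck in c]                 # (a) diagonal of B(y), all > 0
        b_y = sum(1.0 / g for g in gaps)            # (b) b(y) = trace(B^{-1})
        delta = sum(1.0 / g ** 2 for g in gaps)     #     Delta(y) = trace(B^{-2})
        d = (b_y - 1.0 / r) / delta                 # (c) Delta d = b(y) - b / r
        lam_norm = math.sqrt(sum((d / g) ** 2 for g in gaps))  # ||lambda||
        t_bar = 1.0 / (1.0 + lam_norm)              # (e) strategy S2
        y_new = y + t_bar * d
        change = abs(y_new - y)                     # |b^t y_new - b^t y| since b = 1
        y = y_new
        if change > rho * n * r:                    # (f) keep iterating on (D_r)
            continue
        if n * r > eps:                             # (g) shrink the barrier parameter
            r *= sigma
            continue
        return y, r                                 # (h) approximate solution of (D)
    raise RuntimeError("no convergence within max_iter iterations")

c = [0.0, 0.5, 1.0]           # hypothetical data: m_d = max(c) = 1
y_final, r_final = solve_toy(c, 2.5)
```

Because $\bar t_2 < 1 / \|\lambda\|$, every iterate stays strictly feasible without any safeguard, which is the practical appeal of the explicit step sizes; step (d) is not needed here since $S_2$ only uses $\|\lambda\|$.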


8. Numerical experiments

The computations have been performed on a D 810 station with Delphi 5.

8.1. Example cube

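$n = 2m$, $C$ is the $n \times n$ identity matrix, $b = (2, \dots, 2)^t \in \mathbb{R}^m$ and the entries of the $n \times n$ matrix $A_k$, $k = 1, \dots, m$, are given by:

$$A_k[i, j] = \begin{cases}
1 & \text{if } i = j = k \text{ or } i = j = k + m, \\
a^2 & \text{if } i = j = k + 1 \text{ or } i = j = k + m + 1, \\
-a & \text{if } i = k,\ j = k + 1 \text{ or } i = k + m,\ j = k + m + 1, \\
-a & \text{if } i = k + 1,\ j = k \text{ or } i = k + m + 1,\ j = k + m, \\
0 & \text{otherwise,}
\end{cases}$$

where $a \in \mathbb{R}$ is given.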
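The test matrices can be generated as sketched below. The definition does not say what happens when an index such as $k + m + 1$ exceeds $n$ (this occurs for $k = m$); the sketch assumes such entries are simply skipped, which is our reading and not something stated in the paper.

```python
def build_a(k, m, a):
    """Matrix A_k of the 'cube' example for 1-based k in 1..m.
    Assumption: entries whose row or column index exceeds n = 2m are skipped."""
    n = 2 * m
    mat = [[0.0] * n for _ in range(n)]

    def put(i, j, value):
        # i and j are the 1-based indices used in the text.
        if 1 <= i <= n and 1 <= j <= n:
            mat[i - 1][j - 1] = value

    for offset in (0, m):                 # the same pattern appears in two blocks
        put(k + 1 + offset, k + 1 + offset, a * a)
        put(k + offset, k + 1 + offset, -a)
        put(k + 1 + offset, k + offset, -a)
    for offset in (0, m):                 # written last: the first case has priority
        put(k + offset, k + offset, 1.0)
    return mat

def weighted_sum(y, m, a):
    """sum_k y_k A_k as a dense list of lists."""
    n = 2 * m
    total = [[0.0] * n for _ in range(n)]
    for k in range(1, m + 1):
        a_k = build_a(k, m, a)
        for i in range(n):
            for j in range(n):
                total[i][j] += y[k - 1] * a_k[i][j]
    return total

m = 3
identity_check = weighted_sum([1.0] * m, m, 0.0)    # a = 0, y = (1, ..., 1)
a_1 = build_a(1, m, 2.0)
```

With $a = 0$ and $y = (1, \dots, 1)^t$ the sum $\sum_k y_k A_k$ is the identity, so $\sum_k y_k A_k - C = 0$ lies on the boundary of $K$, which is consistent with this $y$ being optimal in Test 1.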

Test 1. $(m, n) = (50, 100)$ and $a = 0$. Then, it is known that the vector $y = (1, \dots, 1)^t \in \mathbb{R}^m$ is the optimal solution and $y_0 = (1.5, \dots, 1.5)^t \in \mathbb{R}^m$ is feasible. We take for parameters in the algorithm $\rho = 1$, $\sigma = 0.125$, $r_0 = 0.3$, $\varepsilon = 0.1$, and $y_0$ as the initial point. The following table describes the results.

Strategy   Computational time   Number of iterations
S₀         34 s                 3
S₁         34 s                 4
S₂         660 s                25
Ls         dvg                  dvg

dvg means that the algorithm does not terminate within a finite time.

Test 2. In this test, the data are the same as in the ﬁrst test, except ρ= 2 in place of ρ= 1.

Strategy   Computational time   Number of iterations
S₀         33 s                 3
S₁         34 s                 4
S₂         480 s                18
Ls         dvg                  dvg

The results of these two tests show that the strategies S₂ and Ls do not compete with S₀ and S₁. In the next experiments, we continue only with S₀ and S₁.

Test 3. Same data as in Test 1, except $C = -2I$ in place of $C = I$ and $a = 2$ or $5$. We start the algorithm with the feasible point $y_0 = (0, \dots, 0)^t$ and we take $\rho = 1$.

Strategy               S₀               S₁
a                      2       5        2       5
Computational time     165 s   80 s     180 s   120 s
Number of iterations   10      5        12      13

Test 4. Same data as in Test 3, except $\rho = 2$ in place of $\rho = 1$.

Strategy               S₀               S₁
a                      2       5        2       5
Computational time     130 s   50 s     135 s   56 s
Number of iterations   7       3        10      5

References

[1] F. Alizadeh, Interior point methods in semi-definite programming with application to combinatorial optimization. SIAM J. Optim. 5 (1995) 13–55.

[2] F. Alizadeh, J.-P. Haberly, and M.-L. Overton, Primal-dual interior-point methods for semi-definite programming, convergence rates, stability and numerical results. SIAM J. Optim. 8 (1998) 746–768.

[3] D. Benterki, J.-P. Crouzeix, and B. Merikhi, A numerical implementation of an interior point method for semi-definite programming. Pesquisa Operacional 23–1 (2003) 49–59.

[4] J.-F. Bonnans, J.-C. Gilbert, C. Lemaréchal, and C. Sagastizábal, Numerical optimization, theoretical and practical aspects. Mathematics and Applications 27, Springer-Verlag, Berlin (2003).

[5] J.-P. Crouzeix and A. Seeger, New bounds for the extreme values of a finite sample of real numbers. J. Math. Anal. Appl. 197 (1996) 411–426.

[6] M. Kojima, S. Shindoh, and S. Hara, Interior point methods for the monotone semi-definite linear complementarity problem in symmetric matrices. SIAM J. Optim. 7 (1997) 86–125.

[7] M. Overton and H. Wolkowicz, Semi-definite programming. Math. Program. Serie B 77 (1997) 105–109.

[8] R.T. Rockafellar, Convex analysis. Princeton University Press, New Jersey (1970).

[9] L. Vandenberghe and S. Boyd, Positive definite programming. SIAM Review 38 (1996) 49–95.

[10] H. Wolkowicz and G.-P.-H. Styan, Bounds for eigenvalues using traces. Linear Algebra Appl. 29 (1980) 471–506.