
On the elimination of inessential points in the smallest enclosing ball problem


HAL Id: hal-01863587

https://hal.archives-ouvertes.fr/hal-01863587

Submitted on 28 Aug 2018


To cite this version:

Luc Pronzato. On the elimination of inessential points in the smallest enclosing ball problem. Optimization Methods and Software, Taylor & Francis, 2019, 34 (2), pp.225-247. ⟨10.1080/10556788.2017.1359266⟩. ⟨hal-01863587⟩


On the elimination of inessential points in the smallest enclosing ball problem

L. Pronzato$^{a*}$

$^{a}$Université Côte d'Azur, CNRS, Laboratoire I3S, Sophia Antipolis


We consider the construction of the smallest ball $\mathcal{B}^*$ enclosing a set $\mathcal{X}_n$ formed by $n$ points in $\mathbb{R}^d$. We show that any probability measure on $\mathcal{X}_n$, with mean $c$ and variance matrix $V$, provides a lower bound $b$ on the distance to $c$ of any point on the boundary of $\mathcal{B}^*$, with $b$ having a simple expression in terms of $c$ and $V$. This inequality makes it possible to remove inessential points from $\mathcal{X}_n$, which do not participate in the definition of $\mathcal{B}^*$, and can be used to accelerate algorithms for the construction of $\mathcal{B}^*$. We show that this inequality is, in some sense, the best possible. A series of numerical examples indicates that, when $d$ is reasonably small ($d \le 10$, say) and $n$ is large (up to $10^5$), the elimination of inessential points by a suitable two-point measure, followed by a direct (exact) solution by quadratic programming, outperforms iterative methods that compute an approximate solution by solving the dual problem.

Keywords: minimum enclosing ball; smallest ball; Chebyshev centre; core sets; optimal design of experiments

AMS Subject Classification: 90C25; 90C46; 62K05

1. Introduction

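Given a set of $n$ points $\mathcal{X}_n = \{X_1, \ldots, X_n\} \subset \mathbb{R}^d$, $d \ge 2$, we consider the algorithmic construction of the minimum ball $\mathcal{B}^*(\mathcal{X}_n)$ enclosing $\mathcal{X}_n$. We are interested in particular in the situation where $d$ is reasonably small but $n$ can be large. For $c \in \mathbb{R}^d$ and $r \in \mathbb{R}^+$, we denote by $\mathcal{B}_d(c, r)$ the (closed) ball $\{X \in \mathbb{R}^d : \|X - c\| \le r\}$, with $\|\cdot\|$ the Euclidean norm. We shall write $\mathcal{B}^*(\mathcal{X}_n) = \mathcal{B}_d(c_n^*, r_n^*)$, where $c_n^*$ (the Chebyshev centre of $\mathcal{X}_n$) minimises

$$f(c) = \max_{i=1,\ldots,n} \|X_i - c\|^2 \tag{1}$$

with respect to $c \in \mathbb{R}^d$ and $r_n^* = \max_{i=1,\ldots,n} \|X_i - c_n^*\|$. A ball $\mathcal{B}_d(c, r)$ is said to be a $(1+\epsilon)$-approximation to $\mathcal{B}^*(\mathcal{X}_n)$, $\epsilon > 0$, when $\mathcal{X}_n \subset \mathcal{B}_d(c, r)$ and $r \le (1+\epsilon)\, r_n^*$; a subset $\mathcal{X}_q \subseteq \mathcal{X}_n$ is said to be an $\epsilon$-core set of $\mathcal{X}_n$ if $\mathcal{B}^*(\mathcal{X}_q) = \mathcal{B}_d(c_q^*, r_q^*)$ is such that $r_q^* \le r_n^* \le (1+\epsilon)\, r_q^*$.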

The construction of $\mathcal{B}^*(\mathcal{X}_n)$ is a classical optimisation problem, for which many algorithms have been proposed in the literature, see, e.g., the historical sketch in [7] and the references in [28]. A recent application concerns the construction of space-filling designs for computer experiments based on an extension of Lloyd's clustering algorithm [16]. Some methods are exact and rely on extensions of linear programming algorithms, see [5, 9, 26]; some use the dual formulation of the problem and construct a sequence of $(1+\epsilon_k)$-approximations of $\mathcal{B}^*(\mathcal{X}_n)$ with $\epsilon_k$ tending to zero, see [4, 28]. The former are exponential in $d$ and are thus restricted to problems with moderate dimension ($d \lesssim 20$, say); the latter can also solve problems with large $d$ and compute a $(1+\epsilon)$-approximation to $\mathcal{B}^*(\mathcal{X}_n)$ in $O(nd/\epsilon)$ arithmetic operations, returning an $\epsilon$-core set of size $O(1/\epsilon)$, see [6, 28].

Both types of methods can strongly benefit from a reduction of the size of $\mathcal{X}_n$, in the same way as algorithms for the construction of the minimum-volume ellipsoid containing $\mathcal{X}_n$ can be accelerated when inessential points are eliminated by the inequality of [11], see [24, Sect. 3.6]. The objective of removing inessential points presents some similarities with that of obtaining small $\epsilon$-core sets, with one major difference, though: a point $X_i$ is called inessential when $\mathcal{B}^*(\mathcal{X}_n \setminus \{X_i\})$ exactly coincides with $\mathcal{B}^*(\mathcal{X}_n)$, which happens in particular when $X_i$ lies in the interior of $\mathcal{B}^*(\mathcal{X}_n)$. By removing inessential points, we thus aim at constructing small 0-core sets. Although we know there always exists a 0-core set of size at most $d + 1$, its construction requires the knowledge of $\mathcal{B}^*(\mathcal{X}_n)$. The objective of the paper is to derive a simple inequality that any point $X_j$ on the boundary of $\mathcal{B}^*(\mathcal{X}_n)$ must satisfy, without knowing $\mathcal{B}^*(\mathcal{X}_n)$. More precisely, we show that for any probability measure $\xi$ on $\mathcal{X}_n$, with $c(\xi)$ and $V(\xi)$ the corresponding mean and covariance matrix respectively, any point $X_j$ on the boundary of $\mathcal{B}^*(\mathcal{X}_n)$ satisfies

$$\|X_j - c(\xi)\|^2 \ge \mathrm{trace}[V(\xi)] + \gamma(\xi) - \sqrt{\gamma(\xi)\,\{2\,\mathrm{trace}[V(\xi)] + \gamma(\xi)\}} , \tag{2}$$

where $\gamma(\xi) = \max_{i=1,\ldots,n} \|X_i - c(\xi)\|^2 - \mathrm{trace}[V(\xi)]$. We also prove that this bound on $\|X_j - c(\xi)\|^2$ is, in some sense, the best possible, and a comparison with the bound previously proposed in [2] is provided. Since algorithms based on the dual formulation of the smallest enclosing ball problem generate a sequence of measures $\xi^k$, they provide for free a sequence of inequalities that can be used as sieves to eliminate inessential points from $\mathcal{X}_n$, and thereby generate a sequence of 0-core sets of decreasing size. When embedded in the algorithm, these sieves make the iterations progressively cheaper and thus accelerate the algorithm, see [2]. Moreover, the 0-core set obtained after a few iterations may be small enough to allow the efficient use of an exact quadratic programming (QP) algorithm for the construction of $\mathcal{B}^*(\mathcal{X}_n)$.

The paper is organised as follows. Section 2 introduces the notation and presents the QP and dual formulations of the minimum enclosing ball problem. The inequality (2) is proved in Section 3, where we also explain why it cannot be improved. Two iterative algorithms are presented in Section 4: a multiplicative algorithm inspired by experimental design theory and the vertex-direction algorithm of [28]. Some computational results are presented in Section 5 that illustrate the benefit of the elimination of inessential points, when using an iterative algorithm to solve the dual problem or before using QP for the direct approach. In particular, they indicate that for moderate $d$ the application of an exact QP algorithm to the resulting 0-core set yields the exact minimum ball $\mathcal{B}^*(\mathcal{X}_n)$


2. Quadratic programming and dual formulations

For any $X_i \in \mathcal{X}_n$ and $c$ and $c_0$ in $\mathbb{R}^d$, we can write $\|X_i - c\|^2 = \|X_i - c_0\|^2 - 2 (X_i - c_0)^\top (c - c_0) + \|c - c_0\|^2$. Therefore, $f(c)$ defined in (1) can be written as

$$f(c) = \max_{i=1,\ldots,n} \left\{ \|X_i - c_0\|^2 - 2 (X_i - c_0)^\top (c - c_0) \right\} + \|c - c_0\|^2 , \tag{3}$$

and its minimisation is equivalent to the minimisation of $\|c - c_0\|^2 + t$ with respect to $(c, t) \in \mathbb{R}^{d+1}$, subject to the $n$ linear constraints

$$\|X_i - c_0\|^2 - 2 (X_i - c_0)^\top (c - c_0) \le t , \quad i = 1, \ldots, n . \tag{4}$$

When $d$ is small enough, simplex-type or projection methods can thus be used to obtain the exact solution in finite time (assuming calculations with infinite precision); see in particular the introduction of [10] and the references therein. In case the QP solver requires a strictly convex problem, one may add a regularisation term, quadratic in $t$, to the objective function and minimise $\|c - c_0\|^2 + t + \delta t^2$ with $\delta$ arbitrarily small (the solution obtained is then no longer exact, however).
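The decomposition behind (3) can be checked numerically. The sketch below is only illustrative: the function names, the sample points and the choice of $c_0$ are assumptions made for the example and do not come from the paper.

```python
def sq_norm(u):
    """Squared Euclidean norm of a vector given as a sequence of floats."""
    return sum(t * t for t in u)

def f_direct(points, c):
    """f(c) in (1): largest squared distance from c to the points."""
    return max(sq_norm([x - y for x, y in zip(p, c)]) for p in points)

def f_linearised(points, c, c0):
    """Right-hand side of (3): max of terms linear in c, plus ||c - c0||^2."""
    diff = [a - b for a, b in zip(c, c0)]
    lin = max(
        sq_norm([x - y for x, y in zip(p, c0)])
        - 2.0 * sum((x - y) * t for x, y, t in zip(p, c0, diff))
        for p in points
    )
    return lin + sq_norm(diff)

# Hypothetical data: a right triangle plus one interior point.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (1.0, 1.0)]
c, c0 = (2.0, 1.5), (1.25, 1.0)
```

Both functions return the same value for any $c_0$, which is what allows the constraints (4) to be linear in $(c, t)$.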

On the other hand, the dual formulation of the problem yields iterative methods that construct a sequence of $(1+\epsilon_k)$-approximations of $\mathcal{B}^*(\mathcal{X}_n)$ with $\epsilon_k$ tending to zero, which are of particular interest when $d$ is large. Direct calculation, using Lagrangian duality, shows that the construction of $\mathcal{B}^*(\mathcal{X}_n)$ is equivalent to the determination of Lagrange coefficients that define weights $w = (w_1, \ldots, w_n)$ in the probability simplex

$$\mathcal{P}_n = \Big\{ w \in \mathbb{R}^n : \sum_{i=1}^n w_i = 1 \text{ and } w_i \ge 0, \ i = 1, \ldots, n \Big\} \tag{5}$$

and maximise

$$\phi(w) = \mathrm{trace}[V(w)] = \sum_{i=1}^n w_i \|X_i - c(w)\|^2 , \tag{6}$$

where $c(w) = \sum_{i=1}^n w_i X_i$ and $V(w) = \sum_{i=1}^n w_i [X_i - c(w)][X_i - c(w)]^\top$; see, e.g., [7, 28]. The centre $c_n^*$ of $\mathcal{B}^*(\mathcal{X}_n)$ corresponds to $c(w^*)$ for the optimal weights $w^*$ maximising $\phi(w)$, and its radius $r_n^*$ equals $\sqrt{\phi(w^*)}$. The weights $w$ define a probability measure $\xi$ on the $X_i$, and $c(w)$ and $V(w)$ respectively correspond to the mean and variance matrix for $\xi$ (which, with a slight abuse of notation, we shall also denote by $c(\xi)$ and $V(\xi)$).
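As a concrete reading of (6), the following minimal sketch evaluates $c(w)$ and $\phi(w)$ for given weights. The helper name `mean_and_phi` and the two-point data are assumptions chosen for illustration.

```python
def mean_and_phi(points, weights):
    """Return c(w) = sum_i w_i X_i and phi(w) = sum_i w_i ||X_i - c(w)||^2."""
    d = len(points[0])
    c = [sum(w * x[j] for w, x in zip(weights, points)) for j in range(d)]
    phi = sum(
        w * sum((x[j] - c[j]) ** 2 for j in range(d))
        for w, x in zip(weights, points)
    )
    return c, phi

# Two points at distance 2 with equal weights: c(w) is the midpoint and
# phi(w) = 1 is the squared radius of the smallest ball enclosing both.
c, phi = mean_and_phi([(-1.0, 0.0), (1.0, 0.0)], [0.5, 0.5])
```

In this two-point case the weights are optimal, so $\sqrt{\phi(w)}$ coincides with the radius of the enclosing ball; for non-optimal weights $\phi(w)$ only gives a lower bound on the squared radius.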

There exist other geometrical problems for which the dual is known to correspond to an optimal design problem, i.e., to the construction of an optimal probability measure on $\mathcal{X}_n$. In particular, the determination of the minimum-volume ellipsoid, with fixed centre $c$, containing $\mathcal{X}_n$, is equivalent to the D-optimal design problem corresponding to the maximisation of $\det[M(w)]$ with respect to $w = (w_1, \ldots, w_n) \in \mathcal{P}_n$, with $M(w)$ the information matrix

$$M(w) = \sum_{i=1}^n w_i (X_i - c)(X_i - c)^\top$$

for the estimation of the $d$ unknown parameters $\theta$ in the linear regression model $y_i = (X_i - c)^\top \theta + \varepsilon_i$, where the $\varepsilon_i$ are i.i.d. observation errors; see [19]. The optimal ellipsoid is given by $\{x \in \mathbb{R}^d : (x - c)^\top M^{-1}(w^*)(x - c) \le d\}$, with $w^*$ an optimal vector of weights. When the centre of the ellipsoid is free, the determination of the minimum-volume enclosing ellipsoid forms a D-optimal design problem in $\mathbb{R}^{d+1}$ [21]: the optimal ellipsoid is given by the intersection between the minimum enclosing ellipsoid, centred at the origin, for the $n$ points $(X_i, 1) \in \mathbb{R}^{d+1}$, and the hyperplane $\{z \in \mathbb{R}^{d+1} : z_{d+1} = 1\}$; see also [17, Sects. 5.6 & 9.1] and the references therein. The connection between the construction of the thinnest covering cylinder and a $D_s$-optimal design problem is established in [20] for cylinders with fixed centre and in [21] when the centre is free.

On the other hand, the maximisation of $\mathrm{trace}[V(w)]$ in (6) is not equivalent to an A-optimal design problem, for which one minimises $\mathrm{trace}[M^{-1}(w)]$ for some information matrix $M(w)$. As shown in the next section, the connection with an optimal design problem can nevertheless be used to derive the inequality (2), using an approach resembling that in [11].

3. An inequality to eliminate inessential points

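Consider the more general situation where $\mathcal{X}$ denotes a compact subset of $\mathbb{R}^d$, with $\Xi$ the set of probability measures on $\mathcal{X}$. For any $\xi \in \Xi$, denote

$$c(\xi) = \mathsf{E}_\xi(x) = \int_{\mathcal{X}} x \, \xi(dx) \quad \text{and} \quad \phi(\xi) = \mathrm{trace}[\mathrm{Var}(\xi)] = \int_{\mathcal{X}} \|x - c(\xi)\|^2 \, \xi(dx) , \tag{7}$$

so that $c(\xi) = c(w)$ and $\phi(\xi) = \phi(w)$ in the finite case where $\mathcal{X} = \mathcal{X}_n$ with $w_i = \xi(X_i)$, $i = 1, \ldots, n$. The dual problem to the determination of $\mathcal{B}^*(\mathcal{X})$ corresponds to the maximisation of $\phi(\xi)$ with respect to $\xi \in \Xi$: the centre $c^*$ and radius $r^*$ of $\mathcal{B}^*(\mathcal{X})$ satisfy $c^* = c(\xi^*)$ and $r^* = \sqrt{\phi(\xi^*)}$, where $\xi^*$ maximises $\phi(\xi)$ with respect to $\xi \in \Xi$.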

3.1 A necessary and sufficient condition for optimality

First note that $\Xi$ is convex: for any $\xi, \nu \in \Xi$ and $\alpha \in [0, 1]$, $(1 - \alpha)\xi + \alpha\nu \in \Xi$. Denote $g(\alpha) = \phi[(1 - \alpha)\xi + \alpha\nu]$, which is a quadratic function of $\alpha$. The directional derivative of $\phi(\cdot)$ at $\xi$ in the direction $\nu \in \Xi$ is given by

$$F_\phi(\xi; \nu) = \left. \frac{dg(\alpha)}{d\alpha} \right|_{\alpha = 0} = \int \|x - c(\xi)\|^2 \, \nu(dx) - \phi(\xi) . \tag{8}$$

Note that $d^2 g(\alpha)/d\alpha^2 = -2 \|c(\nu) - c(\xi)\|^2 \le 0$, showing that $\phi(\cdot)$ is concave. It is not strictly concave$^1$, but any pair $\xi_a^*$ and $\xi_b^*$ of optimal measures necessarily satisfies $c(\xi_a^*) = c(\xi_b^*)$, implying that the optimal ball is unique. Concavity implies that $\xi^* \in \Xi$ is optimal if and only if $F_\phi(\xi^*; \nu) \le 0$ for all $\nu \in \Xi$. This is equivalent to $F_\phi(\xi^*; \delta_x) \le 0$ for all $x \in \mathcal{X}$, with $\delta_x$ the delta measure at $x$. Moreover, $F_\phi(\xi^*; \xi^*) = 0$ implies that $F_\phi(\xi^*; \delta_x) = 0$ for any $x$ in the support of $\xi^*$; that is, $\xi^*\{x \in \mathbb{R}^d : F_\phi(\xi^*; \delta_x) = 0\} = 1$. We thus obtain the following property, usually called the Equivalence Theorem in experimental design theory (see, e.g., [8, 12, 14, 18]). When $\mathcal{X}$ is finite, the conditions are equivalent to the Karush-Kuhn-Tucker optimality conditions in [28]; see also [7].

$^1$ There may exist $\xi \ne \nu$ such that $\phi(\xi) = \phi(\nu)$ and $c(\xi) = c(\nu)$, and then $g(\alpha)$ is constant for all $\alpha \in [0, 1]$ (think for example of $\mathcal{X}_n$ given by the vertices of several regular simplices in $\mathbb{R}^d$ all having the same centre).

Theorem 3.1 The centre of $\mathcal{B}^*(\mathcal{X})$ is given by $c(\xi^*)$, where $\xi^* \in \Xi$ satisfies any of the three following equivalent conditions:

(i) $\xi^*$ maximises $\phi(\xi)$ with respect to $\xi \in \Xi$,

(ii) $\xi^*$ minimises $\max_{x \in \mathcal{X}} \|x - c(\xi)\|^2$ with respect to $\xi \in \Xi$,

(iii)

$$\|x - c(\xi^*)\|^2 \le \phi(\xi^*) \ \text{ for all } x \in \mathcal{X} . \tag{9}$$

Moreover, $\|x - c(\xi^*)\|^2 = \phi(\xi^*)$ for any $x$ in the support of $\xi^*$.

3.2 $(1+\epsilon)$-approximations and $\epsilon$-core sets

For any $\xi \in \Xi$, define

$$\gamma(\xi) = \max_{x \in \mathcal{X}} \|x - c(\xi)\|^2 - \phi(\xi) . \tag{10}$$

Since $\gamma(\xi) = \max_{x \in \mathcal{X}} F_\phi(\xi; \delta_x)$, Theorem 3.1 indicates that $\gamma(\xi) \ge 0$ for all $\xi \in \Xi$, with $\gamma(\xi^*) = 0$. In some sense, $\gamma(\xi)$ quantifies the (absolute) suboptimality of the measure $\xi$. In this section we show how it is related to the (relative) notions of $(1+\epsilon)$-approximation and $\epsilon$-core set introduced in Section 1.

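Consider the ball $\mathcal{B}(\xi) = \mathcal{B}_d\big(c(\xi), \sqrt{\gamma(\xi) + \phi(\xi)}\big)$. It contains $\mathcal{X}$ by construction, and Theorem 3.1 indicates that the radius of $\mathcal{B}^*(\mathcal{X})$ equals $\sqrt{\phi(\xi^*)} \ge \sqrt{\phi(\xi)}$. Therefore, $\mathcal{B}(\xi)$ forms a $(1+\epsilon)$-approximation of $\mathcal{B}^*(\mathcal{X})$ for

$$\epsilon = \epsilon(\xi) = [1 + \gamma(\xi)/\phi(\xi)]^{1/2} - 1 . \tag{11}$$

Let $\mathcal{S}(\xi)$ denote any compact subset of $\mathcal{X}$ such that $\xi[\mathcal{S}(\xi)] = 1$ (the support of $\xi$, say). From Theorem 3.1, the radius $r^*(\xi)$ of the smallest ball enclosing $\mathcal{S}(\xi)$ is not smaller than $\sqrt{\phi(\xi)}$, so that

$$\sqrt{\phi(\xi)} \le r^*(\xi) \le \sqrt{\phi(\xi^*)} , \tag{12}$$

where the second inequality follows from $\mathcal{S}(\xi) \subset \mathcal{X}$. On the other hand,

$$\phi(\xi^*) = \min_{c \in \mathbb{R}^d} \max_{x \in \mathcal{X}} \|x - c\|^2 \le \max_{x \in \mathcal{X}} \|x - c(\xi)\|^2 = \gamma(\xi) + \phi(\xi) \tag{13}$$

(which is also a direct consequence of the concavity of $\phi(\cdot)$, which implies that, for any $\xi \in \Xi$, $\phi(\xi^*) \le \phi(\xi) + F_\phi(\xi; \xi^*) \le \phi(\xi) + \max_{x \in \mathcal{X}} F_\phi(\xi; \delta_x) = \gamma(\xi) + \phi(\xi)$). Therefore, the combination of (12) and (13) gives

$$r^*(\xi) \le \sqrt{\phi(\xi^*)} \le [1 + \gamma(\xi)/\phi(\xi)]^{1/2} \, r^*(\xi) ,$$

indicating that $\mathcal{S}(\xi)$ is an $\epsilon$-core set for $\epsilon$ given by (11).

These connections are used in particular in [28] to give a thorough characterisation of the convergence properties of two algorithms that generate sequences of measures $\xi^k$, in terms of their associated $(1+\epsilon_k)$-approximations and $\epsilon_k$-core sets. See also Section 4.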
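The quantities (10) and (11) are cheap to evaluate for a finite set. The sketch below, with a hypothetical helper `approximation_level` and made-up data, returns the centre and radius of the enclosing ball $\mathcal{B}(\xi)$ together with the level $\epsilon(\xi)$.

```python
import math

def approximation_level(points, weights):
    """Return c(xi), the radius sqrt(phi + gamma) of B(xi), and eps(xi) of (11)."""
    d = len(points[0])
    c = [sum(w * x[j] for w, x in zip(weights, points)) for j in range(d)]
    sq = [sum((x[j] - c[j]) ** 2 for j in range(d)) for x in points]
    phi = sum(w * s for w, s in zip(weights, sq))
    gamma = max(sq) - phi                      # suboptimality gap (10)
    eps = math.sqrt(1.0 + gamma / phi) - 1.0   # approximation level (11)
    return c, math.sqrt(phi + gamma), eps

# Right triangle with legs 4 and 3: the smallest enclosing ball has the
# hypotenuse as a diameter, so its radius is 2.5.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
c, radius, eps = approximation_level(points, [1 / 3, 1 / 3, 1 / 3])
```

With uniform weights the returned radius is larger than the optimal radius 2.5 but does not exceed $(1+\epsilon) \times 2.5$, in agreement with (11).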


3.3 The inequality

Following an approach similar to [11, 15], we now prove the main result of the paper.

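Theorem 3.2 For any compact subset $\mathcal{X} \subset \mathbb{R}^d$ and any probability measure $\xi$ on $\mathcal{X}$, any $y \in \mathcal{X}$ such that

$$\|y - c(\xi)\|^2 < b[\phi(\xi), \gamma(\xi)] = \phi(\xi) + \gamma(\xi) - \sqrt{\gamma(\xi)\,[2\phi(\xi) + \gamma(\xi)]} , \tag{14}$$

where $\phi(\xi)$ and $\gamma(\xi)$ are respectively defined by (7) and (10), is in the interior of the smallest ball $\mathcal{B}^*(\mathcal{X})$ enclosing $\mathcal{X}$.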

Proof. Take any $\xi$ in $\Xi$ and consider $\gamma(\xi)$ defined by (10). Then, $\|x - c(\xi)\|^2 \le \phi(\xi) + \gamma(\xi)$ for all $x \in \mathcal{X}$, which implies that

$$\int_{\mathcal{X}} \|x - c(\xi)\|^2 \, \xi^*(dx) = \phi(\xi^*) + \|c(\xi^*) - c(\xi)\|^2 \le \phi(\xi) + \gamma(\xi) \tag{15}$$

for an optimal measure $\xi^*$. Also, (9) implies

$$\int_{\mathcal{X}} \|x - c(\xi^*)\|^2 \, \xi(dx) = \phi(\xi) + \|c(\xi^*) - c(\xi)\|^2 \le \phi(\xi^*) . \tag{16}$$

Consider now any $y$ on the boundary of $\mathcal{B}^*(\mathcal{X})$. From Theorem 3.1 and the triangle inequality, it satisfies $\|y - c(\xi^*)\| = \sqrt{\phi(\xi^*)} \le \|y - c(\xi)\| + \|c(\xi) - c(\xi^*)\|$, that is,

$$\|y - c(\xi)\| \ge \sqrt{\phi(\xi^*)} - \|c(\xi^*) - c(\xi)\| . \tag{17}$$

We do not know the values of $\phi(\xi^*)$ and $c(\xi^*)$, but we can compute a lower bound on the right-hand side of (17), using (15) and (16). Denote $u = \sqrt{\phi(\xi^*)}$ and $v = \|c(\xi^*) - c(\xi)\|$. The set $\{(u, v) \in \mathbb{R}^2 : u^2 + v^2 \le \phi(\xi) + \gamma(\xi) \text{ and } u^2 - v^2 \ge \phi(\xi)\}$ is convex, and the minimum of $u - v$ is obtained for $u = \sqrt{\phi(\xi) + \gamma(\xi)/2}$ and $v = \sqrt{\gamma(\xi)/2}$; Figure 1 gives an illustration. Therefore, (17) implies that $\|y - c(\xi)\|^2 \ge b[\phi(\xi), \gamma(\xi)]$. □

Figure 1. Determination of the lower bound (14) in the proof of Theorem 3.2: admissible set for (u, v) (coloured) and optimum point minimising u − v (dot).
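For a finite set, Theorem 3.2 translates into a simple sieve. The sketch below is a minimal illustration under stated assumptions: the names `lower_bound` and `sieve`, the sample points and the two-point weights are all hypothetical, and the measure is passed as a plain list of weights.

```python
import math

def lower_bound(phi, gamma):
    """b(phi, gamma) of (14)."""
    return phi + gamma - math.sqrt(gamma * (2.0 * phi + gamma))

def sieve(points, weights):
    """Keep the points that (14) cannot certify as interior to the ball."""
    d = len(points[0])
    c = [sum(w * x[j] for w, x in zip(weights, points)) for j in range(d)]
    sq = [sum((x[j] - c[j]) ** 2 for j in range(d)) for x in points]
    phi = sum(w * s for w, s in zip(weights, sq))
    gamma = max(sq) - phi
    b = lower_bound(phi, gamma)
    return [x for x, s in zip(points, sq) if s >= b]

# Measure supported on the first two points only (weights 1/2 each).
points = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.2), (0.0, 0.9), (0.0, 0.1)]
weights = [0.5, 0.5, 0.0, 0.0, 0.0]
kept = sieve(points, weights)
```

Here only the point $(0, 0.1)$ is removed. The point $(0, 0.9)$ is also interior to the smallest enclosing ball but survives the sieve, which illustrates that (14) is a sufficient condition for being inessential, not a necessary one.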


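Note that $b(\phi, \gamma) = \phi + \gamma - \sqrt{\gamma\,[2\phi + \gamma]}$ is decreasing in $\gamma$, with $b(\phi, 0) = \phi$ and $\lim_{\gamma \to \infty} b(\phi, \gamma) = 0$. The right-hand side of (14) gives the tightest lower bound on $\|y - c(\xi)\|^2$ for a $y$ on the boundary of $\mathcal{B}^*(\mathcal{X})$, in the sense of the following theorem.

Theorem 3.3 For any integer $d \ge 2$, any $\gamma > 0$ and any $\delta > 0$, there exist a compact subset $\mathcal{X}$ of $\mathbb{R}^d$, a probability measure $\xi$ on $\mathcal{X}$, and a point $y$ on the boundary of $\mathcal{B}^*(\mathcal{X})$ such that $\gamma = \max_{x \in \mathcal{X}} \|x - c(\xi)\|^2 - \phi(\xi)$ and $\|y - c(\xi)\|^2 < b[\phi(\xi), \gamma] + \delta$, with $b(\phi, \gamma)$ as in Theorem 3.2.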

Proof. The proof relies on the construction of an example. The dimension $d$ is irrelevant, and we only need to consider a finite set $\mathcal{X}_3$ with three points $X_1$, $X_2$ and $X_3$ whose first two coordinates are respectively $(0, -1)$, $(0, 1)$ and $(1 + a, 0)$, $a > 0$, with $\xi$ the measure that allocates weights $\alpha$, $\alpha$ and $1 - 2\alpha$ to $X_1$, $X_2$ and $X_3$, $\alpha \in (0, 1/2)$. Then, the first two coordinates of $c(\xi)$ are $((1 - 2\alpha)(1 + a), 0)$, and $\phi(\xi) = 2\alpha\,[1 + (1 + a)^2 (1 - 2\alpha)]$. Also, $\|X_1 - c(\xi)\|^2 - \phi(\xi) = \|X_2 - c(\xi)\|^2 - \phi(\xi) = (1 - 2\alpha)[(1 + a)^2 (1 - 4\alpha) + 1]$ and $\|X_3 - c(\xi)\|^2 - \phi(\xi) = -2\alpha\,[(1 + a)^2 (1 - 4\alpha) + 1]$, so that $\gamma = \|X_1 - c(\xi)\|^2 - \phi(\xi) = \|X_2 - c(\xi)\|^2 - \phi(\xi)$ for any $a \ge 0$ when $\alpha < 1/4$. For any $\alpha < 1/4$ and $\delta > 0$, we can then choose $a$ smaller than some $h(\alpha, \delta)$ to obtain $\|X_3 - c(\xi)\|^2 < \phi(\xi) + \gamma - \sqrt{\gamma\,[2\phi(\xi) + \gamma]} + \delta$. For instance, when $\alpha = 1/6$, we can take $a < h(1/6, \delta) = \sqrt{9\delta - 1 + 2\sqrt{27\delta^2 + 9\delta + 1}} - 1$. On the other hand, the smallest ball containing $\{X_1, X_2\}$ is $\mathcal{B}_d(0, 1)$, which shows that $X_3$ is on the boundary of $\mathcal{B}^*(\mathcal{X}_3)$ since $\|X_3\| > 1$. □
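The three-point construction used in this proof is easy to reproduce numerically. The following sketch is illustrative only; the function names are assumptions, and the value $\alpha = 1/6$ is the special case quoted in the proof.

```python
import math

def three_point_example(a, alpha):
    """Return (||X3 - c||^2, b(phi, gamma)) for the example of Theorem 3.3."""
    pts = [(0.0, -1.0), (0.0, 1.0), (1.0 + a, 0.0)]
    w = [alpha, alpha, 1.0 - 2.0 * alpha]
    c = [sum(wi * p[j] for wi, p in zip(w, pts)) for j in range(2)]
    sq = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for p in pts]
    phi = sum(wi * s for wi, s in zip(w, sq))
    gamma = max(sq) - phi
    b = phi + gamma - math.sqrt(gamma * (2.0 * phi + gamma))
    return sq[2], b

def h_one_sixth(delta):
    """Threshold h(1/6, delta) quoted in the proof."""
    return math.sqrt(
        9.0 * delta - 1.0 + 2.0 * math.sqrt(27.0 * delta**2 + 9.0 * delta + 1.0)
    ) - 1.0

delta = 0.01
d3, b = three_point_example(0.5 * h_one_sixth(delta), 1.0 / 6.0)
```

For any $a$ below the threshold, the boundary point $X_3$ satisfies $b \le \|X_3 - c(\xi)\|^2 < b + \delta$, and the gap equals $\delta$ when $a$ is exactly $h(1/6, \delta)$.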

It is instructive to compare the bound $b[\phi(\xi), \gamma(\xi)]$ in (14) with that derived in [2]. One may first note that (15) and (16) imply that

$$\text{for any } \xi \in \Xi , \quad \|c(\xi^*) - c(\xi)\|^2 \le \frac{\gamma(\xi)}{2} = \phi(\xi)\, \frac{2\epsilon + \epsilon^2}{2} , \tag{18}$$

with $\epsilon$ given by (11), whereas the simple geometric arguments used in [2] only give $\|c(\xi^*) - c(\xi)\|^2 \le \phi(\xi)\,(2\epsilon + \epsilon^2)$. In the same paper, the authors combine this inequality with (17) and obtain that any point $y$ on the boundary of $\mathcal{B}^*(\mathcal{X})$ satisfies

$$\|y - c(\xi)\| \ge \sqrt{\phi(\xi)}\,[1 - (2\epsilon + \epsilon^2)^{1/2}] = \sqrt{\phi(\xi)}\,\big[1 - \sqrt{\gamma(\xi)}/\sqrt{\phi(\xi)}\big] .$$

Note that $\gamma(\xi)$ must be smaller than $\phi(\xi)$ (i.e., $\epsilon < \sqrt{2} - 1$) in order to get a positive bound able to eliminate points. To compare this result with Theorem 3.2, denote

$$b_{AY}[\phi(\xi), \gamma(\xi)] = \phi(\xi)\,\big[\max\{1 - \sqrt{\gamma(\xi)}/\sqrt{\phi(\xi)},\, 0\}\big]^2 ; \tag{19}$$

$b_{AY}(\phi, \gamma)$ is decreasing in $\gamma$, with $b_{AY}(\phi, 0) = \phi$ and $b_{AY}(\phi, \gamma) = 0$ for $\gamma \ge \phi$, and $b_{AY}(\phi, \gamma) < b(\phi, \gamma)$ given by (14) for any $\phi > 0$ and $\gamma > 0$. We can also write $b_{AY}(\phi, \gamma) = \phi\,[\max\{1 - (2\epsilon + \epsilon^2)^{1/2}, 0\}]^2$ and $b(\phi, \gamma) = \phi\,[(1 + \epsilon)^2 - \{\epsilon (2 + \epsilon)[1 + (1 + \epsilon)^2]\}^{1/2}]$, with $\epsilon$ the approximation level $\epsilon = (1 + \gamma/\phi)^{1/2} - 1$, see (11). Figure 2-left presents $b(\phi, \gamma)/\phi$ (solid line) and $b_{AY}(\phi, \gamma)/\phi$ (dashed line) as functions of $\epsilon \in [0, 1]$; the difference between the two curves is shown on the right part. The superiority of $b(\phi, \gamma)$ compared to $b_{AY}(\phi, \gamma)$ is also significant for small $\epsilon$, so that when approaching the optimum with an iterative algorithm, the elimination of inessential points is likely to be more efficient with (14) than when using the bound in [2]. Note that the computational costs of the two bounds are roughly equivalent.
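The two normalised bounds can be tabulated directly from their expressions in terms of $\epsilon$. This is a small sketch with hypothetical function names, meant only to make the comparison concrete.

```python
import math

def b_ratio(eps):
    """b(phi, gamma)/phi of (14), written in terms of eps from (11)."""
    delta = eps * (2.0 + eps)               # delta = gamma / phi
    return 1.0 + delta - math.sqrt(delta * (2.0 + delta))

def b_ay_ratio(eps):
    """b_AY(phi, gamma)/phi of (19), in terms of the same eps."""
    delta = eps * (2.0 + eps)
    return max(1.0 - math.sqrt(delta), 0.0) ** 2

# Rows of (eps, b/phi, b_AY/phi) for a few approximation levels.
table = [(e, b_ratio(e), b_ay_ratio(e)) for e in (0.01, 0.1, 0.3, 0.5, 1.0)]
```

The second bound vanishes as soon as $\epsilon \ge \sqrt{2} - 1$, whereas the first remains strictly positive for every $\epsilon$.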


Figure 2. $b(\phi, \gamma)/\phi$ (solid line, left), $b_{AY}(\phi, \gamma)/\phi$ (dashed line, left) and $[b(\phi, \gamma) - b_{AY}(\phi, \gamma)]/\phi$ (right) as functions of $\epsilon = (1 + \gamma/\phi)^{1/2} - 1$.

3.4 Effectiveness of the elimination

Take any probability measure $\xi$ on $\mathcal{X}$ and consider a point $y$ eliminated by (14), that is, such that $\|y - c(\xi)\|^2 < b[\phi(\xi), \gamma(\xi)]$. By construction of the bound (14), it satisfies $\|y - c(\xi^*)\| \le \sqrt{\phi(\xi^*)}$ (this can be directly checked, using the triangle inequality $\|y - c(\xi^*)\| \le \|y - c(\xi)\| + \|c(\xi) - c(\xi^*)\|$ and the inequalities (16) and (18)). Therefore, $y$ belongs to $\mathcal{B}^*(\mathcal{X}) = \mathcal{B}_d\big(c(\xi^*), \sqrt{\phi(\xi^*)}\big)$. Let $\mathcal{I}(\xi)$ denote the set of inessential points eliminated by (14) and $\mu$ denote the Lebesgue measure on $\mathcal{X}$. We thus have

$$\omega(\xi) = \frac{\mu[\mathcal{I}(\xi) \cap \mathcal{B}^*(\mathcal{X})]}{\mu[\mathcal{B}^*(\mathcal{X})]} = \frac{\mu[\mathcal{I}(\xi)]}{\mu[\mathcal{B}^*(\mathcal{X})]} \le \left\{ \frac{b[\phi(\xi), \gamma(\xi)]}{\phi(\xi^*)} \right\}^{d/2} .$$

Denote $\delta(\xi) = \gamma(\xi)/\phi(\xi)$, and suppose that $\delta(\xi) = \delta > 0$. Then, $b[\phi(\xi), \gamma(\xi)] = \phi(\xi)\,(1 + \delta - \sqrt{\delta (2 + \delta)})$ and Lemma 3.2 of [28] implies that $\phi(\xi^*) > \phi(\xi)\,(1 + \delta^2/[4(1 + \delta)])$. Therefore,

$$\omega(\xi) < h^{d/2}(\delta) , \tag{20}$$

with $h(\delta) = 4(1 + \delta)(1 + \delta - \sqrt{\delta (2 + \delta)})/[4(1 + \delta) + \delta^2] < 1$, implying that $\mu[\mathcal{I}(\xi)]/\mu[\mathcal{B}^*(\mathcal{X})] \to 0$ as $d \to \infty$. We can thus expect that in general, for points $X_i$ approximately uniformly distributed in a compact set, the effectiveness of the sieve formed by (14) will decrease as the dimension $d$ increases. This can be investigated more precisely in some simple situations. Define

$$\alpha(\xi) = \frac{\mu[\mathcal{I}(\xi)]}{\mu(\mathcal{X})} ,$$

the proportion of points eliminated by (14), and let $\xi_u$ denote the uniform probability measure on $\mathcal{X}$.

$\mathcal{X}$ is the $d$-dimensional ball $\mathcal{B}_d(0, 1)$. In that case, $\mathcal{X} = \mathcal{B}^*(\mathcal{X})$ and $\alpha(\xi) = \omega(\xi)$ for any $\xi$. When $x \sim \xi_u$, then $\|x\|$ has the density $\varphi(r) = d\, r^{d-1}$, $r \in [0, 1]$, and $\phi(\xi_u) = d/(d + 2)$, $\gamma(\xi_u) = 1 - \phi(\xi_u) = 2/(d + 2)$. This gives

$$b[\phi(\xi_u), \gamma(\xi_u)] = 1 - \frac{2\sqrt{d + 1}}{d + 2}$$

and therefore

$$\alpha(\xi_u) = b^{d/2}[\phi(\xi_u), \gamma(\xi_u)] = \left( 1 - \frac{2\sqrt{d + 1}}{d + 2} \right)^{d/2} ,$$

which is a decreasing function of $d$, the values of $\alpha(\xi_u)$ being already moderate for small $d$, with $\alpha(\xi_u | d = 2) = 1 - \sqrt{3}/2 \simeq 0.1340$ and $\alpha(\xi_u | d = 3) = \sqrt{5}/25 \simeq 0.0894$. Similarly, for the bound (19) of [2] we obtain $b_{AY}[\phi(\xi_u), \gamma(\xi_u)] = d/(d + 2)\,(1 - \sqrt{2/d})^2$ for $d > 2$ (and 0 for $d = 1, 2$). The values of $b[\phi(\xi_u), \gamma(\xi_u)]$ and $b_{AY}[\phi(\xi_u), \gamma(\xi_u)]$ are plotted against $d$ in Figure 3-left; the corresponding proportions $\alpha(\xi_u)$ are presented in Figure 3-right.

Figure 3. $b[\phi(\xi_u), \gamma(\xi_u)]$ (stars, left), $b_{AY}[\phi(\xi_u), \gamma(\xi_u)]$ (triangles, left) and corresponding proportions $\alpha(\xi_u)$ of eliminated points (right, log-scale) as functions of $d$.
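The closed-form proportions for the unit ball can be evaluated in a few lines. The function names below are assumptions made for this sketch; the formulas are those stated above for the uniform measure $\xi_u$.

```python
import math

def alpha_ball(d):
    """Proportion of the unit ball eliminated by (14) with the uniform measure."""
    b = 1.0 - 2.0 * math.sqrt(d + 1.0) / (d + 2.0)
    return b ** (d / 2.0)

def alpha_ball_ay(d):
    """Same proportion when the bound (19) replaces (14)."""
    if d <= 2:
        return 0.0
    b_ay = d / (d + 2.0) * (1.0 - math.sqrt(2.0 / d)) ** 2
    return b_ay ** (d / 2.0)

# (d, proportion with (14), proportion with (19)) for small dimensions.
rows = [(d, alpha_ball(d), alpha_ball_ay(d)) for d in range(2, 11)]
```

The first two rows reproduce the values quoted in the text for $d = 2$ and $d = 3$, and the proportion obtained with (14) stays above the one obtained with (19) for every dimension in the list.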

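$\mathcal{X}$ is the hypercube $[-1/2, 1/2]^d$. Direct calculation gives $\phi(\xi_u) = d/12$ and $\gamma(\xi_u) = d/4 - \phi(\xi_u) = d/6$, so that $b[\phi(\xi_u), \gamma(\xi_u)] = d\,(1/4 - \sqrt{2}/6)$. For $d \le 17$, $\mathcal{B}_d\big(c(\xi_u), b^{1/2}[\phi(\xi_u), \gamma(\xi_u)]\big) \subset \mathcal{X}$, and $\alpha(\xi_u) = b^{d/2}[\phi(\xi_u), \gamma(\xi_u)]\, V_d$, with $V_d = \mathrm{vol}[\mathcal{B}_d(0, 1)] = \pi^{d/2}/\Gamma(d/2 + 1)$ the volume of the $d$-dimensional unit ball $\mathcal{B}_d(0, 1)$. Again $\alpha(\xi_u)$ is a decreasing function of $d$, with $\alpha(\xi_u | d = 2) = (3 - 2\sqrt{2})\, \pi/6 \simeq 0.0898$ and $\alpha(\xi_u | d = 3) = (\sqrt{2} - 1)^3 \pi/6 \simeq 0.0372$. Note that (19) does not allow any point to be eliminated since $\gamma(\xi_u) > \phi(\xi_u)$.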
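The hypercube values can be checked in the same way. In this sketch the function name is hypothetical, and the guard reflects the restriction $d \le 17$ under which the eliminated ball is contained in the cube.

```python
import math

def alpha_cube(d):
    """Proportion of [-1/2, 1/2]^d eliminated by (14) with the uniform measure."""
    b = d * (0.25 - math.sqrt(2.0) / 6.0)
    if math.sqrt(b) > 0.5:
        # The ball of radius sqrt(b) sticks out of the cube: formula not valid.
        raise ValueError("formula only valid for d <= 17")
    unit_ball_volume = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    return b ** (d / 2.0) * unit_ball_volume

values = {d: alpha_cube(d) for d in range(2, 18)}
```

The entries for $d = 2$ and $d = 3$ match the values given in the text, and the proportion decreases quickly with $d$.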

Although (20) indicates that the effectiveness of the elimination of inessential points decreases with $d$ for a fixed $\delta$ (that is, for a fixed level of approximation $1 + \epsilon = \sqrt{1 + \delta}$, see Section 3.2), the proportion $\alpha(\xi)$ can be significant when $\xi$ approaches optimality (so that $\delta = \delta(\xi)$ is small enough in (20)). In particular, algorithms for the solution of the dual formulation of the smallest enclosing ball problem generate sequences of measures $\xi^k$ that can be used as sieves to progressively eliminate points. Two such methods are presented in the next section.

4. Algorithms for the dual

4.1 A multiplicative algorithm

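We return to the case of a finite set $\mathcal{X}_n$, with $w_i = \xi(X_i)$ the weight allocated by the measure $\xi$ to $X_i$. Starting from the uniform weights $w_i^0 = 1/n$, consider the application of successive iterations of the form

$$w_i^{k+1} = \widehat{w}_i^{k+1} = w_i^k \, \frac{\|X_i - c(w^k)\|^2}{\sum_{j=1}^n w_j^k \|X_j - c(w^k)\|^2} , \quad i = 1, \ldots, n . \tag{21}$$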

This type of algorithm is called multiplicative in the literature on optimal experimental design: the weights $w_i^k$ of the measure $\xi^k$ at iteration $k$ are simply multiplied by positive factors $f_i(w^k)/\sum_{j=1}^n w_j^k f_j(w^k)$, with here $f_i(w^k) = \|X_i - c(w^k)\|^2 = \partial\phi(w)/\partial w_i \big|_{w = w^k}$. In the case of D-optimal design, similar iterations ensure monotonic convergence to the minimum-volume ellipsoid containing $\mathcal{X}_n$, see [22, 23, 29]. Here the iteration (21) does not guarantee that $\phi(w^{k+1}) > \phi(w^k)$ for all non-optimal $w^k$, and, following [27], we consider iterations of the (more general) form

$$w_i^{k+1} = \widetilde{w}_i^{k+1}(\beta_k) = w_i^k \, [1 + \beta_k F_\phi(\xi^k; \delta_{X_i})] = w_i^k \, \{1 + \beta_k [\|X_i - c(w^k)\|^2 - \phi(w^k)]\} , \tag{22}$$

where $\beta_k \ge 0$, $F_\phi(\xi; \nu)$ is the directional derivative defined in (8), and where $\xi^k$ allocates weight $w_i^k$ to $X_i$, $i = 1, \ldots, n$. Note that $\sum_{i=1}^n \widetilde{w}_i^{k+1}(\beta_k) = 1$ and that all $\widetilde{w}_i^{k+1}(\beta_k)$ remain non-negative if $\beta_k$ is small enough. Also note that $\widetilde{w}_i^{k+1}[1/\phi(w^k)] = \widehat{w}_i^{k+1}$ given by (21). The iteration (22) corresponds to a projected second-order method for the maximisation of $\phi(w)$, see [27] and [17, Sect. 9.1], and there always exists a step-size $\beta_k > 0$ such that $\phi(w^{k+1}) > \phi(w^k)$ when $w^k$ is not optimal. Since here $\phi[\widetilde{w}^{k+1}(\beta_k)]$ is quadratic in $\beta_k$, the maximising value $\beta_k^*$ can be calculated explicitly and is given by

$$\beta_k^* = \frac{\sum_{i=1}^n \widehat{w}_i^{k+1} \, [\|X_i - c(w^k)\|^2 - \phi(w^k)]}{2\, \phi(w^k) \, \|c(\widehat{w}^{k+1}) - c(w^k)\|^2} , \tag{23}$$

where the components of $\widehat{w}^{k+1}$ are given by (21). Since the iteration (21) is simpler than (22)-(23), it is advisable to always try the former first, and switch to the latter only if (21) does not yield an increase of $\phi(\cdot)$ (numerical experimentation indicates that this is rather exceptional). To ensure that all components of $\widetilde{w}^{k+1}(\beta_k)$ remain non-negative, we should normally take $\beta_k = \min\{\beta_k^*, \beta_{k,\max}\}$, where $\beta_{k,\max} = [\phi(w^k) - \min_{j=1,\ldots,n} \|X_j - c(w^k)\|^2]^{-1} \ge 1/\phi(w^k)$, see (22). However, from the quadratic dependence of $\phi[\widetilde{w}^{k+1}(\beta_k)]$ on $\beta_k$, $\phi(\widehat{w}^{k+1}) \le \phi(w^k)$ is equivalent to $1/\phi(w^k) \ge 2\beta_k^*$ and thus implies $\beta_{k,\max} \ge 2\beta_k^*$. The construction is summarised in Algorithm 1.

Algorithm 1 stops when a $(1+\epsilon_k)$-approximation of $\mathcal{B}^*(\mathcal{X}_n)$ is obtained, with $\epsilon_k = \sqrt{1 + \gamma(w^k)/\phi(w^k)} - 1 < \epsilon$. The sequence $\{\phi(w^k)\}$ is monotonically increasing, but the investigation of its convergence properties as $k \to \infty$ is out of the scope of this paper and will be considered elsewhere. The complexity of each iteration is roughly proportional to $n$, and the algorithm may benefit from the elimination of inessential points using the results of Section 3.3. This is considered in the next section.
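The multiplicative iteration, with the fallback step (22)-(23), can be sketched in pure Python as follows. This is an illustrative implementation under stated assumptions, not the author's code: the function name, the `max_iter` safeguard and the early exit when the denominator of (23) vanishes are additions made for the example, and the points are assumed not all identical so that $\phi(w^k) > 0$.

```python
import math

def multiplicative_seb(points, eps=1e-3, max_iter=10000):
    """Return (weights, centre, eps_k) after multiplicative iterations."""
    d = len(points[0])

    def stats(w):
        c = [sum(wi * x[j] for wi, x in zip(w, points)) for j in range(d)]
        sq = [sum((x[j] - c[j]) ** 2 for j in range(d)) for x in points]
        return c, sq, sum(wi * s for wi, s in zip(w, sq))

    w = [1.0 / len(points)] * len(points)           # uniform start
    c, sq, phi = stats(w)
    for _ in range(max_iter):                       # safeguard (assumption)
        if (max(sq) - phi) / phi <= (1.0 + eps) ** 2 - 1.0:
            break
        w_hat = [wi * s / phi for wi, s in zip(w, sq)]          # step (21)
        c_hat, sq_hat, phi_hat = stats(w_hat)
        if phi_hat > phi:
            w, c, sq, phi = w_hat, c_hat, sq_hat, phi_hat
            continue
        den = 2.0 * phi * sum((a - b) ** 2 for a, b in zip(c_hat, c))
        if den == 0.0:                              # no progress possible
            break
        beta = sum(wh * (s - phi) for wh, s in zip(w_hat, sq)) / den  # (23)
        w = [wi * (1.0 + beta * (s - phi)) for wi, s in zip(w, sq)]   # (22)
        c, sq, phi = stats(w)
    eps_k = math.sqrt(1.0 + (max(sq) - phi) / phi) - 1.0
    return w, c, eps_k

# Hypothetical example: the ball is defined by the first two points.
w, c, eps_k = multiplicative_seb([(0.0, 0.0), (4.0, 0.0), (2.0, 1.0)])
```

On this small example the weight of the interior point $(2, 1)$ shrinks geometrically and the centre approaches $(2, 0)$ within a handful of iterations.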

4.2 Elimination of inessential points by the multiplicative algorithm

The uniform measure, with $w_i^0 = 1/n$ for all $i$, used to initialise Algorithm 1 can be used to eliminate inessential points from $\mathcal{X}_n$. For a given $n$, the proportion $\alpha(w^0)$ of points that can be eliminated depends on the precise location of the $X_i$, but we can consider

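Algorithm 1 Multiplicative algorithm for the smallest enclosing ball problem

Require: $\mathcal{X}_n$ a set of $n$ points in $\mathbb{R}^d$ and $\epsilon > 0$.
Set $w_i^0 = 1/n$ for $i = 1, \ldots, n$; $k \leftarrow 0$; compute $c(w^0)$, $\phi(w^0)$ and $\gamma(w^0)$.
while $\gamma(w^k)/\phi(w^k) > (1 + \epsilon)^2 - 1$ do
  compute $\widehat{w}^{k+1}$ given by (21), compute $c(\widehat{w}^{k+1})$ and $\phi(\widehat{w}^{k+1})$;
  if $\phi(\widehat{w}^{k+1}) > \phi(w^k)$ then
    set $w^{k+1} = \widehat{w}^{k+1}$;
  else
    compute $w^{k+1} = \widetilde{w}^{k+1}(\beta_k^*)$ given by (22)-(23), compute $c(w^{k+1})$ and $\phi(w^{k+1})$;
  end if
  compute $\gamma(w^{k+1})$, $k \leftarrow k + 1$;
end while
return $w^k$, $c(w^k)$, $\epsilon_k = \sqrt{1 + \gamma(w^k)/\phi(w^k)} - 1$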

a compact set $\mathcal{X} \subset \mathbb{R}^d$ with strictly positive $d$-dimensional Lebesgue measure $\mu$ and equal to the closure of its interior. The $X_i$ may be independently and identically distributed in $\mathcal{X}$ with the probability measure $\xi_u = \mu/\mathrm{vol}(\mathcal{X})$, with $\mathrm{vol}(\mathcal{X})$ the volume of $\mathcal{X}$, or they may correspond to the first $n$ points of a low-discrepancy sequence on $\mathcal{X}$, see, e.g., [13, Chap. 3]. In both situations,

$$\lim_{n \to \infty} \alpha(w^0) = \alpha(\xi_u) = \xi_u \Big\{ \mathcal{B}_d\big( c(\xi_u), b^{1/2}[\phi(\xi_u), \gamma(\xi_u)] \big) \cap \mathcal{X} \Big\} ,$$

where $b(\phi, \gamma)$ is given by (14) and the convergence is almost sure when the $X_i$ are i.i.d. The values of $\alpha(\xi_u)$ obtained in Section 3.4 for the case where $\mathcal{X}$ is a $d$-dimensional ball or hypercube suggest that the elimination of inessential points via (14) will generally not be very effective when using $\xi_u$ only. Below we investigate how the situation improves when applying several iterations (21).

In terms of probability measures, the iteration (21) can be written as

$$\xi^{k+1}(dx) = \frac{\|x - c(\xi^k)\|^2 \, \xi^k(dx)}{\int_{y \in \mathcal{X}} \|y - c(\xi^k)\|^2 \, \xi^k(dy)} , \quad x \in \mathcal{X} .$$

When initialised at the uniform measure $\xi_u$ on $\mathcal{X}$, it corresponds to the limiting behaviour of (21) as $n \to \infty$ for points $X_i$ uniformly distributed in $\mathcal{X}$. When 0 is a centre of symmetry for $\mathcal{X}$, $\phi(\xi^{k+1}) > \phi(\xi^k)$, $c(\xi^k) = 0$ and $\max_{x \in \mathcal{X}} \|x - c(\xi^k)\|^2 = M$ for all $k$, with $M = 1$ when $\mathcal{X} = \mathcal{B}_d(0, 1)$ and $M = d/4$ when $\mathcal{X} = [-1/2, 1/2]^d$. Direct calculation gives $b[\phi, M - \phi] = M - \sqrt{M^2 - \phi^2}$, which is increasing in $\phi$, so that $\alpha(\xi^{k+1}) > \alpha(\xi^k)$.

Consider the case $\mathcal{X} = \mathcal{B}_d(0, 1)$. After $k$ iterations, $\phi(\xi^k) = \int_0^1 r^2 \varphi_k(r)\, dr$, with $\varphi_k(r) = (d + 2k)\, r^{d-1+2k}$, which gives $\phi(\xi^k) = (d + 2k)/(d + 2k + 2)$. The proportion of points eliminated by (14) after those $k$ iterations is

$$\alpha(\xi^k) = \left\{ 1 - [1 - \phi^2(\xi^k)]^{1/2} \right\}^{d/2} = \left( 1 - \frac{2\sqrt{d + 1 + 2k}}{d + 2 + 2k} \right)^{d/2} , \tag{24}$$

which is decreasing in $d$ for fixed $k$, but increases in $k$ for fixed $d$, with $\lim_{k \to \infty} \alpha(\xi^k) = 1$.

The value of $\alpha(\xi^k)$ slightly improves when inessential points are removed after each iteration, provided the mass of eliminated points is suitably distributed on the remaining ones. Suppose for instance that we simply renormalise the total mass of remaining points. Then, at iteration $k \ge 1$, $\phi(\xi^k) = \int_{A^{1/2}(\xi^{k-1})}^1 r^2 \varphi_k(r)\, dr$, where $\varphi_k(r) = (d + 2k)\, [1 - A^{(d+2k)/2}(\xi^{k-1})]^{-1}\, r^{d-1+2k}$, $r \in [A^{1/2}(\xi^{k-1}), 1]$, with $A(\xi) = 1 - \sqrt{1 - \phi^2(\xi)}$. This gives

$$\phi(\xi^k) = \frac{d + 2k}{d + 2(k + 1)} \, \frac{1 - A^{d/2+k+1}(\xi^{k-1})}{1 - A^{d/2+k}(\xi^{k-1})} \quad \text{and} \quad \alpha(\xi^k) = A^{d/2}(\xi^k) , \quad k \ge 1 . \tag{25}$$

Numerical evaluations for different $d$ and $k$ indicate that $\alpha(\xi^k)$ is only marginally larger than the value in (24), with the consequence that trying to remove inessential points at each iteration of Algorithm 1 is generally not very efficient.
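The comparison between (24) and (25) can be reproduced with the short sketch below. The function names are hypothetical; the recursion simply applies (25) for $j = 1, \ldots, k$, starting from $\phi(\xi^0) = d/(d+2)$ for the uniform measure on the unit ball.

```python
import math

def alpha_no_removal(d, k):
    """Proportion (24): sieve applied once, after k iterations of (21)."""
    phi = (d + 2 * k) / (d + 2 * k + 2)
    return (1.0 - math.sqrt(1.0 - phi * phi)) ** (d / 2)

def alpha_with_removal(d, k):
    """Proportion (25): points removed and mass renormalised at each iteration."""
    phi = d / (d + 2)                               # uniform measure, k = 0
    for j in range(1, k + 1):
        a = 1.0 - math.sqrt(1.0 - phi * phi)        # A(xi^{j-1})
        m = d / 2 + j
        phi = (d + 2 * j) / (d + 2 * j + 2) * (1.0 - a ** (m + 1)) / (1.0 - a ** m)
    return (1.0 - math.sqrt(1.0 - phi * phi)) ** (d / 2)

pairs = [(d, k, alpha_no_removal(d, k), alpha_with_removal(d, k))
         for d in (2, 5, 10) for k in (1, 3, 10)]
```

For $d = 2$ and $k = 1$, for instance, the two proportions are roughly 0.255 and 0.264, which gives an idea of how small the gain from intermediate removal is in this setting.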

4.3 A vertex-direction algorithm

Algorithm 4.1 of [28] is similar to the algorithm of [25] for the construction of the minimum ellipsoid containing $\mathcal{X}_n$ and to the algorithm proposed in [3] for the construction of a D-optimal design measure. The detailed analysis in [28] indicates in particular that the algorithm asymptotically presents linear convergence; see also [1]. An initialisation at a two-point measure is proposed,

$$\xi_2 = (1/2)(\delta_{X_{i_1}} + \delta_{X_{i_2}}) , \quad \text{with } i_1 = \arg\max_{i=1,\ldots,n} \|X_i - X_1\| \text{ and } i_2 = \arg\max_{i=1,\ldots,n} \|X_i - X_{i_1}\| , \tag{26}$$

so that $w_{i_1} = w_{i_2} = 1/2$ and $w_i = 0$ for all $i \ne i_1, i_2$ (when the order of indices is randomised, $X_1$ can be considered as randomly drawn among the $X_i$). This construction ensures that $X_{i_1}$ and $X_{i_2}$ will be far apart, without requiring the computation of all $n(n-1)/2$ pairwise distances. It is a key argument in the complexity analysis of the algorithm. Direct calculation gives $\phi(\xi_2) = \|X_{i_1} - X_{i_2}\|^2/4$.
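The initialisation (26) needs only two passes over the data. The sketch below is a minimal illustration; the function name, the `start` argument (the index playing the role of $X_1$) and the sample points are assumptions.

```python
def two_point_init(points, start=0):
    """Return (i1, i2, c(xi_2), phi(xi_2)) for the two-point measure (26)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    n = len(points)
    # i1: furthest point from the starting point; i2: furthest point from X_{i1}.
    i1 = max(range(n), key=lambda i: sq_dist(points[i], points[start]))
    i2 = max(range(n), key=lambda i: sq_dist(points[i], points[i1]))
    centre = tuple((x + y) / 2.0 for x, y in zip(points[i1], points[i2]))
    phi = sq_dist(points[i1], points[i2]) / 4.0
    return i1, i2, centre, phi

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (2.0, 2.0), (-3.0, 0.0)]
i1, i2, centre, phi = two_point_init(points)
```

Each pass costs $n$ distance evaluations, in line with the remark above that the $n(n-1)/2$ pairwise distances are never needed.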

The method is summarised in Algorithm 2 below, with two small modifications compared with the original version in [28]: (i) the choice between a plus-iteration (displacement in the direction of the furthest point $X_{i_+}$ from the current centre $c(w^k)$) or a minus-iteration (reduction of the weight allocated to the closest point $X_{i_-}$ to $c(w^k)$ among the current support $\mathcal{J}(w^k)$) is based on the comparison between the values of $\phi(w^{k+1})$ corresponding to these two options, whereas [28] simply compares $\gamma(w^k)$ with $\gamma_-(w^k)$; (ii) the algorithm is stopped when $\gamma(w^k)/\phi(w^k) \le (1 + \epsilon)^2 - 1$, whereas the condition is $\max\{\gamma(w^k), \gamma_-(w^k)\}/\phi(w^k) \le (1 + \epsilon)^2 - 1$ in [28]. These minor differences do not modify the complexity analysis in the same paper, and the algorithm returns a $(1+\epsilon)$-approximation in $18 + 50/\epsilon$ iterations at most.

The two-point measure $\xi_2$ defined by (26) can also be used to eliminate inessential points. Let $X_{i^*}$ denote the furthest point in $\mathcal{X}_n$ from $c(\xi_2) = (X_{i_1} + X_{i_2})/2$. Then, $\|X_{i^*} - c(\xi_2)\| \le \sigma\, \|X_{i_2} - X_{i_1}\|$ for some $\sigma > 0$ implies that $\gamma(\xi_2)/\phi(\xi_2) \le 4\sigma^2 - 1$ and thus

$$\frac{b[\phi(\xi_2), \gamma(\xi_2)]}{\phi(\xi_2)} \ge \tau^2 = 4\sigma^2 - \sqrt{16\sigma^4 - 1} .$$

Any point $X_i$ such that $\|X_i - c(\xi_2)\| < (\tau/2)\, \|X_{i_2} - X_{i_1}\|$ is thus in the interior of $\mathcal{B}^*(\mathcal{X}_n)$. On the other hand, note that the bound $b_{AY}[\phi(\xi_2), \gamma(\xi_2)]$ given by (19) is informative only when $\sigma < \sqrt{2}/2$ (to ensure that $\gamma(\xi_2) < \phi(\xi_2)$). Since $\tau$ is decreasing in $\sigma$, the


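Algorithm 2 Vertex-direction algorithm for the smallest enclosing ball problem

Require: $\mathcal{X}_n$ a set of $n$ points in $\mathbb{R}^d$ and $\varepsilon > 0$.
  Set $w^0_{i_1} = w^0_{i_2} = 1/2$ and $w^0_i = 0$ for all $i \ne i_1, i_2$, where $i_1$ and $i_2$ are given by (26); $k \leftarrow 0$;
  Set $c(w^0) = (X_{i_1} + X_{i_2})/2$, $\phi(w^0) = \|X_{i_1} - X_{i_2}\|^2/4$, $J(w^0) = \{i_1, i_2\}$, $\gamma_-(w^0) = 0$, $i^- = 1$, compute $\gamma(w^0)$ and $i^+ = \arg\max_{i=1,\dots,n} \|X_i - c(w^0)\|$.
  while $\gamma(w^k)/\phi(w^k) > (1+\varepsilon)^2 - 1$ do
    if $\gamma(w^k) > \gamma_-(w^k)/[1 - \gamma_-(w^k)/\phi(w^k)]$ then
      compute $\alpha_k = \gamma(w^k)/\{2[\phi(w^k) + \gamma(w^k)]\}$,
      set $w^{k+1}_{i^+} = (1 - \alpha_k) w^k_{i^+} + \alpha_k$ and $w^{k+1}_i = (1 - \alpha_k) w^k_i$ for all $i \ne i^+$,
      compute $c(w^{k+1}) = (1 - \alpha_k) c(w^k) + \alpha_k X_{i^+}$;
    else
      compute $\alpha_k = \min\{\gamma_-(w^k)/\{2[\phi(w^k) - \gamma_-(w^k)]\},\ w^k_{i^-}/(1 - w^k_{i^-})\}$,
      set $w^{k+1}_{i^-} = (1 + \alpha_k) w^k_{i^-} - \alpha_k$ and $w^{k+1}_i = (1 + \alpha_k) w^k_i$ for all $i \ne i^-$,
      compute $c(w^{k+1}) = (1 + \alpha_k) c(w^k) - \alpha_k X_{i^-}$;
      if $\alpha_k = w^k_{i^-}/(1 - w^k_{i^-})$ then $J(w^{k+1}) = J(w^k) \setminus \{i^-\}$ else $J(w^{k+1}) = J(w^k)$ end if
    end if
    compute $\phi(w^{k+1})$, $\gamma(w^{k+1})$ and $i^+ = \arg\max_{i=1,\dots,n} \|X_i - c(w^{k+1})\|$,
    $i^- = \arg\min_{i \in J(w^{k+1})} \|X_i - c(w^{k+1})\|$ and $\gamma_-(w^{k+1}) = \phi(w^{k+1}) - \|X_{i^-} - c(w^{k+1})\|^2$;
    $k \leftarrow k + 1$;
  end while
  return $w^k$, $c(w^k)$, $\varepsilon_k = \sqrt{1 + \gamma(w^k)/\phi(w^k)} - 1$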
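For illustration, the steps of Algorithm 2 can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the implementation used for the experiments: the function name `vertex_direction_seb` and the `max_iter` guard are hypothetical; $X_1$ in (26) is taken to be the first row of the input, with $i_1$ the index of the point furthest from $X_1$ and $i_2$ the index of the point furthest from $X_{i_1}$; $\phi(w)$ is evaluated as $\sum_i w_i \|X_i - c(w)\|^2$, which agrees with the initial value $\|X_{i_1} - X_{i_2}\|^2/4$; and the support $J(w)$ is recomputed as the set of positive weights, so that an index reached by a plus-iteration enters it automatically.

```python
import numpy as np


def vertex_direction_seb(X, eps=1e-3, max_iter=100_000):
    """Sketch of Algorithm 2; returns (w, c, r) with r = max_i ||X_i - c||."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Two-point initialisation (26); assumption: X_1 is the first row of X.
    i1 = int(np.argmax(np.sum((X - X[0]) ** 2, axis=1)))
    i2 = int(np.argmax(np.sum((X - X[i1]) ** 2, axis=1)))
    w = np.zeros(n)
    w[i1] += 0.5
    w[i2] += 0.5
    c = w @ X
    tol = (1.0 + eps) ** 2 - 1.0
    for _ in range(max_iter):  # hypothetical safety guard, not in Algorithm 2
        d2 = np.sum((X - c) ** 2, axis=1)   # squared distances to c(w)
        phi = float(w @ d2)                 # phi(w), assumed form (see lead-in)
        i_plus = int(np.argmax(d2))
        gamma = d2[i_plus] - phi            # gamma(w)
        if gamma <= tol * phi:              # stopping rule of Algorithm 2
            break
        support = np.flatnonzero(w > 0.0)   # J(w), recomputed from the weights
        i_minus = int(support[np.argmin(d2[support])])
        d2_minus = d2[i_minus]
        gamma_minus = phi - d2_minus        # gamma_-(w)
        # Test gamma > gamma_- / (1 - gamma_-/phi), rewritten without division.
        if gamma * d2_minus > gamma_minus * phi:
            # Plus-iteration: move towards the furthest point.
            alpha = gamma / (2.0 * (phi + gamma))
            w *= 1.0 - alpha
            w[i_plus] += alpha
            c = (1.0 - alpha) * c + alpha * X[i_plus]
        else:
            # Minus-iteration: reduce the weight of the closest support point.
            drop = w[i_minus] / (1.0 - w[i_minus])
            step = gamma_minus / (2.0 * d2_minus) if d2_minus > 0.0 else np.inf
            alpha = min(step, drop)
            w *= 1.0 + alpha
            w[i_minus] -= alpha
            if alpha == drop:
                w[i_minus] = 0.0            # index leaves the support exactly
            c = (1.0 + alpha) * c - alpha * X[i_minus]
    r = float(np.sqrt(np.max(np.sum((X - c) ** 2, axis=1))))
    return w, c, r
```

For the five points `[[0, 0], [4, 0], [2, 3], [2, 1], [1, 0.5]]`, whose smallest enclosing ball is the circumscribed circle of the first three (center $(2, 5/6)$, radius $13/6$), a single plus-iteration from the two-point start already lands on the optimal center.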

$\mathcal{X} = \mathcal{B}_d(0, 1)$ or $\mathcal{X} = [-1/2, 1/2]^d$ we can take $\sigma = 1/2$, which gives $\tau = 1$: all points in the interior of $\mathcal{B}_d(c(\xi_2), \|X_{i_2} - X_{i_1}\|/2)$ are eliminated (and $\xi_2$ is optimal whatever the choice of $X_1$ in $\mathcal{X}$). More generally, Lemma 3.1 in [28] gives $\sigma = 3/2$ for any $\mathcal{X}_n$, since

$$\|X_{i^*} - c(\xi_2)\| \le \|X_{i^*} - X_{i_1}\| + \|X_{i_1} - c(\xi_2)\| \le \|X_{i_2} - X_{i_1}\| + \tfrac{1}{2} \|X_{i_1} - X_{i_2}\| = \tfrac{3}{2} \|X_{i_1} - X_{i_2}\| .$$

This bound is not tight, however: equality can only be achieved when $X_{i^*}$, $X_{i_1}$ and $X_{i_2}$ are aligned, with $X_{i_1}$ between $X_{i^*}$ and $X_{i_2}$, which contradicts the fact that $X_{i_1}$ is the furthest point in $\mathcal{X}_n$ from some $X_1$. A more precise analysis, see Appendix A, yields $\sigma = \sqrt{7}/2$, and the corresponding bound is tight. This indicates that, for any set $\mathcal{X}_n$ and for any point $X_1 \in \mathcal{X}_n$ used for the construction of $\xi_2$, any $X_i$ such that

$$\|X_i - c(\xi_2)\| < 0.133974\, \|X_{i_2} - X_{i_1}\| < \sqrt{7 - 4\sqrt{3}}\; \|X_{i_2} - X_{i_1}\|/2 \tag{27}$$

can always be eliminated². In practice, $\|X_{i^*} - c(\xi_2)\|$ is often much smaller than $\sqrt{7}\, \|X_{i_1} - X_{i_2}\|/2$, and $\xi_2$ generally proves more efficient than the uniform measure $\xi_u$ for eliminating inessential points. This is illustrated in the next section.

² Although the value $\sigma = \sqrt{7}/2$ gives a tight bound, one may notice that the inequality (27) is suboptimal since the worst-case situations in Theorem 3.3 and Lemma A.1 correspond to different measures.
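The two-point elimination test described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `tau` and `two_point_prefilter` are hypothetical, $X_1$ in (26) is taken to be the first row of the input, $\sigma$ is computed from the data as $\|X_{i^*} - c(\xi_2)\| / \|X_{i_2} - X_{i_1}\|$, and the test uses the lower bound $\tau^2 \phi(\xi_2)$ rather than the exact value of $b[\phi(\xi_2), \gamma(\xi_2)]$, so it may keep slightly more points than the sieve (14).

```python
import numpy as np


def tau(sigma):
    """tau >= 0 such that tau^2 = 4 sigma^2 - sqrt(16 sigma^4 - 1), sigma >= 1/2."""
    s2 = 4.0 * sigma ** 2
    # max(..., 0) guards against a tiny negative value caused by rounding.
    return float(np.sqrt(s2 - np.sqrt(max(s2 ** 2 - 1.0, 0.0))))


def two_point_prefilter(X):
    """Boolean mask: False for points with ||X_i - c|| < (tau/2) ||X_i2 - X_i1||."""
    X = np.asarray(X, dtype=float)
    # Assumption: X_1 in (26) is the first row of X.
    i1 = int(np.argmax(np.linalg.norm(X - X[0], axis=1)))
    i2 = int(np.argmax(np.linalg.norm(X - X[i1], axis=1)))
    c = 0.5 * (X[i1] + X[i2])
    length = np.linalg.norm(X[i2] - X[i1])
    dist = np.linalg.norm(X - c, axis=1)
    if length == 0.0:                  # all points coincide: nothing to remove
        return np.ones(len(X), dtype=bool)
    sigma = dist.max() / length        # data-dependent sigma, always >= 1/2
    # Small relative slack so that rounding never removes a boundary point.
    threshold = 0.5 * tau(sigma) * length * (1.0 - 1e-12)
    return dist >= threshold
```

With $\sigma = 1/2$ the function `tau` returns 1, and with the worst-case value $\sigma = \sqrt{7}/2$ the factor $\tau/2$ is close to the constant 0.133974 appearing in (27).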


5. Computational results

Methods to be compared. In this section, we report the results of computational experiments comparing different methods for the construction of $\mathcal{B}^*(\mathcal{X}_n)$. The first one (henceforth QP) corresponds to the direct application of the QP solver of Matlab (the function qp.m) to the minimisation of (3), see Section 1. In the method QP0, we first eliminate inessential points using the sieve (14) for $\xi_2$ given by (26) and then apply the same QP solver.

The choice of $c_0$ in (3) is arbitrary, and $c_0 = c(\xi_u) = (1/n) \sum_{i=1}^n X_i$ seems natural. However, we found that $c_0$ has a significant influence on the computational time, and that taking $c_0$ outside the convex hull $\mathrm{Conv}(\mathcal{X}_n)$ of $\mathcal{X}_n$ generally yields a faster computation of the optimal solution. Note that, when $c_0 \notin \mathrm{Conv}(\mathcal{X}_n)$, for any $t \in \mathbb{R}$ there exists a $c \in \mathbb{R}^d$ satisfying the constraints (4) (and the set of such feasible $c$ is unbounded). On the other hand, no feasible $c$ exists for small enough $t$ when $c_0 \in \mathrm{Conv}(\mathcal{X}_n)$. In our computations we take $c_0 = 2 X_{i_a} - X_{i_b}$, where $i_a = \arg\max_{i=1,\dots,n} u^\top X_i$ and $i_b = \arg\min_{i=1,\dots,n} u^\top X_i$, with $u^\top = (1, 0, \dots, 0)$ (the choice of $u$ does not seem important). The QP solver is initialised at $(c(\xi_u), 0)$ (which is not necessarily feasible for (4)).
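The choice of $c_0$ described above can be sketched in a few lines of Python. The function name `exterior_reference_point` is hypothetical; the sketch only computes $c_0 = 2 X_{i_a} - X_{i_b}$ and does not set up the problem (3)-(4). Whenever the projections $u^\top X_i$ are not all equal, $u^\top c_0$ exceeds $\max_i u^\top X_i$, so that $c_0$ lies outside the convex hull.

```python
import numpy as np


def exterior_reference_point(X, u=None):
    """Return c0 = 2 X[ia] - X[ib], with ia/ib the extreme indices along u."""
    X = np.asarray(X, dtype=float)
    if u is None:
        # Default direction u = (1, 0, ..., 0), as in the text.
        u = np.zeros(X.shape[1])
        u[0] = 1.0
    proj = X @ u
    ia = int(np.argmax(proj))
    ib = int(np.argmin(proj))
    return 2.0 * X[ia] - X[ib]
```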

We also consider the iterative construction of a $(1+\varepsilon)$-approximation of $\mathcal{B}^*(\mathcal{X}_n)$, using Algorithms 1 and 2 (henceforth A1 and A2), both with $\varepsilon = 10^{-3}$. A1 and A2 do not eliminate any point. As noticed in Section 4.2, it is not very efficient to try to eliminate inessential points at each iteration of A1. Our experiments indicate that a suitable compromise between the computational cost of the elimination test and the benefit of reducing the dimension of $w$ is obtained when the sieve (14) is used about every 5 iterations of A1 or A2; the corresponding methods are denoted by A15 and A25, respectively. For each of them, inessential points are also eliminated at the initialisation, using (14) with $\xi_2$. A105 and A205 differ from A15 and A25 by the stopping rule only: they are stopped when a $(1+\varepsilon)$-approximation is obtained, or earlier if $n - 2d$ inessential points have already been eliminated. In case of early stopping, QP applied to the resulting 0-core set will thus have to deal with $2d$ constraints only (the value $2d$ is somewhat arbitrary, but seems reasonable for most situations since $\mathcal{B}^*(\mathcal{X}_n)$ has $d + 1$ points at most on its boundary when the $n$ points in $\mathcal{X}_n$ are in general position). In A105-QP and A205-QP we apply QP to the 0-core sets returned by A105 and A205, respectively. Finally, A2∗5 is similar to A25 but uses $\varepsilon = 10^{-6}$, and thus returns a $(1+\varepsilon)$-approximation very close to the exact $\mathcal{B}^*(\mathcal{X}_n)$ given by QP, QP0, A105-QP and A205-QP. We shall call these methods (including A2∗5) exact in what follows.

When using A1 or A2, points that are eliminated by (14) for the current measure $\xi^k$ may carry a positive weight $w_i^k$, and the weights of remaining points then need to be renormalised. Denote by $I_k$ the set of indices of those remaining points; following [11], we replace $w_i^k$ by $z_i^k / (\sum_{j=1}^n z_j^k)$, where $z_i^k = 0$ for $i \notin I_k$, $z_i^k = 1.1\, w_i^k$ if $\|X_i - c(w^k)\|^2 \ge \phi(w^k)$ and $z_i^k = w_i^k$ otherwise ($i \in I_k$ and $\|X_i - c(w^k)\|^2 < \phi(w^k)$).
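The renormalisation rule above can be sketched as follows. The function name `renormalise_weights` and its argument names are hypothetical; the current center $c(w^k)$, the value $\phi(w^k)$ and the boolean mask of remaining indices $I_k$ are assumed to be supplied by the caller.

```python
import numpy as np


def renormalise_weights(w, X, c, phi, keep, boost=1.1):
    """Weights z_i / sum_j z_j after elimination of the points with keep == False."""
    w = np.asarray(w, dtype=float)
    X = np.asarray(X, dtype=float)
    keep = np.asarray(keep, dtype=bool)
    d2 = np.sum((X - c) ** 2, axis=1)
    # z_i = 1.1 w_i when ||X_i - c||^2 >= phi, and z_i = w_i otherwise.
    z = np.where(d2 >= phi, boost * w, w)
    # z_i = 0 for eliminated points (i not in I_k).
    z = np.where(keep, z, 0.0)
    return z / z.sum()
```

The factor 1.1 gives slightly more weight to remaining points at or beyond squared distance $\phi(w^k)$ from the center before the weights are rescaled to sum to one.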

Measures of performance. The experiments were carried out on a PC with a clock speed of 2.50 GHz and 32 GB of RAM.

We first compare (Tables 1, 4 and 7) the effectiveness of the sieve (14) for the uniform measure $\xi_u$ used to initialise A1 and for $\xi_2$ given by (26): $\pi(\xi) = 1 - \alpha(\xi)$ gives the proportion of points that are not eliminated by $\xi$. To compare the efficiency of (14) with that of the bound (19) proposed in [2], we also give the value $\pi_{AY}(\xi_2)$ obtained when $b_{AY}[\phi(\xi_2), \gamma(\xi_2)]$ is used instead of $b[\phi(\xi_2), \gamma(\xi_2)]$. We also indicate the number


inessential points.

In Tables 2, 5 and 8, $N$ gives the number of iterations performed to reach the required precision $\varepsilon$ for A1, A15, A2, A25 ($\varepsilon = 10^{-3}$) and A2∗5 ($\varepsilon = 10^{-6}$), or to eliminate at least $n - 2d$ points for A105 and A205.

Finally, in Tables 3, 6 and 9 we compare the computational times of the different methods considered, with t(QP), the computational time of QP, taken as a reference: for each method M other than QP, with computational time t(M), we indicate the ratio ρ(M)=t(M)/t(QP).

n consecutive points of Sobol' low-discrepancy sequence in $[0, 1]^d$. Table 1 indicates that $\xi_2$ is much more effective than the uniform measure $\xi_u$ for eliminating points with (14) when $d$ is not too large, $d \lesssim 10$ say; one may note the good agreement between $\pi(\xi_u)$ and the theoretical value $\pi^* = 1 - [\pi d (1/4 - \sqrt{2}/6)]^{d/2} / \Gamma(d/2 + 1)$ ($d \le 17$) derived in Section 3.4. For $d$ between 3 and 10, $\pi_{AY}(\xi_2)$ is most often significantly larger than $\pi(\xi_2)$, which illustrates the superiority of the bound (14) over (19). The numbers of remaining points after running A15 or A25 are very close in most cases. Exceptions, like $n = 10^3$ and $n = 10^4$ for $d = 3$ and $n = 10^5$ for $d = 4$, correspond to situations where A25 is used for fewer than 5 iterations, so that inessential points are only eliminated once (at the initialisation), whereas A15 performs many more iterations, see Table 2 (when fewer than 5 iterations are performed, $\kappa = n\, \pi(\xi_2)$). As expected, $\kappa$(A2∗5) is smaller than $\kappa$(A25) in all cases, and Table 1 indicates that A2∗5 is able to provide small 0-core sets for the sets $\mathcal{X}_n$ considered.
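The theoretical value $\pi^*$ quoted above is easy to evaluate numerically; the short sketch below (the function name `pi_star` is hypothetical) reproduces the $\pi^*$ column of Table 1. The subtracted term has the form of the volume of a $d$-dimensional ball with squared radius $d(1/4 - \sqrt{2}/6)$, and the restriction $d \le 17$ stated in the text coincides with this radius not exceeding $1/2$; the code enforces it on that basis.

```python
import math


def pi_star(d):
    """pi* = 1 - [pi d (1/4 - sqrt(2)/6)]^(d/2) / Gamma(d/2 + 1), for d <= 17."""
    r2 = d * (0.25 - math.sqrt(2.0) / 6.0)   # squared radius in the formula
    if r2 > 0.25:
        raise ValueError("formula quoted for d <= 17 only")
    return 1.0 - (math.pi * r2) ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)


for d in (2, 3, 4, 5, 10):
    print(d, round(100.0 * pi_star(d), 2))
```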

Table 2 shows that the elimination of inessential points does not directly influence the number of iterations required to reach a given precision: N(A15) is often smaller than N(A1), but not always; the effect on A2 is limited. A15 systematically requires more (sometimes many more) iterations than A25 to reach the required precision $\varepsilon$, which can be related to the general observation that multiplicative algorithms tend to be slow close to the optimum. This is consistent with the observations that A105 sometimes requires significantly fewer iterations than A15, whereas N(A205) is close to N(A25) in all circumstances: A15 may have reached a $(1+\varepsilon')$-approximation, $\varepsilon' > \varepsilon$, close enough to the optimum to be able to eliminate many points, but may still require many iterations to reach a $(1+\varepsilon)$-approximation. The number of iterations of A2∗5 ($\varepsilon = 10^{-6}$) shows a great variability among the cases considered, and the large values obtained for $d = 2$, $n = 10^3$ and $n = 10^4$ may look surprising. However, they do not contradict the complexity bound N(A2) $< 18 + 50/\varepsilon$ of [28] and can be explained by the potential slow convergence of first-order methods close to the optimum. A simple example with $d = 2$ and $n = 4$ gives an illustration.

Take $\mathcal{X}_n = \{X_1, X_2, X_3, X_4\}$ with $X_1 = (1 - a, a)^\top$, $X_2 = (a, 1 - a)^\top$, $X_3 = (0, 0)^\top$ and $X_4 = (1, 1)^\top$, $a < 1/2$. When $a < 1/2 - \sqrt{3}/6$, then $\|X_1 - X_2\| > \|X_1 - X_3\|$, so that $i_1 = 2$ and $i_2 = 1$ in (26). The initial $w^0$ of A2 is thus $(1/2, 1/2, 0, 0)$, and A2 may require many iterations to reach precision $\varepsilon$ depending on the value of $a$. For instance, for $\varepsilon = 10^{-5}$, N(A2)=6252 when $a = 10^{-3}$ and N(A2)=62502 when $a = 10^{-4}$ (whereas N(A1)=7361 and N(A1)=1 for $a = 10^{-3}$ and $a = 10^{-4}$, respectively).
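The threshold on $a$ in this example can be checked by a direct computation from the coordinates of the four points (a worked verification, not taken from the original derivation):

$$\begin{aligned}
\|X_1 - X_2\|^2 &= 2(1 - 2a)^2 , \qquad \|X_1 - X_3\|^2 = \|X_1 - X_4\|^2 = (1 - a)^2 + a^2 , \\
\|X_1 - X_2\|^2 - \|X_1 - X_3\|^2 &= 6a^2 - 6a + 1 > 0 \iff a < \frac{1}{2} - \frac{\sqrt{3}}{6} \quad \text{(given } a < 1/2\text{)} .
\end{aligned}$$

Moreover, for $0 < a < 1/2$, the points $X_1$ and $X_2$ lie at squared distance $2(1/2 - a)^2 < 1/2$ from $(1/2, 1/2)^\top$, so that the smallest enclosing ball has the segment joining $X_3$ and $X_4$ as a diameter: the initial measure supported on $X_1$ and $X_2$ puts no weight on the two points that define the optimal ball.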

A noticeable observation from Table 3 is that a standard QP solver gives the solution in reasonable time if $n$ is not too big, even for rather large $d$. A105 (respectively, A205) is slightly faster than A15 (respectively, A25) since it is stopped earlier; the comparison with A1 (respectively, A2) shows that the elimination of points significantly accelerates convergence³. Since A15 and A25 only provide $(1+\varepsilon)$-approximations with $\varepsilon = 10^{-3}$,


Table 1. Sobol' sequence in $[0, 1]^d$: proportion $\pi$ (in %) of points not eliminated and number $\kappa$(M) of remaining points after applying method M.

d   n     π*     π(ξu)   π(ξ2)   πAY(ξ2)  κ(A15)  κ(A25)  κ(A2∗5)
2   10^3  91.02  91.0    0.4     0.4      4       4       4
2   10^4  91.02  91.04   0.04    0.04     4       4       4
2   10^5  91.02  91.03   0.004   0.004    4       4       4
3   10^3  96.28  96.66   1.3     2.40     5       13      4
3   10^4  96.28  96.34   1.23    3.96     11      123     4
3   10^5  96.28  96.30   0.060   0.136    45      60      3
4   10^3  98.39  98.30   17.2    44.8     8       7       3
4   10^4  98.39  98.41   0.02    0.02     2       2       2
4   10^5  98.39  98.39   2.318   10.359   32      2318    5
5   10^3  99.28  99.30   34.1    70.5     10      8       5
5   10^4  99.28  99.28   19.16   63.45    20      16      6
5   10^5  99.28  99.28   5.976   29.445   27      16      5
10  10^3  99.98  99.8    75.8    99.4     13      13      8
10  10^4  99.98  99.95   85.34   99.96    28      30      10
10  10^5  99.98  99.98   45.128  95.730   40      48      11
20  10^3  -      99.9    99.7    100.00   34      34      13
20  10^4  -      99.99   98.56   100.00   52      57      14
20  10^5  -      99.999  95.217  99.999   53      40      11
30  10^3  -      99.9    99.9    100.00   28      28      12
30  10^4  -      99.99   99.98   100.00   42      48      14
30  10^5  -      99.999  98.897  100.00   98      108     16
40  10^3  -      99.9    100.00  100.00   46      33      13
40  10^4  -      99.99   100.00  100.00   60      71      14
40  10^5  -      99.999  99.989  100.00   162     121     19
50  10^3  -      99.9    100.00  100.00   43      51      15
50  10^4  -      99.99   100.00  100.00   77      113     17
50  10^5  -      99.999  100.00  100.00   185     155     27

comparing their computational time with that of QP is unfair. A105-QP is sometimes faster than QP, but is always slower than A205-QP, which is often faster than QP and sometimes the fastest among the exact methods considered. A2∗5 is seldom the fastest among exact methods and is often much slower than QP. In this example, QP0 is faster than QP for $d \le 10$ and slightly slower when $d \ge 20$ (i.e., when few points are eliminated by $\xi_2$); it is frequently the fastest exact method when $d \le 5$.

n points i.i.d. $\mathcal{N}(0, I_d)$. Table 4 indicates that the elimination of inessential points is more efficient with A15 than with A25, and that both methods are able to provide small 0-core sets. For $d \lesssim 10$, $\pi_{AY}(\xi_2)$ is significantly larger than $\pi(\xi_2)$, confirming the superiority of the bound (14) over (19). Table 5 gives the same indications as Table 2: A105 sometimes requires significantly fewer iterations than A15, an indication of the slow convergence of the multiplicative algorithm near the optimum. Also, N(A15)>N(A25) and N(A105)>N(A205) in all cases. One may notice the large values of N(A2∗5). Table 6 shows that QP0 and A205-QP are often the fastest among exact methods, which is never the case for A2∗5. QP0 shows remarkably stable performance and is significantly faster than QP when $d \le 5$ (i.e., when the elimination of inessential points by $\xi_2$ is effective, see Table 4) and is only slightly slower than QP for $d \ge 10$. QP is the fastest exact method for $n$ small enough ($n \le 10^3$) when $d \ge 10$ and for all $n \ge 10^3$ when $d$ is large ($d \ge 40$).

n points i.i.d. uniformly in $\mathcal{B}_d(0, 1)$. This corresponds to a difficult situation for Algorithms 1 and 2, and due to the larger computational times required compared to previous examples we only consider $d \le 40$ (and $n \le 10^4$ for $d = 40$). Table 7 shows that


Table 2. Sobol' sequence in $[0, 1]^d$: number $N$ of iterations performed to reach precision $\varepsilon = 10^{-3}$ ($\varepsilon = 10^{-6}$ for A2∗5).

d   n     A1   A15  A105  A2  A25  A205  A2∗5
2   10^3  44   1    0     0   0    0     32263
2   10^4  173  1    0     0   0    0     32263
2   10^5  266  0    0     0   0    0     0
3   10^3  80   270  70    3   3    3     12
3   10^4  253  169  169   3   3    3     34
3   10^5  242  219  219   1   1    1     7
4   10^3  91   84   75    5   6    5     14
4   10^4  94   0    0     0   0    0     0
4   10^5  229  123  123   4   4    4     818
5   10^3  93   92   75    29  22   20    76
5   10^4  212  88   88    63  81   81    178
5   10^5  179  107  107   50  55   55    465
10  10^3  89   139  40    62  56   35    457
10  10^4  175  97   97    69  79   79    446
10  10^5  200  137  137   66  74   74    930
20  10^3  241  139  115   89  89   85    714
20  10^4  166  152  152   44  37   37    348
20  10^5  244  142  142   61  36   35    301
30  10^3  286  204  50    37  28   25    373
30  10^4  336  237  210   87  63   55    1007
30  10^5  342  222  222   76  66   66    959
40  10^3  206  117  80    28  26   15    132
40  10^4  115  99   90    60  56   50    311
40  10^5  359  188  188   76  45   45    744
50  10^3  153  103  60    56  44   30    336
50  10^4  191  154  125   56  54   54    617
50  10^5  266  143  143   93  79   79    1726

Table 3. Sobol' sequence in $[0, 1]^d$: computational time t(QP) (in s) and ratios $\rho$(M)=t(M)/t(QP), averaged over 10 repetitions. Italicized figures correspond to the fastest exact method.

d   n     t(QP)  QP0   A1     A15    A105  A2    A25   A205  A105-QP  A205-QP  A2∗5
2   10^3  0.006  0.40  2.13   0.27   0.22  0.16  0.14  0.14  0.37     0.29     864.1
2   10^4  0.030  0.08  3.27   0.08   0.07  0.04  0.05  0.05  0.10     0.08     164.1
2   10^5  0.27   0.06  4.69   0.07   0.07  0.04  0.06  0.06  0.08     0.06     0.06
3   10^3  0.004  0.49  4.11   11.05  3.20  0.32  0.30  0.30  3.43     0.59     0.70
3   10^4  0.029  0.10  5.19   1.08   1.07  0.10  0.08  0.08  1.10     0.12     0.27
3   10^5  0.28   0.06  4.43   0.20   0.20  0.06  0.07  0.07  0.20     0.07     0.07
4   10^3  0.004  0.56  4.75   4.23   3.99  0.44  0.49  0.45  4.25     0.68     0.83
4   10^4  0.029  0.08  1.96   0.09   0.08  0.05  0.06  0.06  0.11     0.09     0.06
4   10^5  0.32   0.10  4.32   0.17   0.17  0.11  0.07  0.07  0.18     0.10     0.45
5   10^3  0.005  0.92  4.06   3.80   3.20  1.30  1.03  0.97  3.48     1.25     2.81
5   10^4  0.031  0.28  4.22   0.67   0.66  1.20  0.52  0.53  0.70     0.57     1.04
5   10^5  0.32   0.14  4.08   0.20   0.20  1.05  0.12  0.12  0.21     0.13     0.31
10  10^3  0.007  0.97  3.18   4.12   1.64  2.28  1.79  1.24  2.05     1.63     11.6
10  10^4  0.040  0.95  3.61   0.93   0.93  1.36  0.49  0.49  1.01     0.57     1.94
10  10^5  0.40   0.57  7.03   0.64   0.64  2.16  0.26  0.26  0.65     0.27     0.58
20  10^3  0.011  1.06  6.18   3.02   2.62  2.21  1.73  1.65  3.21     2.20     10.9
20  10^4  0.059  1.14  6.17   1.76   1.77  1.70  0.47  0.48  1.88     0.58     1.33
20  10^5  0.55   1.15  12.71  1.99   2.01  3.00  0.50  0.50  2.02     0.51     0.58
30  10^3  0.015  1.08  5.66   2.78   1.05  0.76  0.52  0.49  1.65     1.06     3.96
30  10^4  0.075  1.18  15.65  2.30   2.26  4.02  0.73  0.72  2.38     0.84     2.70
30  10^5  0.74   1.22  23.86  2.80   2.80  5.05  0.93  0.93  2.82     0.94     1.12
40  10^3  0.021  1.05  3.38   1.56   1.23  0.52  0.42  0.31  1.93     0.94     1.19
40  10^4  0.092  1.16  6.52   2.58   2.58  3.28  0.99  0.98  2.78     1.16     1.46
40  10^5  0.92   1.28  27.89  4.06   4.05  5.68  1.04  1.03  4.07     1.06     1.14
50  10^3  0.030  1.04  2.02   1.04   0.77  0.71  0.45  0.36  1.49     1.06     1.98
50  10^4  0.12   1.19  11.88  3.21   3.18  3.44  1.13  1.14  3.42     1.36     1.93
50  10^5  1.19   1.24  21.67  4.24   4.23  7.31  1.14  1.14  4.26     1.17     1.36


Table 4. $X_i$ i.i.d. $\mathcal{N}(0, I_d)$: proportion $\pi$ (in %) of points not eliminated and number $\kappa$(M) of remaining points after applying method M (values averaged over 100 repetitions, rounded to the nearest integer).

d   n     π(ξu)   π(ξ2)   πAY(ξ2)  κ(A15)  κ(A25)  κ(A2∗5)
2   10^3  93.40   12.81   40.23    4       29      10
2   10^4  94.86   9.00    37.00    5       240     65
2   10^5  95.92   3.61    25.69    6       495     21
3   10^3  96.70   32.94   75.37    5       52      9
3   10^4  97.68   18.59   66.15    6       112     17
3   10^5  98.25   9.19    50.79    8       129     4
4   10^3  98.21   46.39   84.81    7       20      5
4   10^4  98.77   34.30   82.96    8       127     5
4   10^5  99.16   24.00   78.00    10      507     320
5   10^3  98.89   62.18   92.52    8       22      5
5   10^4  99.31   47.69   94.04    10      81      5
5   10^5  99.54   33.27   84.58    12      17      5
10  10^3  99.78   93.52   99.93    13      15      8
10  10^4  99.91   89.03   99.99    17      26      8
10  10^5  99.96   81.00   99.92    22      25      9
20  10^3  99.99   99.94   100.00   22      24      12
20  10^4  100.00  99.79   100.00   32      33      13
20  10^5  100.00  99.39   100.00   42      45      14
30  10^3  100.00  100.00  100.00   30      31      16
30  10^4  100.00  100.00  100.00   45      46      18
30  10^5  100.00  100.00  100.00   64      67      20
40  10^3  100.00  100.00  100.00   39      40      18
40  10^4  100.00  100.00  100.00   59      63      21
40  10^5  100.00  100.00  100.00   86      92      24
50  10^3  100.00  100.00  100.00   48      49      21
50  10^4  100.00  100.00  100.00   74      77      24
50  10^5  100.00  100.00  100.00   107     112     28

Table 5. $X_i$ i.i.d. $\mathcal{N}(0, I_d)$: number $N$ of iterations performed to reach precision $\varepsilon = 10^{-3}$ ($\varepsilon = 10^{-6}$ for A2∗5) (values averaged over 100 repetitions, rounded to the nearest integer).

d   n     A1   A15  A105  A2   A25  A205  A2∗5
2   10^3  80   126  95    27   23   22    91
2   10^4  84   288  187   38   36   36    291
2   10^5  114  132  123   40   35   35    74
3   10^3  84   98   55    36   33   26    86
3   10^4  99   83   68    48   40   37    92
3   10^5  112  107  100   51   46   45    170
4   10^3  107  87   59    46   40   34    385
4   10^4  110  86   66    47   40   37    142
4   10^5  127  90   82    74   63   61    288
5   10^3  104  82   54    50   46   37    191
5   10^4  122  93   74    64   52   47    199
5   10^5  136  100  91    88   74   71    333
10  10^3  125  93   52    59   49   36    320
10  10^4  163  112  91    78   62   56    349
10  10^5  175  124  116   86   67   64    465
20  10^3  162  121  57    64   56   35    334
20  10^4  194  136  94    87   71   58    602
20  10^5  222  156  144   99   82   78    754
30  10^3  169  124  53    72   62   37    465
30  10^4  228  153  110   92   81   65    813
30  10^5  249  168  155   117  94   89    1200
40  10^3  168  119  61    68   64   35    532
40  10^4  229  159  105   93   81   65    1054
40  10^5  280  182  170   114  96   93    1472
50  10^3  176  123  63    80   73   40    723
50  10^4  234  151  116   106  89   70    1171
50  10^5  284  177  169   117  100  95    1856


Table 6. $X_i$ i.i.d. $\mathcal{N}(0, I_d)$: computational time t(QP) (in s) and ratios $\rho$(M)=t(M)/t(QP), averaged over 100 repetitions. Italicized figures correspond to the fastest exact method.

d   n     t(QP)  QP0   A1     A15   A105  A2    A25   A205  A105-QP  A205-QP  A2∗5
2   10^3  0.005  0.52  4.01   5.98  4.66  1.43  1.24  1.16  4.87     1.39     4.15
2   10^4  0.030  0.18  1.61   1.60  1.15  0.71  0.27  0.27  1.18     0.32     1.71
2   10^5  0.27   0.09  1.79   0.16  0.15  0.58  0.08  0.08  0.16     0.09     0.10
3   10^3  0.004  0.73  4.16   4.61  2.72  1.87  1.65  1.37  2.95     1.66     3.69
3   10^4  0.029  0.28  2.03   0.61  0.53  0.92  0.31  0.30  0.56     0.34     0.59
3   10^5  0.29   0.16  2.18   0.17  0.17  0.85  0.10  0.10  0.17     0.11     0.17
4   10^3  0.005  0.83  4.89   3.74  2.69  2.14  1.80  1.60  2.95     1.88     14.28
4   10^4  0.030  0.43  2.18   0.65  0.54  0.88  0.32  0.31  0.58     0.36     0.84
4   10^5  0.30   0.31  2.54   0.20  0.20  1.33  0.13  0.13  0.20     0.14     0.24
5   10^3  0.004  0.95  4.83   3.70  2.62  2.37  2.04  1.74  2.93     2.08     7.14
5   10^4  0.032  0.56  2.43   0.70  0.61  1.20  0.40  0.37  0.66     0.42     1.06
5   10^5  0.32   0.42  3.06   0.25  0.25  1.79  0.16  0.16  0.26     0.16     0.28
10  10^3  0.006  1.10  4.58   3.11  1.99  2.16  1.66  1.40  2.43     1.84     8.18
10  10^4  0.039  0.96  3.41   0.85  0.76  1.57  0.45  0.42  0.83     0.49     1.55
10  10^5  0.40   0.92  6.19   0.63  0.63  2.77  0.30  0.30  0.64     0.31     0.46
20  10^3  0.010  1.08  4.23   2.51  1.44  1.70  1.23  0.88  2.00     1.43     5.35
20  10^4  0.061  1.12  6.87   1.12  1.01  3.01  0.58  0.54  1.12     0.65     1.89
20  10^5  0.58   1.16  10.94  1.21  1.21  4.45  0.51  0.52  1.22     0.53     0.69
30  10^3  0.019  1.06  2.78   1.51  0.83  1.17  0.82  0.56  1.40     1.11     4.06
30  10^4  0.081  1.12  9.63   1.36  1.27  3.83  0.75  0.72  1.42     0.87     2.15
30  10^5  0.77   1.19  16.42  1.75  1.75  7.18  0.80  0.79  1.77     0.81     1.02
40  10^3  0.028  1.04  2.09   1.09  0.69  0.83  0.61  0.42  1.37     1.09     3.24
40  10^4  0.10   1.14  11.30  1.63  1.54  4.33  0.90  0.88  1.77     1.09     2.38
40  10^5  0.97   1.20  20.39  2.15  2.15  7.83  1.05  1.04  2.18     1.06     1.27
50  10^3  0.038  1.04  1.79   0.89  0.60  0.77  0.55  0.38  1.34     1.10     3.21
50  10^4  0.13   1.16  12.71  1.84  1.78  5.54  1.11  1.10  2.05     1.36     2.46
50  10^5  1.20   1.22  22.43  2.48  2.49  8.74  1.33  1.33  2.52     1.36     1.56

$\pi_{AY}(\xi_2)$ is significantly larger than $\pi(\xi_2)$ for $d \lesssim 5$ only. As in Table 4, $\kappa$(A15)< $\kappa$(A25), but the figures are now much larger, indicating that the algorithms have difficulties in providing small 0-core sets. As a consequence, here A105 (respectively, A205) does not stop earlier than A15 (respectively, A25), and the results for A105 and A205 are omitted in Tables 8 and 9 since they are identical to those for A15 and A25. The number of iterations for given $d$ and $n$ in Table 8 is significantly larger than in Tables 2 and 5, with N(A15)>N(A25) for $d \le 10$ and N(A25) slightly larger than N(A15) for $d \ge 30$. The number of iterations of A2∗5 is now very large. Table 9 shows that QP0 is generally the fastest among exact methods for $d \le 5$ and is only slightly slower than QP for larger $d$. On the other hand, A205-QP is much slower than QP for $d \ge 10$ and A2∗5 is by far the slowest exact method in all cases considered.

Finally, one may notice that, for given $d$ and $n$, the computational times for QP (and thus for QP0) are almost identical in Tables 3 and 6 and are only increased by a small factor in Table 9, which reinforces the interest of using QP with elimination of inessential points to solve smallest enclosing ball problems with moderate $d$.

6. Conclusions

An inequality has been derived that makes it possible to remove inessential (interior) points during the computation of the smallest enclosing ball of a set of points. The inequality is, in some sense, the best possible, and is given by a simple expression depending on the mean and the (trace of the) variance matrix of a probability measure placed on the set of points. Any probability measure gives such an inequality. Algorithms for the


Table 7. $X_i$ uniform in $\mathcal{B}_d(0, 1)$: proportion $\pi$ (in %) of points not eliminated and number $\kappa$(M) of remaining points after applying method M (values averaged over 100 repetitions, rounded to the nearest integer).

d   n     π*     π(ξu)  π(ξ2)   πAY(ξ2)  κ(A15)  κ(A25)  κ(A2∗5)
2   10^3  86.60  87.04  36.00   52.84    67      87      9
2   10^4  86.60  86.78  24.82   36.35    644     978     35
2   10^5  86.60  86.66  17.45   25.34    6384    16627   235
3   10^3  91.06  91.61  63.29   83.69    100     112     9
3   10^4  91.06  91.26  49.12   67.89    948     1061    44
3   10^5  91.06  91.13  40.78   57.17    9399    17555   339
4   10^3  93.52  94.21  80.24   95.49    127     140     10
4   10^4  93.52  93.74  70.89   89.16    1229    1375    54
4   10^5  93.52  93.60  62.08   81.02    12245   15449   445
5   10^3  95.06  95.64  89.42   98.78    157     164     11
5   10^4  95.06  95.29  85.33   97.54    1522    1667    64
5   10^5  95.06  95.15  77.66   93.34    15173   17407   553
10  10^3  98.21  98.70  99.79   100.00   287     297     19
10  10^4  98.21  98.39  99.61   100.00   2795    2920    113
10  10^5  98.21  98.28  99.19   100.00   27925   29054   1080
20  10^3  99.54  99.74  100.00  100.00   488     499     39
20  10^4  99.54  99.64  100.00  100.00   4799    4859    221
20  10^5  99.54  99.58  100.00  100.00   47892   48562   2084
30  10^3  99.84  99.93  100.00  100.00   632     638     56
30  10^4  99.84  99.89  100.00  100.00   6237    6276    329
30  10^5  99.84  99.86  100.00  100.00   62624   62577   3053
40  10^3  99.93  99.98  100.00  100.00   738     740     72
40  10^4  99.93  99.96  100.00  100.00   7286    7284    433

solution of the dual problem construct sequences of probability measures (defined by the Lagrange coefficients), which can thus be used in a straightforward manner to eliminate inessential points progressively. A two-point measure $\xi_2$, already proposed in the literature to initialise such dual algorithms efficiently [28], has been shown to remove directly a significant proportion of points in various situations with reasonably small dimension $d$. Several numerical experiments have indicated that this simple pre-filtering of the input set is clearly beneficial to a QP solver when enough inessential points are removed ($d$ is small enough) and that the extra cost (slow-down factor) due to pre-filtering is marginal otherwise (for large $d$). Other methods, like those in [9, 26]⁴, might also benefit from the input-size reduction offered by this pre-filtering. Notice, finally, that these methods rely on the computation of a sequence of smallest enclosing balls for sets of $d + 1$ points, from which a sequence of probability measures, and thus of eliminating inequalities, could easily be deduced; see [9, Sect. 3].

Acknowledgments

The author thanks the two referees for their comments that helped to improve the presentation of the paper. He also thanks the referee who pointed out the existence of reference [2].

⁴ See also the implementation in http://doc.cgal.org/latest/Bounding_volumes/classCGAL_1_1Min_sphere__d.html

