Problem

Nguyen Kim Thang
IBISC, Univ Evry, University Paris Saclay
Evry, France
[email protected]
https://orcid.org/0000-0002-6085-9453

Abstract

In the subspace approximation problem, given $m$ points in $\mathbb{R}^n$ and an integer $k \le n$, the goal is to find a $k$-dimensional subspace of $\mathbb{R}^n$ that minimizes the $\ell_p$-norm of the Euclidean distances to the given points. This problem generalizes several subspace approximation problems and has applications ranging from statistics, machine learning and signal processing to biology. Deshpande et al. [4] gave a randomized $O(\sqrt{p})$-approximation, and this bound was proved to be tight assuming NP $\neq$ P by Guruswami et al. [7]. It is an intriguing question to determine the performance guarantee of deterministic algorithms for the problem. In this paper, we present a simple deterministic $O(\sqrt{p})$-approximation algorithm, also with a simple analysis. This settles the status of the problem in terms of approximation up to a constant factor. Besides, the simplicity of the algorithm makes it practically appealing.

2012 ACM Subject Classification Theory of computation → Approximation algorithms analysis

Keywords and phrases Approximation Algorithms, Subspace Approximation

Digital Object Identifier 10.4230/LIPIcs.SWAT.2018.30

Funding Research supported by the ANR project OATA no. ANR-15-CE40-0015-01.

1 Introduction

Massive data in high dimension emerge naturally in many domains, from machine learning to biology. It has been observed that although data lie in high-dimensional spaces, in practice they have low intrinsic dimension. Dimension-reduction algorithms are essential in many domains such as image processing and personalized medicine. In this paper, we consider the following subspace approximation problem in the context of capturing the underlying low-dimensional structures of given data.

Subspace Problem. Given points $a_1, \ldots, a_m \in \mathbb{R}^n$ and integers $p \ge 1$ and $0 \le k \le n$, find a $k$-dimensional linear subspace $W$ that minimizes the $\ell_p$-norm of the Euclidean distances of these points to $W$, i.e.,

$$\min_{W : \dim(W) = k} \left( \sum_{i=1}^{m} d(a_i, W)^p \right)^{1/p}$$
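To make the objective concrete, the following minimal sketch evaluates it for one fixed candidate subspace. It assumes the subspace is given by an orthonormal basis; the function names `distance_to_subspace` and `subspace_cost` and the sample points are illustrative choices, not taken from the paper.

```python
import math

def distance_to_subspace(a, basis):
    """Euclidean distance from point a to span(basis); basis must be orthonormal."""
    # Pythagoras: ||a||^2 = ||projection of a onto W||^2 + d(a, W)^2.
    squared_norm = sum(x * x for x in a)
    squared_proj = sum(sum(x * w for x, w in zip(a, vec)) ** 2 for vec in basis)
    # max(...) guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, squared_norm - squared_proj))

def subspace_cost(points, basis, p):
    """l_p-norm of the distances from the points to span(basis)."""
    return sum(distance_to_subspace(a, basis) ** p for a in points) ** (1.0 / p)

points = [(3.0, 4.0), (0.0, 2.0)]
x_axis = [(1.0, 0.0)]  # orthonormal basis of a 1-dimensional subspace of R^2
cost_p1 = subspace_cost(points, x_axis, 1)  # the two distances are 4 and 2
cost_p2 = subspace_cost(points, x_axis, 2)
```

The Subspace Problem asks for the basis that makes this value smallest; the sketch only scores a given candidate.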

The Subspace Problem, introduced by Deshpande et al. [4], is a generalization of several subspace approximation problems that have been widely studied. For example, the well-known Least Square Fit Problem is a particular case. In the latter, given a matrix $A \in \mathbb{R}^{m \times n}$ and $0 \le k \le n$, the goal is to find a matrix $B \in \mathbb{R}^{m \times n}$ of rank at most $k$ that minimizes the Frobenius norm of the difference, $\|A - B\|_F := \left(\sum_{i,j} (A_{ij} - B_{ij})^2\right)^{1/2}$. Taking the rows of $A$ to be $a_1, \ldots, a_m$ and $p = 2$, the Subspace Problem reduces to the Least Square Fit Problem. Another special case of the Subspace Problem is the Radii Problem. In the latter, given points $a_1, \ldots, a_m \in \mathbb{R}^n$, their outer $(n-k)$-radius is the minimum, over all $k$-dimensional linear subspaces, of the maximum Euclidean distance of these points to the subspace. This problem is equivalent to the Subspace Problem with $p = \infty$. Moreover, the Subspace Problem is related to other problems such as the $L_p$-Grothendieck Problem [8] and the $\ell_p$-Regression Problem [1, 5, 3]. We refer the reader to [4] for more details about the connection between these problems.

© Nguyen Kim Thang; licensed under Creative Commons License CC-BY
16th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT 2018).
Editor: David Eppstein; Article No. 30; pp. 30:1–30:7
Leibniz International Proceedings in Informatics
Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany

Deshpande et al. [4] introduced the Subspace Problem and gave a randomized $O(\sqrt{p})$-approximation algorithm. In their approach, they consider a convex relaxation that optimizes over positive semidefinite matrices, in which the rank constraint is replaced by a trace constraint. Subsequently, the solution $X$ of the convex relaxation is rounded to a matrix of suitable rank. Intuitively, the authors divide the singular vectors of the solution $X$ into several bins and construct one vector for each bin by taking a Bernoulli random linear combination of the vectors within that bin. The analysis is carried out by powerful techniques coupled with properties of the $p$th moment of sums of Bernoulli random variables. Besides, Deshpande et al. [4] proved that the Subspace Problem is hard to approximate within a factor $\Omega(\sqrt{p})$ assuming the Unique Games Conjecture (UGC). Later on, bypassing the need for UGC, Guruswami et al. [7] showed that the problem is indeed NP-hard to approximate within a factor $\Omega(\sqrt{p})$.

Our Contribution. In this paper, we present a deterministic greedy algorithm with the same $O(\sqrt{p})$-approximation guarantee. Informally, at every step, the algorithm greedily extends the subspace so as to minimize the marginal cost of the objective function. The algorithm (and also the analysis) is extremely simple, which makes it practically appealing. Besides, our algorithm is deterministic whereas the one in [4] is randomized. The analysis is based on a smooth inequality (Lemma 2), which was originally used in the context of algorithmic game theory in order to bound the quality of equilibria (price of anarchy) in scheduling games [2]. This allows us to give a deterministic algorithm instead of the randomized one [4], which crucially relies on concentration inequalities in functional analysis in order to bound the moments of sums of Bernoulli random variables. Our result settles the status of the problem in terms of approximation up to a constant factor.

Related works. The work most closely related to ours is [4], whose results have been summarized above. For the Least Square Fit Problem, the optimal subspace is spanned by the top $k$ right singular vectors of $A$, which can be computed in time $O(\min\{n^2 m, n m^2\})$ [6]. For the Radii Problem, an $O(\sqrt{\log m})$-approximation with $k = n-1$ is implied by the works of Nesterov [10] and Nemirovski et al. [9]. Later on, Varadarajan et al. [11] gave an $O(\sqrt{\log m})$-approximation algorithm for arbitrary $k$. Note that it is well known that the $\ell_\infty$-norm can be approximated by the $\ell_{\log m}$-norm up to a constant factor. Hence, an $O(\sqrt{\log m})$-approximation can be deduced from [4] (and so from our work) by choosing $p = \log m$.
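The norm comparison used in the last step can be checked numerically. For a vector $x$ with $m$ entries, $\|x\|_\infty \le \|x\|_p \le m^{1/p}\|x\|_\infty$, and for $p = \ln m$ the factor $m^{1/p}$ equals $e$. The sketch below assumes the natural logarithm (the text only writes "log") and uses random stand-in distances; the helper `p_norm` is a hypothetical name.

```python
import math
import random

def p_norm(values, p):
    """l_p-norm of non-negative numbers, rescaled by the maximum to avoid overflow."""
    top = max(values)
    if top == 0:
        return 0.0
    return top * sum((v / top) ** p for v in values) ** (1.0 / p)

rng = random.Random(0)
m = 1000
p = math.log(m)  # natural logarithm: an assumption, the text only says "log m"
distances = [rng.random() for _ in range(m)]  # stand-ins for d(a_i, W)

linf = max(distances)
lp = p_norm(distances, p)
# Sandwich: ||x||_inf <= ||x||_p <= m^(1/p) * ||x||_inf, and m^(1/ln m) = e.
ratio = lp / linf
```

So with this choice of $p$ the two objectives differ by a factor of at most $e$, which is the constant factor referred to above.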

2 Greedy Algorithm

As in [4], we use a formulation of the Subspace Problem in terms of the orthogonal complement of the subspace $W$. Specifically, let $z_1, \ldots, z_{n-k}$ be an orthonormal basis of the orthogonal complement. Let $Z \in \mathbb{R}^{n \times (n-k)}$ be the matrix whose $j$th column vector is $z_j$. Then $d(a_i, W) = \|a_i^T Z\|_2$. Hence, the problem is to find an orthonormal basis $z_1, \ldots, z_{n-k}$ of an $(n-k)$-dimensional vector space $\mathcal{V}$ so that the corresponding matrix $Z$ minimizes

$$\sum_{i=1}^{m} \|a_i^T Z\|_2^p = \sum_{i=1}^{m} \left( \sum_{\ell=1}^{n-k} (a_i^T z_\ell)^2 \right)^{p/2} = \sum_{i=1}^{m} \sum_{j=1}^{n-k} \left[ \left( \sum_{\ell=1}^{j} (a_i^T z_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a_i^T z_\ell)^2 \right)^{p/2} \right],$$

where, by convention, a sum with no terms equals 0.
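The second equality is a telescoping sum: for each point, consecutive terms cancel and only the full sum raised to the power $p/2$ remains. A small sketch, with made-up values standing in for the squared projections $(a_i^T z_\ell)^2$ of a single point, checks this numerically.

```python
import math

def telescoped_sum(squares, p):
    """Sum over j of (c_1+...+c_j)^(p/2) - (c_1+...+c_{j-1})^(p/2)."""
    total, prefix = 0.0, 0.0
    for c in squares:
        total += (prefix + c) ** (p / 2) - prefix ** (p / 2)
        prefix += c  # prefix now holds c_1 + ... + c_j
    return total

squares = [0.5, 2.0, 1.25, 0.0, 3.0]  # illustrative values of (a_i^T z_l)^2
p = 3
direct = sum(squares) ** (p / 2)         # left-hand form for one point
step_by_step = telescoped_sum(squares, p)  # sum of marginal increases
```

Writing the cost as a sum of marginal increases is what lets the algorithm below build the basis one vector at a time.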

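Algorithm. Initially, the subspace $U_0 = \emptyset$. For $1 \le j \le n-k$, choose a unit vector $u_j$ orthogonal to the subspace $U_{j-1}$ spanned by $u_1, \ldots, u_{j-1}$ such that it minimizes the marginal increase of the objective, i.e.,

$$u_j \in \arg\min_{z \perp U_{j-1}} \sum_{i=1}^{m} \left[ \left( (a_i^T z)^2 + \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} \right]$$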
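The loop structure of the greedy rule can be sketched as follows. This is only an illustration under a strong simplification: the inner minimization is replaced by a crude search over randomly sampled unit directions orthogonal to the vectors already chosen, so it approximates the argmin rather than solving it. All names (`greedy_complement`, `marginal_cost`, `project_out`, `samples`, `seed`) are hypothetical.

```python
import math
import random

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def project_out(z, basis):
    """Remove from z its components along the orthonormal vectors in basis."""
    for u in basis:
        c = dot(z, u)
        z = [zi - c * ui for zi, ui in zip(z, u)]
    return z

def marginal_cost(points, chosen, z, p):
    """Increase of the objective when direction z is added to the chosen ones."""
    total = 0.0
    for a in points:
        s = sum(dot(a, u) ** 2 for u in chosen)
        total += (dot(a, z) ** 2 + s) ** (p / 2) - s ** (p / 2)
    return total

def greedy_complement(points, k, p, samples=2000, seed=0):
    """Greedily pick n-k orthonormal vectors u_1, ..., u_{n-k}.

    Assumption: the argmin at each step is approximated by sampling random
    candidate directions, not computed exactly.
    """
    rng = random.Random(seed)
    n = len(points[0])
    chosen = []
    for _ in range(n - k):
        best, best_cost = None, math.inf
        for _ in range(samples):
            z = project_out([rng.gauss(0.0, 1.0) for _ in range(n)], chosen)
            norm = math.sqrt(dot(z, z))
            if norm < 1e-12:
                continue  # degenerate sample, skip it
            z = [zi / norm for zi in z]
            cost = marginal_cost(points, chosen, z, p)
            if cost < best_cost:
                best, best_cost = z, cost
        chosen.append(best)
    return chosen

# Points on the x-axis of R^3: the best 1-dimensional subspace W is the x-axis,
# so the chosen vectors (a basis of the complement) should be nearly orthogonal to it.
points = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-3.0, 0.0, 0.0)]
basis = greedy_complement(points, k=1, p=3)
```

The sampled search carries no approximation guarantee of its own; it is there only to show how each step reuses the vectors chosen so far.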

In fact, the vector $u_j$ can be computed by solving a convex program. Specifically, let $\{e_1, \ldots, e_{n-j+1}\}$ be an arbitrary orthonormal basis of the vector space $U_{j-1}^{\perp}$ (the orthogonal complement of $U_{j-1}$). Computing $u_j$ is equivalent to computing the coefficients $b_1, \ldots, b_{n-j+1}$ in the decomposition of $u_j$ in the basis $\{e_1, \ldots, e_{n-j+1}\}$. The convex program is

$$\min_{b_1, \ldots, b_{n-j+1}} \ \sum_{i=1}^{m} \left[ \left( \Big( a_i^T \cdot \sum_{h=1}^{n-j+1} b_h e_h \Big)^2 + \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} \right]$$

$$\text{subject to} \quad \sum_{h=1}^{n-j+1} b_h^2 = 1, \qquad b_1, \ldots, b_{n-j+1} \in \mathbb{R}.$$

Note that in the convex program, the variables are $b_1, \ldots, b_{n-j+1}$ (the vectors $u_\ell$ and $e_h$ have already been determined).

Analysis. Let $V$ be an optimal $n \times (n-k)$ matrix and let $\mathcal{V}$ be the corresponding vector space spanned by the column vectors of $V$. In the remainder, we show that

$$\left( \sum_{i=1}^{m} \|a_i^T U\|_2^p \right)^{1/p} \le O(\sqrt{p}) \cdot \left( \sum_{i=1}^{m} \|a_i^T V\|_2^p \right)^{1/p}.$$

First, we recall the following standard lemma.

Lemma 1. For any $(n-k)$-dimensional subspace $\mathcal{V}$, there exists an orthonormal basis $\{v_1, \ldots, v_{n-k}\}$ of $\mathcal{V}$ such that $v_j$ is orthogonal to $U_{j-1}$ for $1 \le j \le n-k$.

Proof. We construct an orthogonal basis $\{v_1, \ldots, v_{n-k}\}$ of $\mathcal{V}$ by induction. The orthonormal basis is then obtained by the standard normalizing procedure. For $j = 1$, any vector $v_1 \in \mathcal{V}$ is perpendicular to $U_0$. Assume that vectors $v_1, \ldots, v_j$ for $j < n-k$ have been constructed so that they satisfy the lemma. Since the subspace $\mathcal{V}$ has dimension $n-k$, which is strictly larger than $j$, there exists a vector $w_{j+1} \in \mathcal{V}$ which is independent of $v_1, \ldots, v_j$. Define the vector $v_{j+1} := w_{j+1} - \mathrm{Pr}_{U_j}(w_{j+1})$, where $\mathrm{Pr}_{U_j}(w_{j+1})$ is the projection of the vector $w_{j+1}$ onto the subspace $U_j$. So $v_{j+1}$ is orthogonal to $U_j$ and is independent of $v_1, \ldots, v_j$. ∎

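Lemma 2. For any given vector $a$ and for arbitrary vectors $u_\ell$ and $v_\ell$ with $1 \le \ell \le n-k$, it holds that

$$\sum_{j=1}^{n-k} \left[ \left( (a^T v_j)^2 + \sum_{\ell=1}^{j-1} (a^T u_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a^T u_\ell)^2 \right)^{p/2} \right] \le \mu \left( \sum_{\ell=1}^{n-k} (a^T u_\ell)^2 \right)^{p/2} + \lambda \left( \sum_{\ell=1}^{n-k} (a^T v_\ell)^2 \right)^{p/2},$$

where $\lambda = O\left( \left( \alpha \cdot \frac{p}{2} \right)^{\frac{p}{2} - 1} \right)$ for some constant $\alpha$ and $\mu = \frac{p/2 - 1}{p/2}$.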

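Proof. Denote $b_j = (a^T v_j)^2$ and $c_j = (a^T u_j)^2$ for $1 \le j \le n-k$. The inequality of the lemma reads

$$\sum_{j=1}^{n-k} \left[ \left( b_j + \sum_{\ell=1}^{j-1} c_\ell \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} c_\ell \right)^{p/2} \right] \le \mu \left( \sum_{\ell=1}^{n-k} c_\ell \right)^{p/2} + \lambda \left( \sum_{\ell=1}^{n-k} b_\ell \right)^{p/2}.$$

The inequality holds for $\lambda = O\left( \left( \alpha \cdot \frac{p}{2} \right)^{\frac{p}{2} - 1} \right)$ for some constant $\alpha$ and $\mu = \frac{p/2 - 1}{p/2}$, as proved in [2] in the context of algorithmic game theory. For completeness, we give the proof of the above inequality in the appendix (Lemma 5). ∎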
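For intuition, the inequality can be tested numerically in the smallest non-trivial case $p = 4$, where $\mu = 1/2$. The constant $\lambda = 3$ used below is not taken from the paper; it comes from a short side calculation: writing $B = \sum_\ell b_\ell$ and $C = \sum_\ell c_\ell$, the left-hand side is at most $B^2 + 2BC$, and $2BC \le \frac{1}{2}C^2 + 2B^2$. The function names and the random test data are illustrative.

```python
import random

def smooth_lhs(b, c, half_p):
    """Sum over j of (b_j + c_1+...+c_{j-1})^(p/2) - (c_1+...+c_{j-1})^(p/2)."""
    total, prefix = 0.0, 0.0
    for b_j, c_j in zip(b, c):
        total += (b_j + prefix) ** half_p - prefix ** half_p
        prefix += c_j  # prefix now holds c_1 + ... + c_j
    return total

def holds(b, c, lam, mu, half_p):
    """Check lhs <= mu * (sum c)^(p/2) + lam * (sum b)^(p/2), with rounding slack."""
    rhs = mu * sum(c) ** half_p + lam * sum(b) ** half_p
    return smooth_lhs(b, c, half_p) <= rhs * (1 + 1e-12) + 1e-12

rng = random.Random(1)
# p = 4, so p/2 = 2 and mu = (p/2 - 1)/(p/2) = 1/2; lam = 3 is our own constant.
all_hold = all(
    holds([rng.uniform(0, 5) for _ in range(6)],
          [rng.uniform(0, 5) for _ in range(6)],
          lam=3.0, mu=0.5, half_p=2)
    for _ in range(1000)
)
```

A passing check on random inputs is of course no proof; the general statement is Lemma 5 in the appendix.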

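Theorem 3. The greedy algorithm is an $O(\sqrt{p})$-approximation.

Proof. Recall that $U$ is the solution of the algorithm, where the $j$th column vector is $u_j$ for $1 \le j \le n-k$. Let $v_1, \ldots, v_{n-k}$ be an orthonormal basis of $\mathcal{V}$ that satisfies Lemma 1. We have

$$\begin{aligned}
\sum_{i=1}^{m} \|a_i^T U\|_2^p
&= \sum_{i=1}^{m} \sum_{j=1}^{n-k} \left[ \left( \sum_{\ell=1}^{j} (a_i^T u_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} \right] \\
&\le \sum_{i=1}^{m} \sum_{j=1}^{n-k} \left[ \left( (a_i^T v_j)^2 + \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} - \left( \sum_{\ell=1}^{j-1} (a_i^T u_\ell)^2 \right)^{p/2} \right] \\
&\le \sum_{i=1}^{m} \left[ \mu \left( \sum_{\ell=1}^{n-k} (a_i^T u_\ell)^2 \right)^{p/2} + \lambda \left( \sum_{\ell=1}^{n-k} (a_i^T v_\ell)^2 \right)^{p/2} \right] \\
&= \mu \sum_{i=1}^{m} \|a_i^T U\|_2^p + \lambda \sum_{i=1}^{m} \|a_i^T V\|_2^p .
\end{aligned}$$

The first inequality is due to the choice of the algorithm at every step $j$ (note that $v_j \perp U_{j-1}$, so $v_j$ is a candidate at step $j$). The second inequality holds by Lemma 2, where $\lambda = O\left( \left( \alpha \cdot \frac{p}{2} \right)^{\frac{p}{2} - 1} \right)$ for some constant $\alpha$ and $\mu = \frac{p/2 - 1}{p/2}$. Rearranging the terms and taking the $p$th root, we get

$$\left( \sum_{i=1}^{m} \|a_i^T U\|_2^p \right)^{1/p} \le O(\sqrt{p}) \cdot \left( \sum_{i=1}^{m} \|a_i^T V\|_2^p \right)^{1/p}.$$

∎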
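The final rearrangement can be spelled out as follows. This is a worked version of the last step, assuming $p \ge 2$ and writing $C$ for the constant hidden in the $O(\cdot)$ bound on $\lambda$ (the name $C$ is ours, not the paper's).

```latex
% Step 1: move the \mu-term to the left-hand side; note 1 - \mu = 2/p.
(1-\mu)\sum_{i=1}^{m}\|a_i^T U\|_2^p \;\le\; \lambda\sum_{i=1}^{m}\|a_i^T V\|_2^p,
\qquad 1-\mu \;=\; 1-\frac{p/2-1}{p/2} \;=\; \frac{2}{p}.

% Step 2: divide by 1 - \mu and take the p-th root.
\left(\sum_{i=1}^{m}\|a_i^T U\|_2^p\right)^{1/p}
\;\le\; \left(\frac{p}{2}\,\lambda\right)^{1/p}
\left(\sum_{i=1}^{m}\|a_i^T V\|_2^p\right)^{1/p}.

% Step 3: plug in \lambda \le C\,(\alpha p/2)^{p/2-1}.
\left(\frac{p}{2}\,\lambda\right)^{1/p}
\;\le\; C^{1/p}\,\alpha^{\frac{1}{2}-\frac{1}{p}}\left(\frac{p}{2}\right)^{1/2}
\;\le\; \max(1,C)\,\max(1,\sqrt{\alpha})\,\sqrt{p/2}
\;=\; O(\sqrt{p}).
```

The exponent of $p/2$ is exactly $1/2$ because the factor $p/2$ coming from $1/(1-\mu)$ combines with $(p/2)^{p/2-1}$ to give $(p/2)^{p/2}$ before the $p$th root is taken.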

References

1 Kenneth L. Clarkson. Subgradient and sampling algorithms for $\ell_1$ regression. In Proc. 16th Symposium on Discrete Algorithms, pages 257–266, 2005.

2 Johanne Cohen, Christoph Dürr, and Nguyen Kim Thang. Smooth inequalities and equilibrium inefficiency in scheduling games. In Internet and Network Economics, pages 350–363, 2012.

3 Anirban Dasgupta, Petros Drineas, Boulos Harb, Ravi Kumar, and Michael W. Mahoney. Sampling algorithms and coresets for $\ell_p$ regression. SIAM Journal on Computing, 38(5):2060–2078, 2009.

4 Amit Deshpande, Madhur Tulsiani, and Nisheeth K. Vishnoi. Algorithms and hardness for subspace approximation. In Proc. 22nd Symposium on Discrete Algorithms, pages 482–496, 2011.

5 Petros Drineas, Michael W. Mahoney, and S. Muthukrishnan. Sampling algorithms for $\ell_2$ regression and applications. In Proc. 17th Symposium on Discrete Algorithms, pages 1127–1136, 2006.

6 Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins University Press, 2012.

7 Venkatesan Guruswami, Prasad Raghavendra, Rishi Saket, and Yi Wu. Bypassing UGC from some optimal geometric inapproximability results. ACM Transactions on Algorithms, 12(1):6, 2016.

8 Guy Kindler, Assaf Naor, and Gideon Schechtman. The UGC hardness threshold of the $L_p$ Grothendieck problem. Mathematics of Operations Research, 35(2):267–283, 2010.

9 Arkadi Nemirovski, Cornelis Roos, and Tamás Terlaky. On maximization of quadratic form over intersection of ellipsoids with common center. Mathematical Programming, 86(3):463–473, 1999.

10 Yuri Nesterov. Global quadratic optimization via conic relaxation. Université Catholique de Louvain. Center for Operations Research and Econometrics [CORE], 1998.

11 Kasturi Varadarajan, Srinivasan Venkatesh, Yinyu Ye, and Jiawei Zhang. Approximating the radii of point sets. SIAM Journal on Computing, 36(6):1764–1776, 2007.

Appendix: Technical Lemmas

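In this section, we prove the technical lemmas. The following lemma was proved in [2]. We give it here for completeness.

Lemma 4 ([2]). Let $k$ be a positive integer. Let $0 < a(k) \le 1$ be a function of $k$. Then, for any $x, y > 0$, it holds that

$$y (x+y)^k \le \frac{k}{k+1}\, a(k)\, x^{k+1} + b(k)\, y^{k+1},$$

where $\alpha$ is some constant and

$$b(k) = \begin{cases}
\Theta\left( \alpha^k \cdot \left( \dfrac{k}{\log k a(k)} \right)^{k-1} \right) & \text{if } \lim_{k \to \infty} (k-1) a(k) = \infty, & \text{(1a)} \\[2ex]
\Theta\left( \alpha^k \cdot k^{k-1} \right) & \text{if } (k-1) a(k) \text{ are bounded } \forall k, & \text{(1b)} \\[2ex]
\Theta\left( \alpha^k \cdot \dfrac{1}{k\, a(k)^k} \right) & \text{if } \lim_{k \to \infty} (k-1) a(k) = 0. & \text{(1c)}
\end{cases}$$

Proof. Let $f(z) := \frac{k}{k+1}\, a(k)\, z^{k+1} - (1+z)^k + b(k)$. To show the claim, it is equivalent to prove that $f(z) \ge 0$ for all $z > 0$.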

We have $f'(z) = k\, a(k)\, z^k - k (1+z)^{k-1}$. We claim that the equation $f'(z) = 0$ has a unique positive root $z_0$. Consider the equation $f'(z) = 0$ for $z > 0$. It is equivalent to

$$\left( \frac{1}{z} + 1 \right)^{k-1} \cdot \frac{1}{z} = a(k).$$

The left-hand side is a strictly decreasing function of $z$, and its limits when $z$ tends to $0$ and $\infty$ are $\infty$ and $0$, respectively. As $a(k)$ is a positive constant, there exists a unique root $z_0 > 0$. Observe that the function $f$ is decreasing on $(0, z_0)$ and increasing on $(z_0, +\infty)$, so $f(z) \ge f(z_0)$ for all $z > 0$. Hence, by choosing

$$b(k) = \left| \frac{k}{k+1}\, a(k)\, z_0^{k+1} - (1+z_0)^k \right| = (1+z_0)^{k-1} \left( 1 + \frac{z_0}{k+1} \right) \qquad (2)$$

it follows that $f(z) \ge 0$ for all $z > 0$.
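The choice of $b(k)$ in (2) can be checked numerically. The sketch below finds $z_0$ by bisection, computes $b(k)$ from (2), and tests the inequality of Lemma 4 on a small grid. The function names, the grid and the choice $a(k) = 1/(k+1)$ are illustrative assumptions; a small relative slack absorbs rounding, since the inequality is tight at $x = z_0 y$.

```python
def positive_root(a, k):
    """Positive root z0 of a*z^k - (1+z)^(k-1) = 0, found by bisection."""
    def h(z):
        return a * z ** k - (1 + z) ** (k - 1)
    lo, hi = 0.0, 1.0
    while h(hi) < 0:  # z^k eventually dominates, so h becomes positive
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def b_constant(a, k):
    """b(k) as given by equation (2)."""
    z0 = positive_root(a, k)
    return (1 + z0) ** (k - 1) * (1 + z0 / (k + 1))

def lemma4_holds(x, y, a, k, b):
    lhs = y * (x + y) ** k
    rhs = k / (k + 1) * a * x ** (k + 1) + b * y ** (k + 1)
    return lhs <= rhs * (1 + 1e-9)

grid = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0, 50.0]
results = []
for k in range(1, 7):
    a = 1.0 / (k + 1)  # the value of a(k) used later in Lemma 5
    b = b_constant(a, k)
    z0 = positive_root(a, k)
    results.append(all(lemma4_holds(x, y, a, k, b) for x in grid for y in grid))
    results.append(lemma4_holds(z0, 1.0, a, k, b))  # the tight point x = z0 * y
```

For $k = 1$ and $a(1) = 1/2$ the root is $z_0 = 2$ and (2) gives $b(1) = 2$, in which case the inequality reduces to $0 \le (x/2 - y)^2$.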

We study the positive root $z_0$ of the equation

$$a(k)\, z^k - (1+z)^{k-1} = 0 \qquad (3)$$

Note that $f'(1) = k \left( a(k) - 2^{k-1} \right) < 0$ since $0 < a(k) \le 1$. Thus, $z_0 > 1$. For the sake of simplicity, we define the function $g(k)$ such that $z_0 = \frac{k-1}{g(k)}$, where $0 < g(k) < k-1$. Equation (3) is equivalent to

$$\left( 1 + \frac{g(k)}{k-1} \right)^{k-1} g(k) = (k-1)\, a(k).$$

Note that $e^{w/2} < 1 + w < e^{w}$ for $w \in (0,1)$. For $w := \frac{g(k)}{k-1}$, we obtain the following upper and lower bounds for the term $(k-1) a(k)$:

$$e^{g(k)/2}\, g(k) < (k-1)\, a(k) < e^{g(k)}\, g(k) \qquad (4)$$

Recall the definition of the Lambert $W$ function. For each $y \in \mathbb{R}^+$, $W(y)$ is defined to be the solution of the equation $x e^x = y$. Note that $x e^x$ is increasing with respect to $x$, hence $W(\cdot)$ is increasing.

By the definition of the Lambert $W$ function and Equation (4), we get that

$$W\big( (k-1)\, a(k) \big) < g(k) < 2\, W\left( \frac{(k-1)\, a(k)}{2} \right) \qquad (5)$$
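The two-sided bound (5) can be illustrated numerically. The sketch below assumes $k \ge 2$, computes $z_0$ and $W$ by plain bisection (a simple stand-in, not a library routine), and compares $g(k) = (k-1)/z_0$ with both bounds for a few illustrative pairs $(k, a(k))$.

```python
import math

def lambert_w(y):
    """Solve x * exp(x) = y for x >= 0 by bisection (for moderate y >= 0)."""
    lo, hi = 0.0, y  # x * exp(x) >= x, so the solution is at most y
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * math.exp(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def positive_root(a, k):
    """Positive root z0 of a*z^k - (1+z)^(k-1) = 0, found by bisection."""
    def h(z):
        return a * z ** k - (1 + z) ** (k - 1)
    lo, hi = 0.0, 1.0
    while h(hi) < 0:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

checks = []
for k, a in [(2, 1.0), (3, 0.5), (5, 1.0 / 6), (10, 1.0 / 11)]:
    g = (k - 1) / positive_root(a, k)        # g(k) defined by z0 = (k-1)/g(k)
    lower = lambert_w((k - 1) * a)           # left side of (5)
    upper = 2 * lambert_w((k - 1) * a / 2)   # right side of (5)
    checks.append(lower < g < upper)
```

For $k = 2$ and $a(2) = 1$, the root $z_0$ is the golden ratio, so $g \approx 0.618$, which lies between $W(1) \approx 0.567$ and $2W(1/2) \approx 0.703$.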

First, consider the case where $\lim_{k \to \infty} (k-1) a(k) = \infty$. The asymptotic expansion of $W(x)$ as $x \to +\infty$ is the following: $W(x) = \ln x - \ln\ln x + \frac{\ln\ln x}{\ln x} + O\left( \left( \frac{\ln\ln x}{\ln x} \right)^2 \right)$. So, for large enough $k$, $W((k-1) a(k)) = \Theta(\log((k-1) a(k)))$. Since $z_0 = \frac{k-1}{g(k)}$, from Equation (5), we get $z_0 = \Theta\left( \frac{k}{\log(k a(k))} \right)$. Therefore, by (2) we have $b(k) = \Theta\left( \alpha^k \cdot \left( \frac{k}{\log k a(k)} \right)^{k-1} \right)$ for some constant $\alpha$.

Second, consider the case where $(k-1) a(k)$ is bounded by some constants. So by (5), we have $g(k) = \Theta(1)$. Therefore $z_0 = \Theta(k)$, which again implies $b(k) = \Theta\left( \alpha^k \cdot k^{k-1} \right)$ for some constant $\alpha$.

Third, we consider the case where $\lim_{k \to \infty} (k-1) a(k) = 0$. We focus on the Taylor series $W_0$ of $W$ around $0$. It can be found using Lagrange inversion and is given by

$$W_0(x) = \sum_{i=1}^{\infty} \frac{(-i)^{i-1}}{i!}\, x^i = x - x^2 + O(1)\, x^3.$$

Thus, for $k$ large enough, $g(k) = \Theta((k-1) a(k))$. Hence, $z_0 = \Theta(1/a(k))$. Once again, that implies $b(k) = \Theta\left( \alpha^k \cdot \frac{1}{k\, a(k)^k} \right)$ for some constant $\alpha$. ∎

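Lemma 5. For any sequences of non-negative real numbers $\{a_1, a_2, \ldots, a_n\}$ and $\{b_1, b_2, \ldots, b_n\}$ and for any polynomial $g$ of degree $k$ with non-negative coefficients, it holds that

$$\sum_{i=1}^{n} \left[ g\left( b_i + \sum_{j=1}^{i-1} a_j \right) - g\left( \sum_{j=1}^{i-1} a_j \right) \right] \le \lambda(k) \cdot g\left( \sum_{i=1}^{n} b_i \right) + \mu(k) \cdot g\left( \sum_{i=1}^{n} a_i \right),$$

where $\mu(k) = \frac{k-1}{k}$ and $\lambda(k) = \Theta\left( k^{k-1} \right)$. The same inequality holds for $\mu(k) = \frac{k-1}{k \ln k}$ and $\lambda(k) = \Theta\left( (\alpha \cdot k \ln k)^{k-1} \right)$ for some constant $\alpha$.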

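Proof. Let $g(z) = g_0 z^k + g_1 z^{k-1} + \cdots + g_k$ with $g_t \ge 0$ for all $t$. The lemma holds since it holds for every monomial $z^t$ with $0 \le t \le k$. Specifically,

$$\begin{aligned}
\sum_{i=1}^{n} \left[ g\left( b_i + \sum_{j=1}^{i-1} a_j \right) - g\left( \sum_{j=1}^{i-1} a_j \right) \right]
&= \sum_{t=1}^{k} g_{k-t} \cdot \sum_{i=1}^{n} \left[ \left( b_i + \sum_{j=1}^{i-1} a_j \right)^{t} - \left( \sum_{j=1}^{i-1} a_j \right)^{t} \right] \\
&\le \sum_{t=1}^{k} g_{k-t} \cdot \sum_{i=1}^{n} \left[ t \cdot b_i \cdot \left( b_i + \sum_{j=1}^{i-1} a_j \right)^{t-1} \right] \\
&\le \sum_{t=1}^{k} g_{k-t} \cdot \left[ \lambda(t) \left( \sum_{i=1}^{n} b_i \right)^{t} + \mu(t) \left( \sum_{i=1}^{n} a_i \right)^{t} \right] \\
&\le \lambda(k) \cdot g\left( \sum_{i=1}^{n} b_i \right) + \mu(k) \cdot g\left( \sum_{i=1}^{n} a_i \right).
\end{aligned}$$

The first inequality follows from the convexity inequality $(x+y)^{k+1} - x^{k+1} \le (k+1)\, y\, (x+y)^k$. The second inequality follows from Lemma 4 (Case (1b) with $a(k) = 1/(k+1)$). The last inequality holds since $\mu(t) \le \mu(k)$ and $\lambda(t) \le \lambda(k)$ for $t \le k$. ∎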
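The convexity inequality used in the first step, $(x+y)^{k+1} - x^{k+1} \le (k+1)\, y\, (x+y)^k$ for $x, y \ge 0$, can be checked exactly with integer arithmetic. The sketch below is only a sanity check on a small grid; the function name `convexity_gap` is a hypothetical choice.

```python
def convexity_gap(x, y, k):
    """(k+1)*y*(x+y)^k - ((x+y)^(k+1) - x^(k+1)); non-negative when the inequality holds."""
    return (k + 1) * y * (x + y) ** k - ((x + y) ** (k + 1) - x ** (k + 1))

# Exact check over non-negative integers (no rounding error involved).
gaps = [
    convexity_gap(x, y, k)
    for x in range(0, 8)
    for y in range(0, 8)
    for k in range(0, 6)
]
smallest_gap = min(gaps)
```

The inequality itself follows from writing the left-hand side as the integral of $(k+1) t^k$ over $[x, x+y]$ and bounding the integrand by its value at $t = x+y$.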
