
HAL Id: hal-00164738

https://hal.archives-ouvertes.fr/hal-00164738v2

Preprint submitted on 25 Nov 2011


A necessary and sufficient condition for exact recovery by l1 minimization.

Charles Dossal

To cite this version:

Charles Dossal. A necessary and sufficient condition for exact recovery by l1 minimization.. 2011.

⟨hal-00164738v2⟩


A necessary and sufficient condition for exact sparse recovery by ℓ1 minimization

Charles Dossal a

a IMB, Université Bordeaux 1, 351, cours de la Libération, F-33405 Talence cedex (FRANCE)

Abstract

In this paper, a new sharp sufficient condition for exact sparse recovery by ℓ1-penalized minimization from linear measurements is proposed. The main contribution of this paper is to show that, for most matrices, this condition is also necessary. Moreover, when the ℓ1 minimizer is unique, we investigate its sensitivity to the measurements and we establish that the map sending the measurements to this minimizer is Lipschitz-continuous.

Résumé

A necessary and sufficient condition for sparse identifiability by ℓ1 minimization. In this article, a new sufficient condition for sparse identifiability by penalized ℓ1 minimization from linear measurements is proposed. The main contribution of this work is to prove that, for most matrices, this condition is also necessary. Moreover, when the minimizer of the ℓ1 problem is unique, its sensitivity to the measurements is studied, and it is shown that the map sending the measurements to this minimizer is Lipschitz-continuous.

1. Introduction

Let x0 ∈ R^N be a vector and A a real matrix with n rows and N columns. Let y0 be n linear measurements of x0, i.e. y0 = Ax0 ∈ R^n, where typically n < N. In this paper, we propose a necessary and sufficient condition ensuring that x0 can be recovered from y0 by solving the following optimization problem P1(y0):

$$\min_{x\in\mathbb{R}^N} \|x\|_1 \quad \text{such that} \quad y = Ax. \qquad (P_1(y))$$

There is of course a huge literature on the subject, and covering it fairly is beyond the scope of this paper.

We restrict our overview to those works pertaining to ours. For instance, our new condition can be seen as an extension of [3] and [?]. Indeed, in [3], the following optimization problem (the so-called Lasso problem)

$$\min_{x\in\mathbb{R}^N} \frac{1}{2}\|y - Ax\|_2^2 + \gamma\|x\|_1 \qquad (P_1(y,\gamma))$$

is studied. It is proved that any x0 belonging to the set F is the unique solution of P1(Ax0), where
$$F = \left\{x \text{ such that } \operatorname{rank}(A_I) = |I| \text{ and } \forall j \notin I,\ \left|\langle a_j, A_I(A_I^t A_I)^{-1}\operatorname{sign}(x_I)\rangle\right| < 1\right\},$$
where I is the support of x, A_I is the active matrix associated to x, whose columns are those of A indexed by I, x_I are the non-zero components of x, and a_j is the column of A indexed by j. In the sequel, Span(a_i)_{i∈I} will be denoted V_I. The sufficient condition developed in [3] plays a pivotal role in several papers which investigate support recovery in the presence of noise by solving P1(y, γ), e.g. [1,4,2].
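Membership in F can be tested numerically. The following sketch (the helper name `in_F` and the example matrix are illustrative assumptions, not part of the paper) checks the two defining conditions for a given pair (A, x):

```python
import numpy as np

def in_F(A, x, tol=1e-10):
    """Check whether x lies in the set F for the matrix A, i.e.
    rank(A_I) = |I| and max_{j not in I} |<a_j, A_I (A_I^t A_I)^{-1} sign(x_I)>| < 1,
    where I is the support of x."""
    I = np.flatnonzero(np.abs(x) > tol)          # support of x
    A_I = A[:, I]
    if np.linalg.matrix_rank(A_I) < len(I):      # active matrix must have full column rank
        return False
    # d_{I,S} = A_I (A_I^t A_I)^{-1} sign(x_I)
    d = A_I @ np.linalg.solve(A_I.T @ A_I, np.sign(x[I]))
    J = np.setdiff1d(np.arange(A.shape[1]), I)
    if J.size == 0:
        return True
    return bool(np.max(np.abs(A[:, J].T @ d)) < 1)

# illustrative example: with these columns, the 1-sparse vector (1, 0, 0) is in F,
# while duplicating column a_1 as a_3 makes the strict inequality fail
A = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])
print(in_F(A, np.array([1.0, 0.0, 0.0])))
```

Note that the test is sign-dependent only through sign(x_I), reflecting the fact that F is a union of cones.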

In [?], the Exact Recovery Condition (ERC) is defined. This condition does not depend on the sign but only on the support of x0, and provides results about the recovery of x0 and stability to noise.

The goal of this paper is to show that, for most matrices, the set of vectors that can be recovered by ℓ1 minimization is exactly F̄, the closure of F. This result highlights the fact that
$$\max_{j \notin I} \left|\langle a_j, A_I(A_I^t A_I)^{-1}\operatorname{sign}(x_I)\rangle\right| \quad \text{and} \quad \left\|A_I(A_I^t A_I)^{-1}\operatorname{sign}(x_I)\right\|_2$$
are good indicators of identifiability, and brings arguments justifying the success of the algorithms developed in [?] to find very sparse but non-identifiable vectors.

Email address: charles.dossal@math.u-bordeaux1.fr (Charles Dossal).


Before proceeding, let us fix some terminology and definitions. A vector x0 is said to be identifiable if and only if it is the unique solution of P1(Ax0).

Definition 1.1 Let (x_i)_{i≤N} be N points of R^n. These points (x_i)_{i≤N} are said to be in general position (GP) if every affine subspace of R^n of dimension k < n contains at most k + 1 of the points x_i. A matrix A satisfies condition (GP) if for every sign vector S ∈ {−1, 1}^N, the points (S[i]a_i)_{i≤N} are in general position.

It can be noticed that for any matrix A, the matrix A + E satisfies condition (GP) with probability 1 if E is any random perturbation with an absolutely continuous density with respect to the Lebesgue measure: in this sense, most matrices satisfy condition (GP).

2. Contributions

The contributions of this paper are summarized as follows.

Theorem 2.1 If x0 ∈ F̄, then x0 is identifiable.

Theorem 2.2
(i) If x0 is identifiable, and if for all y in a neighborhood of y0 = Ax0, P1(y) has only one solution, then x0 ∈ F̄.
(ii) If A satisfies (GP), then for all y ∈ Im(A), P1(y) has a unique solution, and x is identifiable if and only if x ∈ F̄.

Theorem 2.3 If for all y ∈ Im(A) the solution of P1(y) is unique, then the map φ associating y to this solution is Lipschitz.

3. Preliminary Lemmas

The two following Lemmas can be found in [3].

Lemma 3.1 A vector x, with support denoted by I, is a solution of P1(y, γ) if and only if
$$A_I^t(y - Ax) = \gamma\,\operatorname{sign}(x_I) \quad \text{and} \quad \forall j \notin I,\ |\langle a_j, y - Ax\rangle| \le \gamma.$$

Lemma 3.2 If for a vector x the matrix A_I satisfies rank(A_I) = |I| and
$$A_I^t(y - Ax) = \gamma\,\operatorname{sign}(x_I) \quad \text{and} \quad \forall j \notin I,\ \left|\langle a_j, A_I(A_I^t A_I)^{-1}\operatorname{sign}(x_I)\rangle\right| < 1,$$
then x is the unique minimizer of P1(y, γ).
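As an illustration of Lemma 3.1 (a toy orthonormal setting, not taken from the paper), when A^t A = Id the minimizer of P1(y, γ) is given coordinate-wise by soft-thresholding, and the optimality conditions can be checked directly:

```python
import numpy as np

def soft_threshold(v, gamma):
    # closed-form Lasso solution when A is orthonormal (A^t A = Id)
    return np.sign(v) * np.maximum(np.abs(v) - gamma, 0.0)

# illustrative example: A = identity, so P1(y, gamma) decouples coordinate-wise
A = np.eye(3)
y = np.array([2.0, -0.3, 0.8])
gamma = 0.5
x = soft_threshold(A.T @ y, gamma)

r = y - A @ x                          # residual y - Ax
I = np.flatnonzero(x)                  # support of the minimizer
off = np.setdiff1d(np.arange(3), I)
# Lemma 3.1, on the support: A_I^t (y - Ax) = gamma * sign(x_I) ...
assert np.allclose(r[I], gamma * np.sign(x[I]))
# ... and off the support: |<a_j, y - Ax>| <= gamma
assert np.all(np.abs(r[off]) <= gamma + 1e-12)
```

The example is a sketch under the orthonormality assumption; for a general A the conditions of Lemma 3.1 still characterize the minimizers, but no closed form is available.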

Lemma 3.3 If x0 is the unique solution of P1(Ax0), then rank(A_I) = |I|.

Proof: If rank(A_I) < |I|, there exists h ∈ Ker(A) supported on I = I(x0). One then has, for all t ∈ R, A(x0 + th) = Ax0. Consequently the map t ↦ ‖x0 + th‖_1 is locally affine in a neighborhood of 0. It follows that there exists t ≠ 0 such that ‖x0 + th‖_1 ≤ ‖x0‖_1, contradicting the uniqueness of the solution.

Lemma 3.4 For all y ∈ Im(A) there exists a solution x of P1(y) such that rank(A_I) = |I|.

Proof: Let x0 be a solution of P1(y). If rank(A_I) < |I|, there exists h ∈ Ker(A) supported on I. For a suitable t, I(x0 + th) ⊊ I(x0) and x0 + th is still a solution of P1(y). Iterating this argument yields a solution whose active matrix has full column rank.

Lemma 3.5 If x0 ∈ F, then x0 is the unique solution of P1(Ax0).

Proof: Lemma 3.2 shows that x0 is the unique solution of P1(A_I(x_{0I} + γ(A_I^t A_I)^{-1} sign(x_{0I})), γ) if γ > 0 is small enough.

Lemma 3.6 Let y ∈ Im(A) and (γ_n)_{n∈N} a sequence of positive real numbers tending to 0. If (x_n)_{n∈N} are solutions of P1(y, γ_n) such that lim_{n→∞} x_n = x0, then x0 is a solution of P1(y).

4. Proofs

4.1. Proof of theorem 2.1

From Lemma 3.5, vectors in F are identifiable. Since A_I(A_I^t A_I)^{-1} sign(x_I) only depends on the sign and the support of x, F is a union of cones of various dimensions. It follows that the closure F̄ of F is exactly the set of vectors x0 that can be extended into a vector of F:



$$\bar F = \left\{x_0 \text{ such that } \exists x_1 \text{ with } I(x_0) \cap I(x_1) = \emptyset \text{ and } x_0 + x_1 = x_2 \in F\right\}.$$

Let us now suppose that x0 ∈ F̄, i.e. that there exists x1 such that I(x0) ∩ I(x1) = ∅ and x2 = x0 + x1 belongs to F.

Choose any x3 ∈ R^N with x3 ≠ x0 such that Ax0 = Ax3, and define x4 := x3 + x1. We have Ax4 = Ax2, and since x2 ∈ F, x2 is the unique solution of P1(Ax2), which implies that ‖x2‖_1 < ‖x4‖_1. Since the supports of x0 and x1 are disjoint, it follows that
$$\|x_2\|_1 = \|x_0\|_1 + \|x_1\|_1 < \|x_4\|_1 \le \|x_3\|_1 + \|x_1\|_1, \qquad (1)$$
which implies ‖x0‖_1 < ‖x3‖_1. That is, x0 is the unique solution of P1(Ax0).

4.2. Proof of theorem 2.2

Proof: Let y ∈ Im(A) and (γ_n)_{n∈N} a sequence of positive real numbers tending to 0. From Lemma 3.4 we can always choose a solution x(γ_n) of P1(y, γ_n) such that the associated active matrix A_{I_n} has full column rank. Since the number of possible supports and signs is finite, up to extraction of a subsequence we can suppose that all the x(γ_n) share the same support I and the same sign S. From Lemma 3.1, we know that x(γ_n) satisfies

$$x(\gamma_n)_I = x_{0I} - \gamma_n (A_I^t A_I)^{-1} S \quad \text{and} \quad \forall j \notin I,\ |\langle a_j, y - Ax(\gamma_n)\rangle| \le \gamma_n, \qquad (2)$$
where x0 is the vector supported on I such that x_{0I} = (A_I^t A_I)^{-1} A_I^t y. Using Lemma 3.6, x0 is a solution of P1(y). From (2) we deduce that I(x0) ⊂ I for γ_n small enough. It follows that y − Ax(γ_n) = γ_n A_I(A_I^t A_I)^{-1} S = γ_n d_{I,S}, and thus ∀j ∉ I, |⟨a_j, d_{I,S}⟩| ≤ 1.
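The closed-form path in (2) can be checked numerically. In the sketch below (the matrix, support, and sign are illustrative assumptions, not from the paper), x(γ) is built from the formula and the optimality conditions of Lemma 3.1 are verified for several values of γ:

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])
x_true = np.array([1.0, 0.0, 0.0])     # sparse vector generating the measurements
y = A @ x_true
I = np.flatnonzero(x_true)             # support, here I = {0}
S = np.sign(x_true[I])
A_I = A[:, I]
G = A_I.T @ A_I                        # Gram matrix A_I^t A_I

x0_I = np.linalg.solve(G, A_I.T @ y)   # least-squares coefficients on the support
for gamma in [0.3, 0.1, 0.01]:
    # formula (2): x(gamma)_I = x0_I - gamma (A_I^t A_I)^{-1} S
    xg = np.zeros(3)
    xg[I] = x0_I - gamma * np.linalg.solve(G, S)
    r = y - A @ xg                     # residual, equal to gamma * d_{I,S}
    assert np.allclose(A_I.T @ r, gamma * S)          # Lemma 3.1, on-support
    assert np.all(np.abs(A.T @ r) <= gamma + 1e-12)   # Lemma 3.1, off-support
# as gamma -> 0, x(gamma) tends to the least-squares fit on I, here x_true itself
```

This makes visible the affine dependence of the Lasso solution on γ as long as the support and sign stay fixed.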

Let us defineJ to be the set of all indices such that|haj, dI,Si|= 1. One can first notice thatI⊂J. We now show that

(i) either (a_j)_{j∈J} are linearly dependent, and one can build vectors x1 arbitrarily close to x0 such that P1(Ax1) has several solutions, in which case condition (GP) cannot be satisfied;

(ii) or (a_j)_{j∈J} are linearly independent, and then x0 ∈ F̄ and x0 is identifiable.

(i) Suppose that (a_j)_{j∈J} are linearly dependent and let ε > 0. Define K ⊊ J such that I ⊂ K and (a_k)_{k∈K} is a basis of V_J = Span(a_j)_{j∈J}, and pick j_0 ∈ J ∩ K^c. Define S_K ∈ R^{|K|} by S_K = A_K^t d_{I,S}. Then S_K is a sign vector; let x1 be the vector supported on K defined by
$$x_{1K} = x_{0K} + \frac{\varepsilon}{\|S_K\|_2} S_K.$$
Now choose ε > 0 small enough to ensure that sign(x_{1I(x_0)}) = sign(x_{0I(x_0)}). From the definition of x1 it follows that ‖x0 − x1‖_2 = ε. For γ small enough, the support and the sign of x1 and of the vector x1(γ), supported on K and defined by x1(γ)_K = x_{1K} − γ(A_K^t A_K)^{-1} S_K, are identical.

We shall prove that x1(γ) is a solution of P1(Ax1, γ) using Lemma 3.1. Indeed,
$$A_K^t(Ax_1 - Ax_1(\gamma)) = \gamma S_K.$$

The vector S = A_I^t d_{I,S} is the common sign of the vectors (x(γ_n))_{n∈N}. Since lim_{n→∞} x(γ_n) = x0, it follows that for all i ∈ I(x0) ⊂ I, sign(x0[i]) = S[i]. Since for all i ∈ I, S[i] = ⟨a_i, d_{I,S}⟩ = S_K[i], it follows that S_K = sign(x_{1K}). Moreover sign(x_{1K}) = sign(x1(γ)_K), which yields
$$A_K^t(Ax_1 - Ax_1(\gamma)) = \gamma\,\operatorname{sign}(x_1(\gamma)_K).$$

Since d_{I,S} ∈ V_I ⊂ V_K, one has d_{I,S} = A_K(A_K^t A_K)^{-1} A_K^t d_{I,S} = A_K(A_K^t A_K)^{-1} S_K. If j ∉ K,
$$|\langle a_j, Ax_1 - Ax_1(\gamma)\rangle| = \gamma\,|\langle a_j, A_K(A_K^t A_K)^{-1} S_K\rangle| = \gamma\,|\langle a_j, d_{I,S}\rangle| \le \gamma,$$

and by Lemma 3.1 it follows that x1(γ) is a solution of P1(Ax1, γ). Now Lemma 3.6 shows that x1 is a solution of P1(Ax1). Choose h ∈ Ker(A) supported on K ∪ {j_0} and such that h[j_0] = S[j_0] = a_{j_0}^t d_{I,S}. Denote by h_K the restriction of h to the indices K. Since 0 = Ah = S[j_0] a_{j_0} + A_K h_K and S_K = A_K^t d_{I,S}, one has
$$\langle \operatorname{sign}(x_{1K}), h_K\rangle = S_K^t h_K = d_{I,S}^t A_K h_K = -S[j_0]\, d_{I,S}^t a_{j_0} = -1.$$
Moreover, for all t ∈ R, A(x1 + th) = Ax1, and for small non-negative t,

$$\|x_1 + th\|_1 = \|x_1\|_1 + t + t\,\langle \operatorname{sign}(x_{1K}), h_K\rangle = \|x_1\|_1,$$
which shows that x1 is not the unique minimizer of P1(Ax1).

Notice that for each l ∈ K ∪ {j_0}, S[l]a_l ∈ V_K, and S[l]a_l belongs to the affine hyperplane H_{d_{I,S}} = {u such that ⟨u, d_{I,S}⟩ = 1}. Hence at least |K| + 1 points S[l]a_l belong to the affine subspace V_K ∩ H_{d_{I,S}}, whose dimension equals |K| − 1. Therefore (GP) is not satisfied.

(ii) Let us now suppose that the (a_j)_{j∈J} are linearly independent. Define S_J = A_J^t d_{I,S}; for all i ∈ I, S[i] = S_J[i]. Let x1 be the vector supported on J such that for all i ∈ I(x0), x1[i] = x0[i], and for all j ∈ J ∩ I(x0)^c, x1[j] = S_J[j]. One has sign(x_{1J}) = S_J, and since d_{I,S} ∈ V_I ⊂ V_J, one has d_{I,S} = A_J(A_J^t A_J)^{-1} A_J^t d_{I,S} = A_J(A_J^t A_J)^{-1} S_J. It follows that for all l ∉ J,
$$|\langle a_l, A_J(A_J^t A_J)^{-1} S_J\rangle| = |\langle a_l, d_{I,S}\rangle| < 1,$$
implying that x1 ∈ F and hence x0 ∈ F̄.

4.3. Proof of theorem 2.3

The proof relies on the following lemma

Lemma 4.1 For all y0 ∈ Im(A), there exists ε_0 > 0 such that for all y ∈ Im(A) ∩ B(y0, ε_0), one has I(φ(y0)) ⊂ I(φ(y)).

Proof: To prove this lemma, we show that for any y0 ∈ Im(A), any sequence (y_n)_{n∈N} tending to y0, and any subsequence (y_{u_n})_{n∈N} such that I(φ(y_{u_n})) = J is constant, the support J satisfies J ⊃ I = I(φ(y0)).

Denote x0 = φ(y0) and I = I(x0). One has x_{0I} = (A_I^t A_I)^{-1} A_I^t y0 = A_I^+ y0. Let (y_n)_{n∈N} be a sequence of elements of Im(A) tending to y0 and x_n = φ(y_n). Up to extraction of a subsequence, one can suppose that I(x_n) is constant; denote by J this common support. Then x_{nJ} = (A_J^t A_J)^{-1} A_J^t y_n = A_J^+ y_n. Let x = lim_{n→∞} x_n, so that x_J = A_J^+ y0. Let z_n be the sequence of vectors supported on I defined by z_{nI} = A_I^+ y_n. By definition of z_n, Az_n = P_{V_I} y_n. Let K be a set of indices such that (a_k)_{k∈K} is a basis of Im(A), and v_n the vector supported on K such that v_{nK} = A_K^+(y_n − P_{V_I}(y_n)). One has

$$A(z_n + v_n) = y_n \quad \text{and} \quad \|z_n + v_n\|_1 \le \|z_n\|_1 + \|v_n\|_1.$$
Noticing that, since y0 ∈ V_I,
$$\|v_n\|_1 = \|A_K^+(y_n - P_{V_I}(y_n))\|_1 \xrightarrow[n\to\infty]{} 0,$$
one deduces that lim_{n→∞} ‖z_n + v_n‖_1 = ‖x0‖_1. Moreover lim_{n→∞} ‖x_n‖_1 = ‖x‖_1, and since ‖x_n‖_1 ≤ ‖z_n + v_n‖_1, we deduce that ‖x‖_1 ≤ ‖x0‖_1, and finally x0 = x since Ax = y0 and x0 = φ(y0) is the unique solution of P1(y0). Thus I = I(x0) = I(x) ⊂ J, which concludes the proof of the lemma.

Let y0 ∈ Im(A). There exists ε_0 > 0 such that for all y ∈ Im(A) ∩ B(y0, ε_0), there exists J satisfying I ⊂ J such that x = φ(y) is supported in J and x_J = A_J^+ y. Since I ⊂ J, one also has x_{0J} = A_J^+ y0, and thus ‖x − x0‖_2 = ‖A_J^+(y − y0)‖_2. One deduces that
$$\forall y \in \operatorname{Im}(A) \cap B(y_0, \varepsilon_0),\quad \|\varphi(y_0) - \varphi(y)\|_2 \le C\,\|y_0 - y\|_2, \quad \text{with } C = \max_{J,\ \operatorname{rank}(A_J) = |J|} \|A_J^+\|_2,$$
which concludes the proof of the theorem.
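The Lipschitz bound of Theorem 2.3 can be observed numerically by solving P1(y) as a linear program via the standard split x = u − v with u, v ≥ 0. The sketch below uses SciPy's LP solver; the example matrix and the helper name `basis_pursuit` are illustrative assumptions, not from the paper.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||x||_1 s.t. Ax = y as an LP: split x = u - v, u, v >= 0,
    and minimize sum(u) + sum(v) subject to [A, -A][u; v] = y."""
    n, N = A.shape
    c = np.ones(2 * N)
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=(0, None))
    z = res.x
    return z[:N] - z[N:]

A = np.array([[1.0, 0.0, 0.6],
              [0.0, 1.0, 0.8]])
y0 = A @ np.array([1.0, 0.0, 0.0])     # measurements of an identifiable vector
y1 = y0 + np.array([0.05, -0.02])      # perturbed measurements
x0, x1 = basis_pursuit(A, y0), basis_pursuit(A, y1)

# Lipschitz constant C = max over full-column-rank supports J of ||A_J^+||_2
supports = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
C = max(np.linalg.norm(np.linalg.pinv(A[:, list(J)]), 2) for J in supports)
assert np.linalg.norm(x1 - x0) <= C * np.linalg.norm(y1 - y0) + 1e-8
```

Note that C is computed by brute force over all supports here, which is only feasible for tiny N; the theorem guarantees the bound, not an efficient way to evaluate C.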

References

[1] E.J. Candès and Y. Plan. Near-ideal model selection via ℓ1 minimization. Ann. Statist., 2007.

[2] Ch. Dossal, M.-L. Chabanol, G. Peyré, and J. Fadili. Sharp support recovery from noisy random measurements by ℓ1 minimization. Applied and Computational Harmonic Analysis, 2011.

[3] J.-J. Fuchs. On sparse representations in arbitrary redundant bases. IEEE Trans. Inf. Theory, 2002.

[4] M.J. Wainwright. Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso). IEEE Trans. Inf. Theory, 55(5):2183–2202, 2009.

