
Mass localization


HAL Id: hal-01163389

https://hal.archives-ouvertes.fr/hal-01163389

Preprint submitted on 12 Jun 2015



Thibaut Le Gouic

To cite this version:

Thibaut Le Gouic. Mass localization. 2015. ⟨hal-01163389⟩


Thibaut Le Gouic June 12, 2015

Abstract

For a given class $\mathcal{F}$ of closed sets of a measured metric space $(E, d, \mu)$, we want to find the smallest element $B$ of the class $\mathcal{F}$ such that $\mu(B) \geq 1 - \alpha$, for a given $0 < \alpha < 1$. This set $B$ localizes the mass of $\mu$. Replacing the measure $\mu$ by the empirical measure $\mu_n$ gives an empirical smallest set $B_n$. The article introduces a formal definition of small sets (and their size) and studies the convergence of the sets $B_n$ to $B$ and of their sizes.

Contents

1 Introduction
1.1 Definitions
1.1.1 Stable set
1.1.2 Size function
1.2 Overview of the main result
2 First properties
2.1 Existence
2.2 Regularity of τ
3 Consistency
3.1 τ-tightness
3.2 τ-consistency
3.3 Minimizer consistency
3.4 Minimizer continuity
4 Examples
4.1 Examples of stable classes
4.1.1 Closed sets
4.1.2 Parametrized classes
4.1.3 ε-separated unions
4.2 Examples of size function
4.2.1 Packing
4.3 Examples of sequence of measures
5 Proofs
6 Conclusion

1 Introduction

The framework of our study is a measured metric space $(E, d, \mu)$. In this setting, mass localization aims to find a small Borel set $B$ such that $\mu(B) \geq 1 - \alpha$ for some given $0 < \alpha < 1$. The measure $\mu$ conditioned on $B$ is a new measure that we call $\alpha$-localized and denote $\mu^\alpha$. This article provides a definition of a smallest Borel set of probability $1 - \alpha$ in order to obtain a localized version of the measure with the smallest support possible.

This smallest Borel set intuitively represents the "essential part" of the measure. However, it seems difficult to give a universal definition of "smallest": although a ball centered at the origin seems a good choice of smallest set for the standard Gaussian measure on $\mathbb{R}^d$, it is not obvious how to define such a set if the measure is not unimodal, not symmetric, or not even defined on a Euclidean space.

Consistency is an important property we want for our notion. In statistics, the measure $\mu$, often unknown, is usually approximated by a sequence of probability measures $(\mu_n)_{n \geq 1}$. The smallest closed set with $\mu_n$-probability $1 - \alpha$ should become closer to the smallest one of $\mu$-probability $1 - \alpha$ as $n$ grows.

Several methods have been studied in order to define such sets.

A first method is to choose a class $\mathcal{F}$ of subsets of $E$ partially ordered by their volume and to pick the smallest set (for this order) of this class with a $\mu$-probability greater than $1 - \alpha$. This set corresponds to a level set of the density function $f$ whenever $\mu$ is absolutely continuous with respect to the Lebesgue measure and the class $\mathcal{F}$ contains the level sets. Another way to define this set is to maximize

$$\mu(B) - \beta \lambda(B), \tag{1}$$

over $B \in \mathcal{F}$, where $\lambda$ is the Lebesgue measure and $\mu(\{f \geq \beta\}) = 1 - \alpha$. This notion is known as excess mass. Denote by $B^\beta$ the maximizer of (1) and by $B_n^\beta$ the maximizer of

$$\mu_n(B) - \beta \lambda(B),$$

for $\mu_n$ the empirical measure. It is then of interest to determine whether $B_n^\beta$ converges to $B^\beta$ and, if so, to exhibit a rate of convergence.
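The empirical excess mass criterion above can be illustrated in dimension one. The following is a minimal sketch, assuming that $\mathcal{F}$ is the class of closed intervals of $\mathbb{R}$ and that $\beta \geq 0$; the function name `empirical_excess_mass_interval` is hypothetical and none of this comes from the cited articles. Under these assumptions, shrinking an interval to the extreme sample points it contains keeps its empirical mass and does not increase its length, so it is enough to search over intervals whose endpoints are sample points.

```python
import numpy as np


def empirical_excess_mass_interval(sample, beta):
    """Maximize mu_n([a, b]) - beta * (b - a) over closed intervals of R.

    Illustrative assumption: the class F is the set of closed intervals and
    beta >= 0, so only intervals with endpoints at sample points are searched.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    best_value, best_interval = -np.inf, None
    for i in range(n):
        # The interval [x[i], x[j]] contains at least j - i + 1 sample points.
        j = np.arange(i, n)
        values = (j - i + 1) / n - beta * (x[j] - x[i])
        k = int(np.argmax(values))
        if values[k] > best_value:
            best_value = float(values[k])
            best_interval = (float(x[i]), float(x[i + k]))
    return best_interval, best_value
```

The search is quadratic in the sample size, which is acceptable for a small illustration but not tuned for large samples.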

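The article [Har87] considers the case where $\mathcal{F}$ is the set of all convex sets of $\mathbb{R}^2$ and proves that the Hausdorff distance $d_H(B_n^\beta, B^\beta)$ between $B_n^\beta$ and $B^\beta$ converges to 0 and satisfies

$$d_H(B_n^\beta, B^\beta) = O\left(\left(\frac{\log n}{n}\right)^{2/7}\right).$$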

The article [Nol91] takes $\mathcal{F}$ to be the set of all ellipsoids. Consistency of $B_n^\beta$ is proven, as well as the following limit theorem. Let $c_n$ and $c$ be the centers of the ellipsoids $B_n^\beta$ and $B^\beta$ respectively, and let $\sigma_n$ and $\sigma$ be the vectors containing the entries of the matrices defining the ellipsoids $B_n^\beta$ and $B^\beta$ respectively. Then, if the level sets of the measure $\mu$ are ellipsoids,

$$n^{1/3}(c_n - c, \sigma_n - \sigma)$$

converges weakly to the maximum of a Gaussian process. [Pol97] studies a more general case, with a different notion of convergence, and shows the consistency of $B_n^\beta$ for the pseudo-distance

$$d_\mu(F, G) = \mu(F \,\Delta\, G),$$

where $\Delta$ denotes the symmetric difference, whenever the class $\mathcal{F}$ is a Glivenko-Cantelli class. Under several hypotheses, including that the level sets of the measure $\mu$ belong to $\mathcal{F}$ and regularity conditions on $\mu$, the article obtains the following rate of convergence

$$d_\mu(B^\beta, B_n^\beta) = O(n^{-\delta}),$$

for a constant $\delta$ depending on the regularity of $\mu$. This excess mass approach leads to rather precise results in many cases. However, it comes with a few drawbacks, such as the condition that $\mathcal{F}$ must contain the level sets of the unknown measure $\mu$, which requires some prior knowledge of $\mu$.

Requirements on the regularity of $\mu$ can also be unsatisfactory for some applications. Also, this approach is restricted to finite-dimensional spaces (and often to $\mathbb{R}^d$).

A second method comes from the notion of trimming on $\mathbb{R}$, extended to $\mathbb{R}^d$. On $\mathbb{R}$, the smallest set of $\mu$-probability $1 - \alpha$ is defined as

$$[F^{-1}(\alpha/2);\ F^{-1}(1 - \alpha/2)],$$

where $F$ is the cumulative distribution function of $\mu$. Replacing $F$ by the empirical cumulative distribution function $F_n$ defines the empirical smallest set. The extension to $\mathbb{R}^d$ can be done in the following way: $C_\alpha$ denotes the intersection of all the closed half spaces of $\mu$-probability greater than $1 - \alpha$. $C_\alpha$ is then a non-empty convex set for $\alpha < 1/2$, if the measure $\mu$ is regular enough. [Nol92] deals with the rate of convergence of $C_n$, defined similarly with the empirical measure $\mu_n$, and shows its consistency. In order to quantify the rate of convergence of $C_n$ to $C_\alpha$, the article introduces the following random functions

$$r_n(u) = \inf\{r \geq 0;\ ru \notin C_n\},$$

and

$$r_\alpha(u) = \inf\{r \geq 0;\ ru \notin C_\alpha\},$$

and establishes the weak convergence to a Gaussian process defined on the unit sphere $S^{d-1}$ of the process $\sqrt{n}(r_n - r_\alpha)$, under regularity conditions on the density function of $\mu$.
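On $\mathbb{R}$, the empirical version of the trimming interval above only needs the empirical quantile function. The sketch below assumes the usual generalized inverse $F_n^{-1}(p) = \inf\{t : F_n(t) \geq p\}$, which equals the order statistic of rank $\lceil np \rceil$; the helper names are hypothetical and the half-space construction on $\mathbb{R}^d$ is not covered.

```python
import math


def empirical_quantile(sorted_sample, p):
    """Generalized inverse F_n^{-1}(p) = inf{t : F_n(t) >= p} for 0 < p <= 1."""
    n = len(sorted_sample)
    rank = max(1, math.ceil(n * p))  # order statistic of rank ceil(n p)
    return sorted_sample[rank - 1]


def trimmed_interval(sample, alpha):
    """Empirical interval [F_n^{-1}(alpha / 2); F_n^{-1}(1 - alpha / 2)]."""
    x = sorted(float(v) for v in sample)
    return empirical_quantile(x, alpha / 2), empirical_quantile(x, 1 - alpha / 2)
```

Because `n * p` is computed in floating point, ranks can be off by one when $np$ is meant to be an exact integer; a production version would guard against that.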

The article [CAGM97] presents another method, called $\alpha$-trimmed $k$-means, which introduces very few arbitrary parameters. This method chooses the support of the $\alpha$-localized measure $\nu$ as the one minimizing the distortion to its best $k$-quantizer. Formally, for a given function $\Phi$ and a given integer $k$, the method consists in choosing

$$B^\alpha \in \arg\min \left\{ \inf_{\{m_1, \ldots, m_k\} \subset \mathbb{R}^d} \int_B \Phi\left( \inf_{1 \leq i \leq k} \|X - m_i\| \right) d\mu \ ;\ \mu(B) \geq 1 - \alpha \right\}.$$

After proving the existence of such a minimizer, the article [CAGM97] shows the consistency of $B^\alpha$: if $(\mu_n)_{n \geq 1}$ weakly converges to an absolutely continuous measure $\mu$ then, for any choice of

$$B_n^\alpha \in \arg\min \left\{ \inf_{\{m_1, \ldots, m_k\} \subset \mathbb{R}^d} \int_B \Phi\left( \inf_{1 \leq i \leq k} \|X - m_i\| \right) d\mu_n \ ;\ \mu_n(B) \geq 1 - \alpha \right\},$$

the sequence $(B_n^\alpha)_{n \geq 1}$ converges to $B^\alpha$ (when unique) for the Hausdorff metric. These results hold on $\mathbb{R}^d$.
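To make the trimmed criterion concrete for an empirical measure, here is a rough sketch of a Lloyd-type alternating heuristic. It is not the procedure of [CAGM97]: it assumes $\Phi(t) = t^2$, it only finds a local optimum, and the function name `trimmed_kmeans` and its parameters are hypothetical. Each pass keeps the $\lceil (1 - \alpha) n \rceil$ points closest to their nearest center and recomputes the centers from the kept points only.

```python
import math

import numpy as np


def trimmed_kmeans(points, k, alpha, n_iter=50, seed=0):
    """Heuristic alpha-trimmed k-means with Phi(t) = t**2 (local optimum only)."""
    x = np.asarray(points, dtype=float)  # expected shape (n, d)
    n = len(x)
    n_keep = math.ceil((1 - alpha) * n)  # points that must stay untrimmed
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(n, size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distances between every point and every center, shape (n, k).
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Trimming step: keep the n_keep points closest to their center.
        kept = np.argsort(d2.min(axis=1))[:n_keep]
        new_centers = centers.copy()
        for j in range(k):
            members = kept[labels[kept] == j]
            if len(members) > 0:
                new_centers[j] = x[members].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, np.sort(kept)
```

The returned indices describe the untrimmed points, that is, a finite stand-in for the support of the localized measure. The result generally depends on the random initialization when $k > 1$.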

The main goal of our article is to provide a new definition that is intuitive, avoids the usual hypotheses, and remains consistent.

1.1 Definitions

We define a notion of smallest closed set and introduce some properties that will help to understand its meaning. The framework of the definition aims to be fairly general. $(E, d)$ is a Polish space (a separable and complete metric space) and $\mu$ is a Borel measure on $(E, d)$. A smallest set will be defined as the minimizer of a function $\tau$ defined on a class $\mathcal{F}$ of closed subsets of $E$.

1.1.1 Stable set

In order to ensure the existence of the smallest set in a class $\mathcal{F}$ of sets, the class needs to be stable in some way. The following definition of such stability will be an assumption made on the class.

Let us first set the following notation.

For a given set $B$ and $\varepsilon > 0$, the set $B^\varepsilon$ is the $\varepsilon$-neighborhood of $B$:

$$B^\varepsilon := \{x \in E;\ \exists y \in B,\ d(x, y) < \varepsilon\}.$$

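Definition 1 (Stable set). Let $(B_n)_{n \geq 1}$ be a sequence of closed sets, and denote by $\lim_n B_n$ the set

$$\lim_n B_n := \bigcap_{\varepsilon > 0} \bigcup_{k \geq 1} \bigcap_{n \geq k} B_n^\varepsilon.$$

Let $\mathcal{F}$ be a class of closed sets of $E$. $\mathcal{F}$ is stable if $E \in \mathcal{F}$ and

$$(B_n)_{n \geq 1} \subset \mathcal{F} \implies \exists (n_k)_{k \geq 1},\ n_k \to \infty,\ \lim_k B_{n_k} \in \mathcal{F}.$$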

This notion of stability is close to the completeness under Hausdorff convergence. Indeed, it is strictly equivalent if the metric space (E, d) is compact, as it will be discussed in the next remarks.

As we defined $\mathcal{F}$ as a subset of the closed sets of $(E, d)$, we first check that our notion of stability makes sense for a class of closed sets.

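Remark 2. Given a sequence of sets $(B_n)_{n \geq 1}$, $\lim_n B_n$ is always closed. Indeed, denoting by $B(x, \varepsilon/2)$ the ball centered at $x$ of radius $\varepsilon/2$,

$$x \notin \lim_n B_n \iff \exists \varepsilon > 0,\ \forall k \geq 1,\ \exists n \geq k,\ x \notin B_n^\varepsilon \tag{2}$$

$$\implies \exists \varepsilon > 0,\ \forall k \geq 1,\ \exists n \geq k,\ B(x, \varepsilon/2) \cap B_n^{\varepsilon/2} = \emptyset \tag{3}$$

$$\implies \exists \varepsilon > 0,\ B(x, \varepsilon/2) \cap \lim_n B_n = \emptyset. \tag{4}$$

In other words, $(\lim_n B_n)^c$ is open, and $\lim_n B_n$ is thus closed.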

The following remark aims to clarify stability.

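Remark 3. When $(B_n)_{n \geq 1}$ converges to $B$ for the Hausdorff metric, then $\lim_n B_n = B$.

Indeed, denote by $\varepsilon_k$ the smallest $\varepsilon > 0$ such that $B_n \subset B^\varepsilon$ and $B \subset B_n^\varepsilon$ for all $n \geq k$; then,

$$B = \bigcap_{\varepsilon > 0} \bigcup_{k \geq 1} \bigcap_{n \geq k} B \subset \bigcap_{\varepsilon > 0} \bigcup_{k \geq 1} \bigcap_{n \geq k} B_n^{\varepsilon_k} = \lim_n B_n \subset \bigcap_{\varepsilon > 0} \bigcup_{k \geq 1} \bigcap_{n \geq k} B^{\varepsilon + \varepsilon_k} = B.$$

In a more general setting, given a sequence of closed balls $(K_k)_{k \geq 1}$ such that $\bigcup_{k \geq 1} K_k = E$, and given a sequence $(B_n)_{n \geq 1}$, if there exists $B$ such that for any $k \geq 1$, the sequence $(B_n \cap K_k)_{n \geq 1}$ converges in Hausdorff metric to $B \cap K_k$, then $\lim_n B_n = B$.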

Remark 4. In a metric space $(E, d)$ such that every bounded closed set is compact (this is the case, for instance, of locally compact length spaces), it is easier to understand the meaning of stability of a class. For any sequence $(B_n)_{n \geq 1}$ of closed sets of $E$, and any sequence $(K_k)_{k \geq 1}$ of increasing closed balls such that $\bigcup_{k \geq 1} K_k = E$, there exist a set $B$ and a subsequence (relabeled $(B_n)_{n \geq 1}$) such that $B_n \cap K_k$ converges in Hausdorff metric to $B$ and

$$\lim_n B_n = B.$$

In this case, a stable class in the sense of definition 1 is just a compact class for the Hausdorff convergence on large balls. Indeed, in such spaces $E$, there exists an increasing sequence of compacts $(K_k)_{k \geq 1}$ such that $\bigcup_{k \geq 1} K_k = E$; take for instance a sequence of balls centered at the same point, with an increasing radius. The Hausdorff convergence on large balls is then equivalent to the Hausdorff convergence of $(B_n \cap K_k)_{n \geq 1}$ for any $k \in \mathbb{N}$. Since the closed sets in $K_k$ form a compact class for the Hausdorff convergence, there exists a subsequence of $(B_n \cap K_k)_{n \geq 1}$ converging to some $B^k$. Using a diagonal argument, we may extract a subsequence of the original sequence $(B_n)_{n \geq 1}$ such that for any $k \in \mathbb{N}$, $(B_n \cap K_k)_{n \geq 1}$ converges in Hausdorff metric to $B^k$. It is easily checked that $B := \bigcup_k B^k$ is a limit of a subsequence of $(B_n)_{n \geq 1}$, in the sense of definition 1.

Let us introduce some examples of stable sets.

Example 5. Take $\mathcal{F}$ as the set of all closed sets. Stability is then obvious since the limit considered in the definition of a stable set is always closed, as shown in remark 2.

Example 6. The set of all balls is generally not stable, but it does not take much to make it stable. The set of all closed balls and half spaces in $\mathbb{R}^d$ is a stable class. This assertion can be proved using a parametrization of the centers of the balls in spherical coordinates and the compactness of spheres.

Example 7. Other shapes of sets of $\mathbb{R}^d$ make stable classes. Ellipsoids, rectangles, or convex bodies with bounded diameter (by some fixed $R < \infty$) all form stable classes. It is also possible to get rid of the bounded diameter condition by adding some sets to the class.

Example 8. If $\mathcal{F}$ is a stable class of convex sets of a metric space $(E, d)$ such that closed balls are compact, then

$$\mathcal{F}_\varepsilon := \left\{ \bigcup_{F \in \mathcal{G}} F;\ \mathcal{G} \subset \mathcal{F},\ \forall F \neq G \in \mathcal{G},\ \inf_{x \in F, y \in G} d(x, y) \geq \varepsilon \right\}$$

is also a stable class (see lemma 38).


1.1.2 Size function

As we aim to define a smallest set among the sets of $\mathcal{F}$, we need to define a notion of size. This is done using a function $\tau$, meant to measure the size of a set. In order to localize the mass, we will thus minimize the size of a set among all sets of $\mathcal{F}$ with a given probability.

In order to express our assumptions on $\tau$, we first define the Hausdorff contrast.

Definition 9 (Hausdorff contrast). Let $A$ and $B$ be two closed subsets of a Polish space $(E, d)$. The Hausdorff contrast between $A$ and $B$ is defined by

$$\mathrm{Haus}(A|B) := \inf\{\varepsilon > 0 \mid A \subset B^\varepsilon\}.$$

We can then remark that the Hausdorff metric $d_H(A, B)$ between two closed sets $A$ and $B$ is

$$d_H(A, B) = \mathrm{Haus}(A|B) \vee \mathrm{Haus}(B|A).$$
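For finite point sets in $\mathbb{R}^d$ with the Euclidean distance, the Hausdorff contrast reduces to $\max_{x \in A} \min_{y \in B} \|x - y\|$, since $A \subset B^\varepsilon$ holds exactly when every point of $A$ lies at distance less than $\varepsilon$ from $B$. The sketch below illustrates the asymmetry of the contrast and the identity $d_H = \mathrm{Haus}(A|B) \vee \mathrm{Haus}(B|A)$; the restriction to finite Euclidean sets and the function names are assumptions made for the example.

```python
import numpy as np


def hausdorff_contrast(a, b):
    """Haus(A|B) for finite sets A, B given as arrays of shape (n, d), (m, d)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    # Distance from each point of A to the set B, then the worst case over A.
    return float(dist.min(axis=1).max())


def hausdorff_distance(a, b):
    """d_H(A, B) as the maximum of the two one-sided contrasts."""
    return max(hausdorff_contrast(a, b), hausdorff_contrast(b, a))
```

With $A = \{(0,0), (1,0)\}$ and $B = \{(0,0)\}$, the contrast of $A$ with respect to $B$ is 1 while the contrast of $B$ with respect to $A$ is 0, which is why the contrast alone is not a metric.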

We now define formally a size function.

Definition 10 (Size function). Let $(E, d)$ be a metric space. A function $\tau : \mathcal{F} \to \mathbb{R}_+$ is called a size function if it satisfies the following three conditions:

(H1) $\tau$ is increasing, i.e. $A \subset B \implies \tau(A) \leq \tau(B)$,

(H2) for any decreasing sequence $(A_n)_{n \geq 1} \subset \mathcal{F}$ such that $\tau(A_1) < \infty$ and $\mathrm{Haus}(A_n \mid \bigcap_k A_k) \to 0$, the following holds: $\tau(A_n) \to \tau(\bigcap_n A_n)$,

(H3) for any sequence $(A_n)_{n \geq 1} \subset \mathcal{F}$, $\tau(\lim_n A_n) \leq \liminf_n \tau(A_n)$.

Hypothesis (H2) on the size function involves the Hausdorff contrast. This particular choice makes the hypothesis weaker and allows it to hold for size functions that give finite size to non-compact sets. The consequences of these hypotheses are detailed in the rest of the paper.

1.2 Overview of the main result

Our main result states that under conditions (H1), (H2) and (H3), for the empirical measure $\mu_n$ and a stable class $\mathcal{F}$,

$$\tau^\alpha \leq \liminf_n \tau_n^\alpha \leq \limsup_n \tau_n^\alpha \leq \lim_{\varepsilon \to 0^+} \tau^{\alpha - \varepsilon},$$

where $\tau_n^\alpha = \min\{\tau(B);\ B \in \mathcal{F},\ \mu_n(B) \geq 1 - \alpha\}$.

It implies the convergence of $\tau_n^\alpha$ when $\cdot \mapsto \tau^{\cdot}$ is continuous.

The result actually holds for a wider class of sequences of measures $(\mu_n)_{n \geq 1}$.

Moreover, simple conditions on the sequence imply the convergence of the minimizers of the $\tau_n^\alpha$ for different metrics (depending on the conditions assumed). This is discussed in the next sections.
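As a toy illustration of the empirical quantity $\tau_n^\alpha$, the sketch below takes for $\mathcal{F}$ the closed intervals of $\mathbb{R}$ and for $\tau$ the length of the interval. Both choices are assumptions made for the example only, and whether this class satisfies the stability assumption as stated is not checked here. With these choices, $\mu_n(B) \geq 1 - \alpha$ means that $B$ contains at least $\lceil (1 - \alpha) n \rceil$ sample points, so $\tau_n^\alpha$ is the length of the shortest window covering that many consecutive order statistics.

```python
import math

import numpy as np


def shortest_interval(sample, alpha):
    """Shortest closed interval with empirical mass at least 1 - alpha.

    Illustrative choices: F = closed intervals of R, tau = interval length.
    Returns the interval and its length (the empirical size tau_n^alpha).
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    m = max(1, math.ceil((1 - alpha) * n))  # points the interval must contain
    # widths[i] is the length of [x[i], x[i + m - 1]].
    widths = x[m - 1:] - x[: n - m + 1]
    i = int(np.argmin(widths))
    return (float(x[i]), float(x[i + m - 1])), float(widths[i])
```

When several windows share the minimal length, the function returns the leftmost one, which reflects the fact that the minimizer need not be unique.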


2 First properties

2.1 Existence

Let us recall the setting. $(E, d)$ is a Polish space and $\mu$ is a Borel probability measure on $(E, d)$.

Given a size function $\tau$, a stable class $\mathcal{F}$ of closed sets of $E$, and a level $\alpha$, we define, when possible, the support $B^\alpha$ of the $\alpha$-localized measure $\mu^\alpha$ of $\mu$ by

$$B^\alpha \in \arg\min\{\tau(A);\ A \in \mathcal{F},\ \mu(A) \geq 1 - \alpha\},$$

and set

$$\mu^\alpha = \mu(\,\cdot \mid B^\alpha).$$

Our first concern is whether $B^\alpha$ exists. This is the subject of the next result.

Theorem 11 (Existence of a minimum). Let $(E, d)$ be a Polish space, $\mathcal{F}$ a stable class and $\mu$ a probability measure on $(E, \mathcal{B}(E))$. Set $0 < \alpha < 1$. Suppose (H3). Then, there exists $B \in \mathcal{F}$ such that

$$B \in \arg\min\{\tau(A);\ A \in \mathcal{F},\ \mu(A) \geq 1 - \alpha\}.$$

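Remark 12. Hypothesis (H3) cannot simply be omitted. Indeed, if $\tau(B)$ is defined as the Lebesgue measure of the closure of $B$ on $\mathbb{R}^d$, take

$$\mu = \alpha \gamma_d + (1 - \alpha) q,$$

where $q$ is a probability measure supported on $\mathbb{Q}^d$ and $\gamma_d$ is the standard Gaussian measure on $\mathbb{R}^d$. Then, the sequence $(B_n)_{n \geq 1}$ defined by

$$B_n := \{x_k\}_{1 \leq k \leq n} \cup B(0, r_n),$$

with $\{x_n\}_{n \geq 1} = \mathbb{Q}^d$ and $r_n \to 0$ so that $\mu(B_n) = 1 - \alpha$, is a minimizing sequence. Moreover, $\tau(B_n) = \tau(B(0, r_n))$, so that $\tau^\alpha = 0$ but $\tau(\mathbb{Q}^d) = +\infty$.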

The minimizer is not necessarily unique, as the following example shows. Take $\mu$ as the uniform law on the unit square and an isometry-invariant $\tau$. Then any small enough translation of the minimizer will have the same size and the same measure, and will thus be another minimizer. Another result (corollary 25) is reassuring in this respect: it proves that the minimizers form a compact set for the Hausdorff metric.

The stability condition on $\mathcal{F}$ is needed for the existence of the minimum. However, it can be slightly weakened.

Remark 13 (On stability of $\mathcal{F}$). Since the minimal size $\min\{\tau(A);\ A \in \mathcal{F},\ \mu_n(A) \geq 1 - \alpha\}$ is bounded if $\tau^\alpha = \min\{\tau(A);\ A \in \mathcal{F},\ \mu(A) \geq 1 - \alpha\} < \infty$, we may suppose, instead of stability of $\mathcal{F}$, that all the classes

$$\mathcal{F}_M := \mathcal{F} \cap \{A;\ \tau(A) \leq M\}$$

for $M < \infty$ are stable. It is a weaker notion since $\tau(\lim B_n) \leq \liminf \tau(B_n)$ for any sequence $(B_n)_{n \geq 1}$ in $\mathcal{F}$, under (H3).


2.2 Regularity of τ

Denote

$$\tau^\alpha = \inf\{\tau(A);\ A \in \mathcal{F},\ \mu(A) \geq 1 - \alpha\}.$$

It seems natural to expect $\alpha \mapsto \tau^\alpha$ to be continuous when $\mu$ is regular enough. It also seems natural, for instance, to have $B^\alpha$ growing continuously when $\alpha$ decreases to zero, for a unimodal measure $\mu$. This is the concern of this paragraph, whose first result establishes right continuity.

Proposition 14 (Right continuity). Let $(E, d)$ be a Polish space, $\mu$ a probability measure on $(E, \mathcal{B}(E))$ and $\mathcal{F}$ a stable class. Let $0 < \alpha < 1$. Then, under (H3), $\alpha \mapsto \tau^\alpha$ is right continuous.

Continuity requires some more hypotheses, as the following example of discontinuity shows. Take $\mu = (\delta_x + \delta_y)/2$ and $\alpha = 1/2$; it is then not difficult to find some $\tau$ for which $\alpha \mapsto \tau^\alpha$ is not continuous at $\alpha$.

Thus, it is clear that continuity of this function requires some regularity of the measure we want to localize, with respect to the class $\mathcal{F}$. This is why we introduce the notion of $\mathcal{F}$-regularity.

Definition 15 ($\mathcal{F}$-regularity). A probability measure $\mu$ is said to be $\mathcal{F}$-regular if for all $B \in \mathcal{F}$, any $\delta > 0$ and any $C \in \mathcal{F}$ such that $B \subset C$ and $\mu(B) < \mu(B^\delta \cap C)$, there exists $A \in \mathcal{F}$ such that

$$A \subset B^\delta \cap C, \qquad \mu(B) < \mu(A).$$

The only purpose of this notion is the continuity of the mapping $\alpha \mapsto \tau^\alpha$. It is restrictive on $\mu$ only when $\mathcal{F}$ is not rich enough. Taking $\mathcal{F}$ as the class of all closed sets of $E$ makes any probability measure $\mathcal{F}$-regular. Indeed, since $\mu(B^\delta \cap C) = \lim_n \mu(B^{\delta - 1/n} \cap C)$, there exists $n \geq 1$ such that $\mu(B) < \mu(B^{\delta - 1/n} \cap C)$ and then we can choose $A := B^\delta \cap C$. On the other hand, if $\mathcal{F}$ is not rich enough, so that $\tau(\mathcal{F})$ is not even connected, it is easy to build a measure $\mu$ that is not $\mathcal{F}$-regular.

Proposition 16 (Continuity). Let $\mu$ be a probability measure on a Polish space $(E, d)$. Suppose (H1), (H2) and (H3), and that $\mu$ is $\mathcal{F}$-regular, has a connected support and that $\tau^\alpha$ is finite for any $\alpha > 0$. Then, the mapping $\alpha \mapsto \tau^\alpha$ is continuous.

Remark 17. The condition $\tau^\alpha < \infty$ just avoids a degenerate case.

This continuity condition is a first step toward the main matters of our article, the consistency.

3 Consistency

3.1 τ-tightness

In order to show the consistency of the mass localization when a sequence of measures $(\mu_n)_{n \geq 1}$ converges to a measure $\mu$, we must make some assumptions on the sequence of measures. The first and most important hypothesis for consistency is $\tau$-tightness.

Definition 18 ($\tau$-tightness). A sequence of random probability measures $(\mu_n)_{n \geq 1}$ almost surely weakly converging to a measure $\mu$ is $\tau$-tight if for any $\delta > 0$ and any $B \in \mathcal{F}$ such that $\tau(B) < \infty$, almost surely, for any $C \in \mathcal{F}$ such that $B \subset C$ and $\mu(B) \leq \liminf_n \mu_n(C)$, there exists $A \in \mathcal{F}$ such that

$$\mu(B) \leq \liminf_n \mu_n(A), \qquad B \subset A \subset B^\delta \cap C, \qquad \tau(A) < \infty.$$

An important remark on this definition is that a $\tau$-tight sequence of random measures does not necessarily have almost surely $\tau$-tight realizations. This can happen with empirical measures, for instance. This subtlety lies in the position of "almost surely" in the definition, that is, after the choice of $B$ and $\delta$ has been made.

We can also remark the following. The inequality $\mu(B) \leq \liminf_n \mu_n(C)$ is not a consequence of $B \subset C$. Indeed, the portmanteau theorem states that $\limsup_n \mu_n(C) \leq \mu(C)$ and $\limsup_n \mu_n(B) \leq \mu(B)$ since $B$ and $C$ are closed. The conditions for $\tau$-tightness on $B \in \mathcal{F}$ such that $\mu(B) = \lim_n \mu_n(B)$ are clearly satisfied with $A := B$. The definition of $\tau$-tightness can be understood as follows. Whenever $(\mu_n)_{n \geq 1}$ does not catch all the $\mu$-mass of $B$ (i.e. $\liminf_n \mu_n(B) < \mu(B)$) but some set $C$ that contains $B$ has its $\mu$-mass well caught (i.e. $\mu(B) \leq \liminf_n \mu_n(C)$), then $\mathcal{F}$ must have an element $A$ that also has its $\mu$-mass well caught (i.e. $\mu(B) \leq \liminf_n \mu_n(A)$), is of finite size (i.e. $\tau(A) < \infty$) and lies between $B$ and a $\delta$-neighborhood of $B$ intersected with $C$, for small $\delta$.

The following proposition states that this notion is not empty, and includes the empirical measures.

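Proposition 19 ($\tau$-tightness of the empirical measure). Let $\mu$ be a probability measure on $E$ such that $\tau^\alpha < \infty$ for $0 < \alpha < 1$. Let $(X_i)_{i \geq 1}$ be a sequence of i.i.d. random variables with common law $\mu$. Set $\mu_n = \frac{1}{n} \sum_{1 \leq i \leq n} \delta_{X_i}$. Then, $(\mu_n)_{n \geq 1}$ is $\tau$-tight.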
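The empirical measure in the proposition above assigns to a set $B$ the fraction of sample points falling in $B$. The following sketch computes $\mu_n(B)$ for a set described by its indicator function; the helper names and the choice of a closed interval as the set $B$ are illustrative assumptions, not part of the article.

```python
import numpy as np


def empirical_mass(sample, indicator):
    """mu_n(B) = (1/n) * #{i : X_i in B}, where B is given by its indicator."""
    return float(np.mean([bool(indicator(x)) for x in sample]))


def in_closed_interval(a, b):
    """Indicator function of the closed interval [a, b] (illustrative set B)."""
    return lambda x: a <= x <= b


if __name__ == "__main__":
    # For i.i.d. uniform draws on [0, 1], mu_n([0, 0.3]) should approach 0.3.
    rng = np.random.default_rng(0)
    for n in (10, 1000, 100000):
        draws = rng.uniform(0.0, 1.0, size=n)
        print(n, empirical_mass(draws, in_closed_interval(0.0, 0.3)))
```

For a fixed closed set $B$, the law of large numbers gives $\mu_n(B) \to \mu(B)$ almost surely, which is the situation covered by the condition $\mu(B) \leq \lim_n \mu_n(B)$ used to characterize $\tau$-tight sequences.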

The empirical measure is actually not the only simple example of a $\tau$-tight sequence. The following corollary gives a simple condition for a sequence of random probability measures to be $\tau$-tight.

Corollary 20. Let $(\mu_n)_{n \geq 1}$ be a sequence of random probability measures on $E$ almost surely weakly converging to some measure $\mu$, such that $\tau^\alpha < \infty$ for any $0 < \alpha < 1$. If for all $B \in \mathcal{F}$, almost surely, $\mu(B) \leq \lim_n \mu_n(B)$, then $(\mu_n)_{n \geq 1}$ is $\tau$-tight.

This corollary says that $\tau$-tightness is implied by almost sure convergence of $\mu_n(B)$ for each $B \in \mathcal{F}$; dropping the "almost sure" would thus make $\tau$-tightness much more restrictive.

We can now state our first result on consistency.

3.2 τ-consistency

Our goal is to show that when $\mu_n$ converges to $\mu$, the size $\tau_n^\alpha$ of the smallest element of a given class $\mathcal{F}$ with $\mu_n$-mass at least $1 - \alpha$ converges to the size $\tau^\alpha$ of the smallest element with $\mu$-mass at least $1 - \alpha$.

In other words, we want to prove consistency of the smallest size $\tau^\alpha$. The following theorem states conditions for this consistency to hold.

Theorem 21 (Consistency). Let $(E, d)$ be a Polish space, $\mathcal{F}$ a stable class and $(\mu_n)_{n \geq 1}$ a $\tau$-tight sequence of random probability measures on $(E, \mathcal{B}(E))$ almost surely weakly converging to some measure $\mu$. Set $0 < \alpha < 1$. Choose any $B_n^\alpha \in \arg\min\{\tau(A);\ A \in \mathcal{F},\ \mu_n(A) \geq 1 - \alpha\}$ and set $\mu_n^\alpha = \mu_n(\,\cdot \mid B_n^\alpha)$, for all $n \geq 1$. Then, under hypotheses (H1), (H2) and (H3), the sequence $(\mu_n^\alpha)_{n \in \mathbb{N}}$ is almost surely totally bounded for the weak convergence topology, and $B^\alpha := \lim_k B^\alpha_{n_k}$ along any converging subsequence $(\mu_{n_k})_{k \geq 1}$ of $(\mu_n)_{n \geq 1}$ satisfies $\mu(B^\alpha) \geq 1 - \alpha$ and, almost surely,

$$\tau^\alpha \leq \tau(B^\alpha) \leq \liminf_{n \to \infty} \tau(B_n^\alpha) \leq \limsup_{n \to \infty} \tau(B_n^\alpha) \leq \lim_{\varepsilon \to 0^+} \tau^{\alpha - \varepsilon}.$$

Moreover, if $\mu$ is $\mathcal{F}$-regular and its support is connected, the five terms above are equal.

Note that the $\tau$-tightness condition is required only for the last inequality.

It is rather clear that if $\alpha \mapsto \tau^\alpha$ is not continuous for the measure $\mu$, we can hardly expect consistency of the smallest size $\tau^\alpha$. This first step of consistency brings us to consider consistency of the smallest set of the class itself.

3.3 Minimizer consistency

The smallest set in $\mathcal{F}$ with $\mu$-mass greater than $1 - \alpha$ is not always unique, and therefore consistency does not just mean that the minimizer for $\mu_n$ converges to the minimizer for $\mu$. In order to give a meaning to consistency, we will consider the set of all minimizers and the Hausdorff contrast between sets of elements of $\mathcal{F}$ (for some underlying metric $d_{\mathcal{F}}$ on $\mathcal{F}$). We thus first recall the definition of the Hausdorff contrast. Let $A$ and $B$ be two sets. The Hausdorff contrast between $A$ and $B$ is defined by

$$\mathrm{Haus}(A|B) := \inf\{\varepsilon > 0 \mid A \subset B^\varepsilon\}.$$

Let us denote, for $0 < \alpha < 1$, a sequence of measures $(\mu_n)_{n \geq 1}$ and a measure $\mu$,

$$S_n^\alpha = \arg\min\{\tau(A);\ A \in \mathcal{F},\ \mu_n(A) \geq 1 - \alpha\},$$

and

$$S^\alpha = \arg\min\{\tau(A);\ A \in \mathcal{F},\ \mu(A) \geq 1 - \alpha\}.$$

The sets $S^\alpha$ and $S_n^\alpha$ are thus two subsets of $\mathcal{F}$. What we want is to find conditions under which

$$\mathrm{Haus}(S_n^\alpha | S^\alpha) \to 0,$$

when $n$ tends to infinity.

We now state and briefly comment on the two hypotheses that will be made for our main result.

(H4) $\cdot \mapsto \tau^{\cdot}$ is continuous at $\alpha$,

(H5) for all $(A_n)_{n \geq 1} \subset \mathcal{F}$ such that $\tau(\lim_n A_n) < \infty$,

$$\lim \tau(A_n) = \tau(\lim A_n) \implies d_{\mathcal{F}}(A_n, \lim_k A_k) \to 0,$$

where $d_{\mathcal{F}}$ denotes a metric on $\mathcal{F}$. A typical example of such a metric is the Hausdorff metric or the measure of the symmetric difference. Section 4.2 is devoted to these examples and to conditions that imply (H5).

The continuity condition (H4) is a consequence of proposition 16: a connected support for an $\mathcal{F}$-regular measure suffices.

We can now state a direct consequence of theorem 21 and hypotheses (H4) and (H5).
