• Aucun résultat trouvé

The Normalized Graph Cut and Cheeger Constant: from Discrete to Continuous

N/A
N/A
Protected

Academic year: 2021

Partager "The Normalized Graph Cut and Cheeger Constant: from Discrete to Continuous"

Copied!
32
0
0

Texte intégral

(1)

HAL Id: hal-00473264

https://hal.archives-ouvertes.fr/hal-00473264v3

Submitted on 9 Jun 2011

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires

The Normalized Graph Cut and Cheeger Constant:

from Discrete to Continuous

Ery Arias-Castro, Bruno Pelletier, Pierre Pudlo

To cite this version:

Ery Arias-Castro, Bruno Pelletier, Pierre Pudlo. The Normalized Graph Cut and Cheeger Constant: from Discrete to Continuous. Advances in Applied Probability, Applied Probability Trust, 2012, 44 (4), pp.907-937. �10.1239/aap/1354716583�. �hal-00473264v3�

(2)

The Normalized Graph Cut and Cheeger Constant:

from Discrete to Continuous

Ery Arias-Castro

, Bruno Pelletier

and Pierre Pudlo

June 9, 2011

Abstract. Let M be a bounded domain of Rd with smooth boundary. We relate the Cheeger

con-stant of M and the conductance of a neighborhood graph defined on a random sample from M. By restricting the minimization defining the latter over a particular class of subsets, we obtain consis-tency (after normalization) as the sample size increases, and show that any minimizing sequence of subsets has a subsequence converging to a Cheeger set of M.

Index Terms: Cheeger isoperimetric constant of a manifold, conductance of a graph, neighborhood

graph, spectral clustering, U-processes, empirical processes.

AMS 2000 Classification: 62G05, 62G20.

1

Introduction and main results

The Cheeger isoperimetric constant may be defined for a Euclidean domain as well as for a graph. In either case it quantifies how well the set can be bisected or ‘cut’ into two pieces that are as little connected as possible. Motivated by recent developments in spectral clustering and computational geometry, we relate the Cheeger constant of a neighborhood graph defined on a sample from a domain and the Cheeger constant of the domain itself.

Given a graph G with weights {δi j}, the normalized cut of a subset S ⊂ G is defined as

h(S ; G) = σ(S )

min{δ(S ), δ(Sc)}, (1.1)

where Sc denotes the complement of S in G, and

δ(S ) =X i∈S X j,i δi j, σ(S ) = X i∈S X j∈Sc δi j, (1.2)

Department of Mathematics, University of California, San Diego, USA

D´epartement de Math´ematiques, IRMAR – UMR CNRS 6625, Universit´e Rennes II, FranceD´epartement de Math´ematiques, I3M – UMR CNRS 5149, Universit´e Montpellier II, France

(3)

are the discrete volume and perimeter of S . The Cheeger constant or conductance of the graph G is defined as the value of the optimal normalized cut over all non-empty subsets of G, i.e.

H(G) = min{h(S ; G) : S ⊂ G, S , ∅}. (1.3)

A corresponding quantity can be defined for a domain of a Euclidean space. Let M be a bounded domain (i.e. open, connected subset) of Rd with smooth boundary ∂M. For an integer 1 ≤ k ≤ d, let Volk denote the k-dimensional volume (Hausdorff measure) in Rd. For an open subset A ⊂ Rd,

define its normalized cut with respect to M by

h(A; M) = Vold−1(∂A ∩ M)

min{Vold(A ∩ M), Vold(Ac ∩ M)}

,

where Ac denotes the complement of A in Rd and with the convention that 0/0 = ∞. The Cheeger

(isoperimetric) constant of M is defined as

H(M) = inf{h(A; M) : A ⊂ M}.

Equivalently, the infimum may be restricted to all open subsets A of M such that ∂A∩ M is a smooth submanifold of co-dimension 1. This quantity was introduced by Cheeger [15] in order to bound the eigengap of the spectrum of the Laplacian on a manifold. A Cheeger set is a subset A ⊂ M such that h(A; M) = H(M); there is always a Cheeger set and it is unique under some conditions on the domain M [12]. For A ⊂ M, we call ∂A ∩ M its relative boundary.

1.1

Consistency of the normalized cut

Suppose that we observe an i.i.d. random sample Xn = (X1, . . . , Xn) from the uniform distribution

µ on M. For r > 0, let Gn,r be the graph with nodes the sample points and edge weights δi j =

1{kXi − Xjk ≤ r}, which is an instance of a random geometric graph [33]. Let ωd denote the

d-volume of the unit d-dimensional ball, and define

γd =

Z

Rd

max hu, zi, 0 1{kzk ≤ 1} dz, (1.4)

where u is any unit-norm vector of Rd. Actually γ

d is the average volume of a spherical cap when

the height is chosen uniformly at random. We establish the pointwise consistency of the normalized cut, which yields an asymptotic upper bound on the Cheeger constant of the neighborhood graph based on the Cheeger constant of the manifold. This is the first result we know of that relates these two quantities.

Theorem 1. Let A be a fixed subset of M with smooth relative boundary. Fix a sequence rn → 0

with nrnd+1/ log n → +∞, and let Sn= A ∩ Gn,rn. Then with probability one

ωd γdrn h(Sn; Gn,rn) → h(A; M), and, consequently, lim sup n→∞ ωd γdrn H(Gn,rn) ≤ H(M).

(4)

We do not know whether the Cheeger constant of the neighborhood graph, for an appropriate choice of the connectivity radius and properly normalized, converges to the Cheeger constant of the domain.

1.2

Consistent estimation of the Cheeger constant and Cheeger sets

We obtain a consistent estimator of the Cheeger constant H(M) by restricting the minimization defining the conductance of the neighborhood graph (1.3) to subsets associated with subsets of Rd with controlled reach. The reach of a subset S ⊂ Rd [20], denoted reach(S ), is the supremum over

η > 0 such that, for each x within distance η of S , there is a unique point in S that is closest to

x. We assume here that M ⊂ (0, 1)d. When this is not known and/or not the case, we may always

infer a hypercube that contains M—by taking a hypercube containing all the data points, with some lee-way so that the hypercube contains M with high probability when the sample gets large—and then rescale and translate the points so that M is within the unit hypercube. So this assumption is really without loss of generality.

Theorem 2. Assume that M ⊂ (0, 1)d and that r

n → 0 such that nr2d+1n → ∞. Let ρn → 0 slowly

so that rn = o(ραn) and nr2d+1n ραn → ∞ for all α > 0. Let Rn be a class of open subsets R ⊂ (0, 1)d

such that reach(∂R) ≥ ρn. Define the functional h

nover Rnby

hn(R) = ωd

γdrn

h R ∩ Xn; Gn,rn

if both R and Rccontain a ball of radius ρncentered at a sample point, and h

n(R) = ∞ otherwise.

(i) With probability one,

min

R∈Rnh

n(R) → H(M), n → ∞.

(ii) Let {Rn} be a sequence satisfying

Rn ∈ Rn, hn(Rn) = min{hn(R) : R ∈ Rn}. (1.5)

Then with probability one, {Rn ∩ M} admits a subsequence converging in the L1-metric.

Moreover, any subsequence of {Rn∩ M} converging in the L1-metric converges to a Cheeger

set of M.

Note that the infimum defining Rn in (1.5) is attained in Rn since the function h

n takes only a

finite number of values.

Part (ii) of Theorem 2 hints at a consistent estimate of a Cheeger set of M, but Rn∩ M depends

on M, which is unknown. On the other hand reconstructing an unknown set from a random sam-ple of it is an independent problem for which there exists multisam-ple techniques and an important literature—see e.g., [6] and the references therein. In the following result we construct a random discrete measure which does not require the knowledge of M, and prove that, seen as a sequence of random measures indexed by the sample size n, any accumulation point is the uniform measure on a Cheeger set of M.

(5)

Theorem 3. Let {Rn} be a sequence as in Theorem 2-(ii), {Rnk} a subsequence of {Rn} with Rnk

M → Ain L1. Define the random discrete measure Qn = 1nPni=11Rn(XiXi and the measure

Q = 1A(.)µ. Then, that Qnconverges weakly to Q is an event which holds with probability one.

As an example of an estimate of a Cheeger set of M, one can consider a union of balls of radius

κn centered at the observations falling in Rn. Under appropriate conditions, it is known that this

estimate converges in L1; see [6].

Let us mention that with our result, only the “regular” part of a Cheeger set can be recon-structed. Indeed, in dimension d ≥ 8, the boundary of a Cheeger set is not necessarily regular and may contain parts of codimension greater than 1.

1.3

Connections to the literature

Our results relating the respective Cheeger constants of a domain and of a neighborhood graph defined from a sample from the domain are the first of their kind, as far as we know. The con-nections to the literature stem from the concept of normalized cut taking a central place in graph partitioning and related methods in clustering; from a recent trend in computational geometry (and topology) aiming at estimating geometrical (and topological) attributes of a set based on a sample; and from the fact that we can use the conductance to bound the mixing time of a random walk on the neighborhood graph.

Clustering. In spectral graph partitioning, the goal is to partition a graph G into subgraphs

based on the eigenvalues and eigenvectors of the Laplacian [36, 16]. It arises as a convex relaxation of the combinatorial search of finding an optimal bisection in terms of the normalized cut. Given a set of points X1, . . . , Xn and a dissimilarity measure (or kernel) φ, spectral clustering applies

spectral graph partitioning to the graph with nodes the data points and edge weight δi j = φ(Xi, Xj)

between Xi and Xj [37]. For instance, if the points are embedded in a Euclidean space, the kernel

φ is often of the form φ(x, y) = ψ(kx − yk/σ), where σ is a tuning parameter, and ψ is, e.g., the

Gaussian kernel ψ(t) = exp(−t2) or the simple kernel ψ(t) = 1

[0,1](t) [30, 3]. The consistency of

spectral methods has been analyzed in this context [38, 32, 4, 21, 35]. In particular, [28] proves a result similar to our Theorem 1 in that context.

About cuts, [27] also proves a result similar to our Theorem 1 when the separating surface

∂A is an affine hyperplane. Closer to our Theorem 2, [29] establishes rates for learning a cut for

classification purposes—so the setting there is that of supervised learning, with each sample point

Xi associated with a class label Yi.

Computational geometry (and topology). The Cheeger constant H(M), and Cheeger sets,

are bona fide geometric characteristics of the domain M that we might want to estimate, follow-ing a fast developfollow-ing line of research around the estimation of some geometric and topological characteristics of sets from a sample, e.g., the number of connected components [5], the intrinsic dimensionality [26] and, more generally, the homology [31, 10, 11, 41, 14, 34, 13]; the Minkowski content [17], as well as the perimeter and area (volume) [8].

Random walks. Random geometric graphs are gaining popularity as models for real-life

net-works. Some protocols for passing information between nodes amounts to performing a random walk and it is important to bound the time it takes for information to spread to the whole network;

(6)

see [2] and references therein. It is well-known that, given a graph G, a lower bound on H(G) may be used to bound the mixing time the random walk on G. This is the path taken in [7, 2] when

M is the unit hypercube and the graph is Grn,n. However, in both papers the authors reduce the

setting to that of a regular grid without rigorous justification, leaving the problem unresolved (in our opinion) even in this particular case.

1.4

Discussion

As we saw, there are only a handful of other papers relating cuts in neighborhood graphs and cuts in the corresponding domain from which the points making the neighborhood graph where sampled from. Our paper is the first one we know of that establishes a relationship between the Cheeger constant (optimal normalized cut) on the neighborhood graph and the Cheeger constant of the domain, and the first one to propose a method that is consistent for the estimation of the latter based on a restricted normalized cut, and also consistent for the estimation of Cheeger sets. Our results generalize with varying amount of effort to other related settings. However, we leave important questions behind.

Generalizations. With some additional work, our results and methodology extend to settings

where the kernel (here the simple kernel) is fast decaying and where the data points are sampled from a probability distribution on M that has a non-vanishing density with respect to the uniform distribution. It would also be interesting to consider the setting where M is a d-dimensional smooth submanifold embedded in some Euclidean ambient space. Our arguments seem to carry through using a set of charts for the manifold M, as is done in [9, Lem. 3.4].

Refinements. Though we focused on sufficient conditions for rnto enable a consistent

estima-tion of the Cheeger constant of the domain, it may also be of interest to find necessary condiestima-tions. Partial work suggests that nrd

n → ∞ is necessary, and may be sufficient the divergence to infinity

is faster than a sufficiently large power of log n. The arguments in support of this, however, are substantially different than those we use in the paper, which hinge on Hoeffding’s inequality for

U-statistics.

An open problem. Whether the normalized Cheeger constants of some sequence of

neigh-borhood graphs converges to the Cheeger constant of the domain is an intriguing question. To paraphrase the question we leave open, is there a sequence {rn} such that, with probability one,

lim

n→∞

ωd

γdrn

H(Gn,rn) = H(M)?

A positive answer would establish the consistency of the normalized cut criterion for graph parti-tioning. Also, a lower bound on H(Gn,rn) would provide a lower bound on the eigengap between

the first and second eigenvalue of the Laplacian, which in turn may be used to bound the mixing time of the random walk on Gn,rn, as done in [7, 2] when M is the unit hypercube.

Consistent estimation in polynomial time. Our estimation procedures, though theoretically

valid and consistent, are not practical. It would be interesting to know whether there is a consistent estimator for the Cheeger constant that can be implemented in polynomial-time. Note that com-puting the Cheeger constant of a graph is NP-hard (which motivates the use of spectral methods),

(7)

and even the best polynomial-time approximations we are aware of are not precise enough to allow for consistency [1].

1.5

Content

The rest of the paper is devoted to the proofs of the three theorems. In Section 2, we establish the convergence of the discrete volume and perimeter to their continuous counterparts of a fixed subset of M with smooth relative boundary, using Hoeffding’s inequality for U-statistics [24]. Then, by the lower semi-continuity of the map A 7→ h(A; M), we deduce the supremum-limit bound of Theorem 1. In Section 3, we prove Theorems 2 and 3 by utilizing results on empirical

U-processes [18] on the one hand, and compactness properties of the L1-metric [23] on the other

hand.

1.6

Notation and background

The uniform measure on M is denoted µ, so that µ(A) = Vold(A ∩ M)/ Vold(M); and the normalized

perimeter is denoted ν(A) = Vold−1(∂A ∩ M)/ Vold(M). Let τM = Vold(M), and define the discrete

volume and perimeters as

µn(A) = τM ωdn(n − 1)rnd δ(A ∩ Xn; Gn,rn), νn(A) = τM γdn(n − 1)rd+1n σ(A ∩ Xn; Gn,rn), (1.6)

where δ, σ are given in (1.2), Xnis the sample, and Gn,rn the neighborhood graph. Also, define the

discrete ratio

hn(A) =

νn(A)

min(µn(A), µn(Ac))

,

and note that

hn(A) =

ωd

γdrn

h(A ∩ Xn; Gn,rn),

where h is given in (1.1). For further reference, we define the volume πd(η) of a spherical cap at

height η by

πd(η) = Vold x : kxk ≤ 1 and hu, xi ≥ η ,

where u is any unit-norm vector of Rd. Note that the constant γddefined in (1.4) may be expressed

as

γd =

Z 1

0

πd(η)dη.

The reach coincides with the condition number introduced in [31] for submanifolds without boundary, and the property reach(∂A) ≥ r is equivalent to A and Ac being both r-convex [39], in

the sense that a ball of radius r rolls freely inside A and Ac. (We say that a ball of radius r rolls freely in A if, for all p ∈ ∂A, there is x ∈ A such that p ∈ ∂B(x, r) and B(x, r) ⊂ A.) It is well-known that the reach bounds the radius of curvature from below [20, Thm. 4.18]. In particular, if reach(∂A) > 0, then ∂A is a smooth submanifold (possibly with boundary).

In the rest of the paper, the generic constant C may vary from line to line, except when stated explicitly otherwise.

(8)

2

Proof of Theorem 1: Consistency of the normalized cut

For a subset A of M and a real number r > 0, define the symmetric kernel

φA,r(x, y) = 1 2 n 1A(x) + 1A(y) o 1{kx − yk ≤ r}, (2.1)

so that µn(A) may be expressed as the following U-statistic:

µn(A) = τM ωdn(n − 1)rnd X i, j φA,rn(Xi, Xj).

Similarly, νn(A) may be written as

νn(A) = τM γdn(n − 1)rnd+1 X i, j ¯ φA,rn(Xi, Xj).

with the symmetric kernel ¯ φA,r(x, y) = 1 2 n 1A(x)1Ac(y) + 1A(y)1Ac(x) o 1{kx − yk ≤ r}. (2.2)

We shall need the following Hoeffding’s Inequality for U-statistics [24], which is a special case of [18, Thm. 4.1.8].

Theorem 4. Let φ be a measurable, bounded kernel on Rd×Rdand let {X

k : k ∈ N} be i.i.d. random

vectors in Rd. Assume that Eφ(X

1, X2) = 0 and that b := kφk∞< ∞, and let σ2 = Var(φ(X1, X2)). Then, for all t > 0,

P         1 n(n − 1) X i, j φ(Xi, Xj) ≥ t         ≤ exp − nt 2 5σ2+ 3bt ! .

To prove Theorem 1, we establish the almost-sure convergence of µn(A) to µ(A) and of νn(A) to

ν(A) for a subset A ⊂ M with smooth relative boundary. To this aim, we combine upper bounds on

bias terms together with exponential inequalities for U-statistics. The bias terms involve volume bounds which we present next, and integrations over some neighborhoods of the boundary of a regular set, namely tubular neighborhoods or simply tubes, which comes after that.

2.1

Volume bounds

For any r > 0, define

Mr ={x ∈ M : dist(x, ∂M) ≥ r}. (2.3)

The following two lemmas provide bounds on the volume of the intersection of balls with some subsets of M.

(9)

Lemma 5. Let R be a bounded open subset of Rdwith reach(∂R) = ρ > 0. Set A = R ∩ M. For any

r < min{reach(∂M); ρ}, any 0 ≤ η ≤ 1, and all p in ∂A ∩ Mr, we have

Vold  B(p + ηrep, r) ∩ Ac  − πd(η)rd ≤ 2ωd−1r d+1/ρ,

where epdenotes the unit normal vector at p pointing inward A.

Proof. For ease of notation, set B = B(p + ηrep, r). Let (˜e1, . . . , ˜ed) be an orthonormal frame at p,

with ˜ed = ep. Denote by ˜x1, . . . , ˜xd the local coordinates in this frame, such that p has coordinates

0. Then ∂A ∩ M can be expressed locally as the set of points ˜x such that ˜xd = F( ˜x1, . . . , ˜xd−1) for some function F, and, if we set ˜x(d) = ( ˜x1, . . . , ˜xd−1), then

Vold(B ∩ Ac) = Z B 1{ ˜xd < F( ˜x(d))}d ˜x = Z B h 1{ ˜xd < F( ˜x(d))}1{ ˜xd < 0} + 1{ ˜xd < F( ˜x(d))}1{ ˜xd > 0}id ˜x Since πd(η)rd = Z B 1{ ˜xd < 0}d ˜x it follows that vold(Bn∩ A c) − π d(η)rnd ≤ Z B h 1{ ˜xd > F( ˜x(d))}1{ ˜xd < 0} + 1{ ˜xd < F( ˜x(d))}1{ ˜xd > 0}id ˜x ≤ Z Bn 1n| ˜xd| ≤ |F( ˜x(d))|od ˜x ≤ 2 Z {k ˜x(d)k≤r} |F( ˜x(d))|d ˜x(d).

Expanding F at 0, we have, for all ˜x with k ˜xk ≤ r,

F( ˜x(d)) =

d−1

X

i, j=1

Gi j(ξ) ˜xi˜xj,

for some ξ := ξ( ˜x(d)). Since the reach bounds the principal curvatures by 1/ρ [20], we have

supp∈∂A∩MrkG(p)k ≤ 1/ρ. Then, using the change of variable u = r ˜x, we deduce that vold  B(p + ηrep, r) ∩ Ac  − πd(η)rdn ≤ 2ωd−1 sup p∈∂A∩M kG(p)krd+1 ≤ 2ωd−1rd+1/ρ. 

Lemma 6. There exists some constant C > 0 such that, for all r, α satisfying 0 < 2r ≤ α ≤

reach(∂M), and all x in M,

Vold(B(x, α) ∩ Mr) ≥ Cαd.

Proof. The main argument is to include a ball of radius α/4 into B(x, α) ∩ Mr. We can proceed

the following way. First, because ρ := reach(∂M) > 0, for any x ∈ M there is y ∈ M such that

x ∈ B(y, ρ) ⊂ M. Second, since dist(y, ∂M) ≥ ρ and ρ ≥ 2r, we have y ∈ Mr and B(y, ρ − r) ⊂ Mr.

Hence

B(x, α) ∩ B(y, ρ − r) ⊂ B(x, α) ∩ Mr.

If y = x, the result is trival. Otherwise, let z := x + (r + α/4)(y − x)/ky − xk and note that B(z, α/4)

(10)

2.2

Integration over tubes

We introduce the notion of tubes and some of their properties; see [22] for an extensive treatment. Let S be a submanifold of Rd. The tubular neighborhood of radius r > 0 about S , denoted V(S , r),

is the set of points x in Rd for which there exists s ∈ S with kx − sk < r and such that the line

joining x and s is orthogonal to S at s. When S is without boundary, V(S , r) coincides with the set of points x in Rd at a distance no more than r from S . If S has boundary, then the tube coincides

with the set of points at distance no more than r, with the ends removed, corresponding to the points projecting onto ∂S . Assume S is of codimension 1, and oriented, and define epas the (unit)

normal vector of S at p ∈ S . When r < reach(S ), V(S , r) admits the following parameterization

V(S , r) = {x = p + tep: p ∈ S , −r ≤ t ≤ r}.

Denote by IIpthe second fundamental form of S at p ∈ S . The infinitesimal change of volume

function is defined on S × (−r; r) by ϑ(p, t) = det(I − tIIp); the dependence of ϑ on S is omitted.

Given an integrable function g on V(S , r), we have:

Z V(S ,r) g(x)dx = Z S Z r −r g(p, t)ϑ(p, t)dt vσ(dp),

where vσ is the Riemannian volume measure on S .

Lemma 7. Assume S is a submanifold of Rd of codimension 1, with ρ := reach(S ) > 0. Then, for

all r < ρ, sup p∈S sup −r≤t≤r ϑ(p, t) ≤ (1 + r/ρ)d−1, and sup p∈S sup −r≤t≤r |ϑ′(p, t)| ≤ (d − 1)(1 + r/ρ) d−1 ρ − r

where ϑis the derivative of ϑ with respect to t.

Proof. By [20, Thm. 4.18], the reach bounds the radius of curvature from below so that the

prin-cipal curvatures κ(1), . . . , κ(d−1) (the eigenvalues of the second fundamental form) are everywhere

bounded (in absolute value) from above by 1/ρ. Therefore, for r < ρ and −r ≤ t ≤ r, 0 ≤ ϑ(p, t) = det(I − tIIp) = d−1 Y i=1  1 − κ(i)p t≤ (1 + r/ρ)d−1.

For the derivative of ϑ, we have

ϑ′(p, t) ϑ(p, t) =− d−1 X i=1 κ(i)p 1 − κ(i)p t . Hence |ϑ′(p, t)| ≤ ϑ(p, t)(d − 1) 1/ρ 1 − r/ρ(d − 1)(1 + r/ρ)d−1 ρ − r . 

(11)

The celebrated Weyl’s tube formula [40] provides fine estimates for the volume of a tubular region around a smooth submanifold of Rd. We only require a rough upper bound of the right order of magnitude, which we state and prove here.

Lemma 8. For any bounded open subset R ⊂ Rd with reach(∂R) = ρ > 0 and any 0 < r < ρ,

Vold(V(∂R, r)) ≤ 2dVold−1(∂R) r.

In particular, Lemma 8 implies

µ [V(∂M, r)] ≤ Cr, ∀r < reach(∂M), (2.4) where C is a constant depending only on M.

Proof. Using the uniform bound of the infinitesimal change of volume given in Lemma 7, we have

Vold[V(∂R, r)] = Z ∂R Z r −r ϑ(p, u)du vσ(dp) ≤ Vold−1(∂R) 2r(1 + r/ρ)d−1 ≤ 2dVold−1(∂R) r. 

2.3

Bounds on bias terms

Recall the definition of Mrin (2.3).

Lemma 9. Let φA,r be defined as in (2.1). There exists a constant C, depending only on M, such

that, for any A ⊂ M and r < reach(∂M),

τM ωdrd E φA,r(X1, X2) − µ(A) ≤ µ(A ∩ Mc r).

Proof. Assume without loss of generality that τM = 1. We first note that

E φA,r(X1, X2) = E [1A(X1)1{kX1− X2k ≤ r}] .

We partition A into A ∩ Mr and A ∩ Mrc. By conditioning on X1, we have

E 1A∩Mr(X1)1{kX1− X2k ≤ r} =

ωdrdµ(A ∩ Mr) = ωdrdµ(A) − ωdrdµ(A ∩ Mcr);

Eh1A∩Mc

r(X1)1{kX1− X2k ≤ r}

i

≤ ωdrdµ(A ∩ Mrc).

Hence the result. 

Lemma 10. Let A = R ∩ M, where R is a bounded domain with smooth boundary and reach(∂R) =

ρ > 0. Let ¯φA,rbe defined as in (2.2).

(i) There exists a constant C, depending only on M, such that, for any A ⊂ M and r < min{ρ/2, reach(∂M)}, τM γdrd+1 E φ¯A,r(X1, X2) − ν(A) ≤ C Vold−1(∂R ∩ V(∂M, r)) + Vold−1(∂R ∩ M) r ρ ! .

(12)

(ii) There exists a constant C, depending only on M, such that, for any A ⊂ M and r < min{ρ/2, reach(∂M)},

τM

γdrd+1

E φ¯A,r(X1, X2) − Vold−1(∂A ∩ Mr)

Vold(M)

≥ −Cν(A)r

ρ. (2.5)

Proof. Assume without loss of generality that τM = 1. Let S denote ∂R ∩ M. Then

E φ¯A,r(X1, X2) = E [1A(X1)1Ac(X2)1 {kX1− X2k ≤ r}] = Z D Vold[B(x, r) ∩ Ac] µ(dx), where D = {x ∈ A : dist(x, ∂R) ≤ r} .

Since r < ρ, the projection on ∂R is well-defined on D, and any x in D can be written as x = p +tep,

for p ∈ ∂R, and with epthe unit normal vector of ∂R at p pointing inwards.

We partition D into D ∩ Mrand D ∩ Mrc. Denote by Srthe projection of D ∩ Mron S . We have

Z D∩Mr Vold[B(x, r) ∩ Ac] dx = Z Sr Z 0 −r Vold h B(p + tep, r) ∩ Ac i ϑ(p, t)dt vσ(dp) = r Z Sr Z 1 0 Vold h B(p − ηrep, r) ∩ Ac i ϑ(p, rη)dη vσ(dp). Therefore 1 rd+1 Z D∩Mr Vold[B(x, r) ∩ Ac] dx − γdν(A) (2.6) ≤ 1 rd Z Sr Z 1 0 Vold h B(p − ηrep, r) ∩ Ac i − πd(η)rd ϑ(p, rη)dη vσ(dp) + Z Sr Z 1 0 πd(η)ϑ(p, rη)dη vσ(dp) − γdν(A) .

Lemma 5 provides the inequality

Vold h B(p − ηrep, r) ∩ Ac i − πd(η)rd ≤ 2ωd−1r d+1/ρ, and the

first inequality of Lemma 7 states that supp∈Ssup−r≤t≤tϑ(p, t) ≤ (1 + r/ρ)d−1. Since r < ρ,

supp∈S sup0≤η≤1ϑ(p, ηr) ≤ 2d−1. Hence, the first term on the right-hand side is bounded by

d−1(r/ρ) Z Sr Z 1 0 ϑ(p, rη)dη vσ(dp) ≤ 2dωd−1(r/ρ) Vold−1(Sr).

To bound the second term, a Taylor expansion leads to the relation ϑ(p, rη) = 1 + ϑ(p, rξ η)rη

for some 0 < ξη < 1. The second inequality of Lemma 7 states that supp∈S sup−r≤t≤r|ϑ ′

(p, t)| ≤ (d − 1)(1 + r/ρ)d−1/(ρ − r) so that supp∈Ssup0≤η≤1|ϑ′

(13)

Recall that the constant γdis expressed as γd =

R1

0 πd(η)dη. Then the second term in the right-hand

side of (2.6) is bounded by Z Sr Z 1 0 πd(η)dη vσ(dp) − γdν(A) + r Z Sr Z 1 0 ηπd(η)|ϑ ′ (p, rξη)|dη vσ(dp) ≤ γd|Vold−1(Sr) − Vold−1(S )| + (d − 1)2dγd(r/ρ) Vold−1(Sr)

≤ γdVold−1(S ∩ Mrc) + (d − 1)2 dγ

d(r/ρ) Vold−1(Sr),

where we have used the fact that S \Sr ⊂ Mrcsince S ∩ Mr ⊂ Sr. Collecting terms, the term in (2.6)

is bounded by

γdVold−1(S ∩ Mrc) + C

r

ρ Vold−1(Sr),

for some constant C independent of M.

For the integral over D ∩ Mrc, since D is included in the intersection of tubes of radius r about

∂R and ∂M, i.e., D ⊂ V(∂R, r) ∩ V(∂M, r), we have

Z D∩Mc r Vold[B(x, r) ∩ Ac] dx ≤ Z ∂R∩V(∂M,r) Z 0 −r Vold h B(p + tep, r) ∩ Ac i ϑ(p, t)dt vσ(dp) = r Z ∂R∩V(∂M,r) Z 1 0 Vold h B(p − ηrep, r) ∩ Ac i ϑ(p, rη)dη vσ(dp) ≤ 2d−1ω drd+1Vold−1(∂R ∩ V(∂M, r)),

where we have used Lemma 7 again to bound |ϑ(p, rη)| by (1 + r/ρ)d−1≤ 2d−1in the last inequality.

Combining the two inequalities on the integrals over D ∩ Mrand D ∩ Mrc, we obtain that

1 γdrd+1 E φ¯A,r(X1, X2) − ν(A) ≤ Vold−1(S ∩ Mcr) + C r ρVold−1(Sr) + 2 d−1ω dVold−1(∂R ∩ V(∂M, r)) ≤ C Vold−1(∂R ∩ V(∂M, r)) + Vold−1(S ) r ρ ! ,

which proves the first bound stated in Lemma 10.

To prove (ii), using the bound on (2.6), we deduce that 1 γdrd+1 E φ¯A,r(X1, X2) 1 γdrd+1 Z D∩Mr Vold[B(x, r) ∩ Ac] dx ≥ Vold−1(S ) − " Vold−1(S ∩ Mrc) + C γd r ρVold−1(Sr) # ≥ Vold−1(S ∩ Mr) − C r ρ Vold−1(Sr),

(14)

2.4

Exponential inequalities

Proposition 11. Fix a sequence rn → 0. Let A ⊂ M be an arbitrary open subset of M. There exists

a constant C depending only on M such that, for any ε > 0, and all n large enough, we have

P n(A) − µ(A)| ≥ ε ≤ 2 exp nr

d nε2

C(1 + ε)

! .

In particular, if nrdn/ log n → ∞, then µn(A) converges almost surely to µ(A) when n → ∞.

Proof. By the triangle inequality, we have

n(A) − µ(A)| ≤ µn(A) − En(A)  + E µn(A) − µ(A) .

For all n large enough such that rn ≤ reach(∂M), the second term on the right-hand side (the bias

term) is bounded by Crn with C depending only on M. Indeed, Lemma 9 states that the bias is

lower than µ(A ∩ Mc

rn). And the tubular neighborhood of ∂M of radius rn, which contains A ∩ Mrnc,

has a volume bounded by Crnby (2.4).

Assume that n is large enough such that 2Crn ≤ ε. We then apply Theorem 4, which is

Hoeffding’s Inequality for U-statistics, to the first term (the deviation term) on the right-hand side with the kernel

φ := φA,rn − EφA,rn(X1, X2)

and t = ωdrdε/2. The kernel satisfies kφk∞≤ 1, and simple calculations yields

Var(φ(X1, X2)) ≤ E

h

φA,rn(X1, X2)2

i

≤ µ(A)ωdrndM ≤ ωdrdnM.

From this we obtain the large deviation bound. The almost sure convergence is then a simple

consequence of the Borel-Cantelli Lemma. 

Proposition 12. Fix a sequence rn → 0. Let A be an open subset of M with smooth relative

boundary and positive reach. There exists a constant C depending only on M such that, for any

ε > 0, and for all n large enough, we have

P[|νn(A) − ν(A)| ≥ ǫ] ≤ 2 exp nr

d+1 n ǫ2 C(ν(A) + ǫ) ! . In particular, if nrd+1 n / log n → ∞, then

νn(A) → ν(A), n → ∞, almost surely.

Proof. By the triangle inequality, we have

n(A) − ν(A)| ≤ |νn(A) − E [νn(A)]| + |E [νn(A)] − ν(A)| .

Using the control on the bias in Lemma 10-(i), the second term on the right-hand side goes to 0 as

n → ∞. Then for n large enough, we apply Hoeffding’s inequality of Theorem 4 to the first term

on the right-hand side with the kernel

(15)

and t := γdrd+1ν(A)ǫ/2. The kernel satisfies kφk∞ ≤ 1, hence

Var(φ(X1, X2)) ≤ Eh ¯φA,rn(X1, X2)2

i = E¯

φA,rn(X1, X2) ≤ 2γdν(A)rnd+1M,

where the last inequality follows from upper bound on the bias of Lemma 10-(i) for n large enough. From this we obtain the large deviation bound, and the almost sure convergence is a consequence

of the Borel-Cantelli Lemma. 

2.5

Proof of Theorem 1

The first statement of Theorem 1 is an immediate consequence of the exponential inequalities of Propositions 11 and 12.

To prove the second statement, under the conditions of Theorem 1, for any subset A with smooth relative boundary, with probability one limnhn(A) = h(A; M) while hn(A) ≥ γωdrnd H(Gn,rn),

so that lim supn ωd

γdrnH(Gn,rn) ≤ h(A; M). Then we obtain the upper bound of Theorem 1 by taking

the infimum over all such subsets A.

3

Proof of Theorems 2 and 3: consistent estimation

Consistent estimation in the context of Theorem 2 is possible because the class Rn is sufficiently

rich as to include sets that approach Cheeger sets of M and its complexity is controlled, so as to allow for a uniform convergence both in terms of discrete volume and discrete perimeter. This con-trol on the complexity of Rnwe exploit in building a covering for Rn, which is done in Section 3.1,

later used to obtain uniform versions of Propositions 11 and 12. Then Part (i) of Theorem 2, which states the convergence of a penalized graph Cheeger constant towards the Cheeger constant of M, is proved in Section 3.7. Finally, Part (ii), which characterizes the accumulation points of a se-quence of minimizing sets, is proved in Section 3.8. The convergence of the discrete measures associated with a sequence of minimizing sets (Theorem 3) is proved in Section 3.9.

3.1

Covering numbers

For ρ > 0, let Rρbe the class of open subsets R ⊂ (0, 1)d with reach(∂R) ≥ ρ. Let dH(R, R′) be the

Hausdorff distance between two sets R and R′, i.e.,

dH(R, R

) = inf {r > 0 : R ⊂ R⊕ B(r) and R⊂ R ⊕ B(r)} .

Denote by Nε, Rρ, dH



be the covering number of Rρfor the Hausdorff distance, i.e., the minimal

number of balls of radius ε for the Hausdorff distance, centered at elements in Rρ that are needed

to cover Rρ.

Lemma 13. (i) There exists a constant C depending only on d such that, for any ε > 0 and any

ρ > 0: log N (ε, Rρ, dH) ≤ C 1 ε !d .

(16)

(ii) If 0 < ε < ρ, then for any R and Rin Rρ, if dH(R, R) ≤ ε, then R∆R⊂ V(∂R, ε) ∩ V(∂R, ε).

Proof. Let x1, . . . , xn be an ε-packing of (0, 1)d, so ∪ni=1B(xi, ε) covers (0, 1)d and n ≤ Cε−d for

some constant C depending only on d. For any set R in Rρ, define

Iε(R) = {i = 1, . . . , n : B(xi, ε) ∩ R , ∅} .

Then clearly, by definition of the covering, R ⊂ ∪i∈Iε(R)B(xi, ε), and

i∈Iε(R)B(xi, ε) ⊂ R ⊕ B(2ε).

Therefore

dHi∈Iε(R)B(xi, ε), R ≤ 2ε.

Since when R ranges in Rρ, the cardinality of sets of the form ∪i∈Iε(R)B(xi, ε) is bounded by 2

n, then

the collection of Hausdorff balls of radius 2ε and centered set of the form ∪i∈IB(xi, ε), where I is

any subset of {1, . . . , n}, covers Rρ. By doubling the radius of the balls, we can take centers in Rρ,

which proves the first part of the lemma.

The second part follows from the fact that if reach(∂R) > ρ, then ∂R ⊕ B(ρ) = V(∂R, ρ),

assuming, without loss of generality, that ∂R has no boundary. 

We mention that the bound on the ε-entropy of Rρ is rather weak. Standard results by

Kol-mogorov and Tikhomirov [25] suggest a bound of the form C(ρε)−(d−1)/2. Such a result would change the exponent for rnin Theorem 2 to (3d + 1)/2.

3.2

Perimeter bounds of a regular set

The classical isoperimetric inequality provides a bound of the volume of a Borel set R in terms of its perimeter (see e.g., Evans and Gariepy, 1992):

dω1/dd Vold(R)1−1/d ≤ Vold−1(∂R). (3.1)

But, in the case where ∂R has positive reach, the perimeter may in turn be bounded by the volume, as stated in Lemma 14 below. The proof uses the following inequality: for every Borel sets R, S

Vold−1 ∂(R ∪ S ) + Vold−1 ∂(R ∩ S ) ≤ Vold−1(∂R) + Vold−1(∂S ). (3.2)

Lemma 14. Let R be a bounded open subset of Rd with reach(∂R) = ρ > 0. Then,

Vold−1(∂R) ≤ d Vold(R)/ρ.

Proof. Since reach(∂R) = ρ > 0, a ball of radius ρ rolls freely in R. Consequently R can be written

as a countable union of balls of radius ρ, i.e.,

R =

∞ [

i=1

(17)

Set Rn =∪ni=1Bi where Bi= B(xi, ρ).

Using the decomposition Rn+1 = Rn∪ Bn+1, on the one hand we have

Vold(Rn+1) = Vold(Rn∪ Bn+1) = Vold(Rn) + ωdρd− Vold(Rn∩ Bn+1),

and on the other hand, using inequality (3.2), we have

Vold−1(∂Rn+1) = Vold−1(∂(Rn∪ Bn+1)) ≤ Vold−1(∂Rn) + dωdρd−1− Vold−1(∂(Rn∩ Bn+1)).

Consequently Vold−1(∂Rn+1) − d ρVold(Rn+1) ≤ Vold−1(∂Rn) − d ρVold(Rn) +" d ρVold(Rn∩ Bn+1) − Vold−1(∂(Rn∩ Bn+1)) # .

But, using the isoperimetric inequality (3.1), we may write

d ρVold(Rn∩ Bn+1) − Vold−1 ∂(Rn∩ Bn+1)  ≤ d ρ Vold(Rn∩ Bn+1) − dω 1/d d  Vold(Rn∩ Bn+1) 1−1/d ≤  Vold(Rn∩ Bn+1) 1−1/dd ρVold(Rn∩ Bn+1) 1/d − dω1/d d  ≤ 0

since, in the last bracket, Vold(Rn∩ Bn+1) ≤ Vold(Bn+1) = ωdρd. Therefore, for all n ≥ 1, we have

Vold−1(∂Rn+1) −

d

ρVold(Rn+1) ≤ Vold−1(∂Rn) −

d

ρVold(Rn).

But since R1is a ball of radius ρ, we have Vold−1(∂R1) − d Vold(R1)/ρ = 0 and so

Vold−1(∂Rn) −

d

ρVold(Rn) ≤ 0 for all n ≥ 1.

Since Rn converges to R in L1, it follows from the lower semi-continuity of the perimeter, see

e.g. [23, Prop. 2.3.6], that lim infnVold−1(∂Rn) ≥ Vold−1(∂R). This concludes the proof. 

3.3

Exponential inequalities

We prove the uniform versions of Propositions 11 and 12 for the class Rρ.

Proposition 15. There exists a constant C depending only on M such that, for any ε, r > 0 and all

n satisfying nrdρdεd+2 > C and ε > Cr, we have

P      sup R∈Rρ |µn(R) − µ(R)| ≥ ε      ≤ 2 exp − nrdε2 C(1 + ε) ! . (3.3)

(18)

Proof. The bias term is dealt exactly as in Proposition 11, obtaining E µn(R) − µ(R) ≤ C0r,

valid for all R ∈ Rρ, so assuming ε > 2C0r, we may focus on bounding the variance term µn(R) − En(R) .

Define the kernel class

F = {φR,r : R ∈ Rρ}, (3.4)

where φR,r is defined in (2.1). Let Un(φ) be the U-process over F defined by

Un(φ) = 1 n(n − 1) X i, j φ(Xi, Xj). Observe that sup R∈Rρ µn(R) − En(R)  = τM ωdrd sup φ∈F Un(φ) − µ ⊗2 (φ) .

Consider a minimal covering of Rρ of cardinal K by balls centered at elements R1, . . . , RK of Rρ,

and of radius η < ρ for the Hausdorff distance. By Lemma 13, log(K) ≤ C1(1/η)d.

For any R in Rρ, there exists 1 ≤ k ≤ K such that dH(R, Rk) ≤ η, which implies that R∆Rk

V(∂Rk, η). Also, by Lemma 8, there exists a constant C2 depending only on the dimension d such

that Vold(V(∂Rk, η)) ≤ C2η/ρ, for all 1 ≤ k ≤ K, which implies that

µ (V(∂Rk, η)) ≤ C3η/ρ, for all 1 ≤ k ≤ K,

since η < ρ, and where C3 now depends on M.

We have φR,r(x, y) − φRk,r(x, y) = 1 2 1R(x) + 1R(y) − 1Rk(x) − 1Rk(y) 1 {k x − yk ≤ r} ≤ 1 2 1R∆Rk(x) + 1R∆Rk(y) 1 {kx − yk ≤ r} . Next, consider the inequality

UnR,r) − µ ⊗2 (φR,r) ≤ UnR,r) − UnRk,r) + UnRk,r) − µ ⊗2 (φRk,r) + µ ⊗2 (φRk,r) − µ⊗2(φR,r) .

For the double expectations, we have,

µ ⊗2 (φRk,r) − µ⊗2(φR,r) ≤ µ ⊗2 φRk,r− φR,r = E1R∆Rk(X1)1 {kX1− X2k ≤ r} = Z R∆Rk µ (B(x, r)) µ(dx) ≤ Z V(∂Rk,η) µ (B(x, r)) µ(dx) ≤ ωdr d τM µ (V(∂Rk, η)) ≤ C4rdη/ρ,

(19)

with C4 still depending only on M. The last inequality is a consequence of Lemmas 8 and 14, and

the fact that Vold(Rk) ≤ 1 since Rk ⊂ (0, 1)d.

For the empirical averages, we have

UnR,r) − UnRk,r) ≤ 1 2 1 n(n − 1) X i, j  1R∆Rk(Xi) + 1R∆Rk(Xj)  1nkXi− Xjk ≤ r o ≤ 1 2 1 n(n − 1) X i, j  1V(∂Rk,η)(Xi) + 1V(∂Rk,η)(Xj)  1nkXi− Xjk ≤ r o = Un  φV(∂Rk,η)  . Therefore, sup R∈Rρ UnR,r) − µ ⊗2 R,r) ≤ max 1≤k≤KUn  φV(∂Rk,η)  + C4 rdη ρ + max1≤k≤K UnRk,r) − µ ⊗2 Rk,r) .

Consequently, for any ε > 0, we may write

P      sup R∈Rρ µn(R) − En(R)  ≥ ε       = P      sup φ∈F Un(φ) − µ ⊗2(φ) ≥ ωdrdε τM       ≤ P max 1≤k≤KUn  φV(∂Rk,η)  ≥ ωdr dεM − C4 rdη ρ ! + P max 1≤k≤K UnRk,r) − µ ⊗2 (φRk,r) ≥ ωdrdε 2τM ! ≤ K max 1≤k≤K P UnφV(∂R k,η)  ≥ ωdr dεM − C4 rdη ρ ! + K max 1≤k≤K P UnRk,r) − µ ⊗2 Rk,r) ≥ ωdrdε 2τM ! ,

by the union bound. To bound the first term, note first that VarφV(∂Rk,η)(X1, X2)  ≤ EhφV(∂Rk,η)(X1, X2) 2i ≤ EhφV(∂Rk,η)(X1, X2) i , with EhφV(∂R k,η)(X1, X2) i ≤ ωdr d τM µ (V(∂Rk, η)) ≤ C4 rdη ρ ,

for the same reasons as above. Now take η = ρ min(ωdε/(8CM), 1). Then, for any 1 ≤ k ≤ K, by

Hoedffding’s inequality for U-statistics (Theorem 4), we have,

P UnφV(∂R k,η)  ≥ ωdr dεM − C4 rdη ρ ! ≤ P Un  φV(∂Rk,η)  − EhUn  φV(∂Rk,η) i ≥ ωdr dεM ! ≤ exp − n(ωdr dε/4τ M)2 5(C4rdη/ρ) + 3(ω drdε/4τM) ! ≤ exp −nr dε C5 ! ,

(20)

for a constant C5 > 0 depending only on M. To bound the second term, since

Var φRk,r(X1, X2) ≤ E φRk,r(X1, X2) ≤ ωdrdM,

we may apply Lemma 4 again to obtain the bound

P UnRk,r) − µ ⊗2 (φRk,r) ≥ ωdrdε 2τM ! ≤ exp − n(ωdr dε/2τ M)2 5ωdrd+ 3(ωdrdε/2τM) ! ≤ exp − nr dε2 C6(1 + ε) ! ,

for a constant C6 > 0 depending only on M.

With the choice of η as above, the cardinal K of the covering is such that log(K) ≤ C7(ερ)−d,

for some constant C7depending only on M, and we obtain the bound

P      sup R∈Rρ µn(R) − En(R)  ≥ ε       ≤ K expnr dε C5 ! + K expnr dε2 C6(1 + ε) ! ≤ 2 exp C7(ερ)−dnrdε2 C8(1 + ε) ! ≤ 2 exp − nr dε2 C9(1 + ε) ! , if nrdεd+2ρd > C

9, for a constant C9depending only on M. 

For the perimeter, we only control the variance, as the bias may not be controlled uniformly over Rρ. Indeed, consider the case where M is a hypercube with rounded corners so as to satisfy

the condition on its reach, and let R be another hypercube with rounded corners included in M sharing one of its faces with M. Then given a sample X1, . . . , Xn, it is possible to translate R inside

M just enough that the translate does not share a boundary with M, while its discrete volume and

perimeter are left equal to those of R.

Proposition 16. There exists a constant C depending only on M such that, for any ε > 0, ρ < 1,

r < min(reach(M), ρ/2) and all n satisfying nr2d+1ρd+1εd+2> C, we have

P      sup R∈Rρ |νn(R) − E[νn(R)]| ≥ ε      ≤ 2 exp − nrd+1ρε2 C(1 + ρε) ! .

Proof. The proof follows that of Proposition 15, with the symmetric kernel ¯φR,r defined in (2.2)

and the class ¯F defined in (3.4) with φR,rreplaced by ¯φR,r. Observe that

n(R) − E [νn(R)]| = τM γdrd+1 sup φ∈ ¯F Un(φ) − µ ⊗2 (φ) .

(21)

As in the proof of Proposition 15, we start with a minimal covering of Rρ of cardinal K by balls

of radius η for the Hausdorff distance. For any R in Rρat a Hausdorff distance no more than η of

an element Rkof the covering, we have

1R(x)1Rc(y) − 1Rk(x)1Rck(y) 1R(x) − 1Rk(x) 1Rc(y) + 1Rk(x) 1Rc(y) − 1Rck(y) = 1R∆Rk(x)1Rc(y) + 1R∆Rk(y)1Rk(x) ≤ 1R∆Rk(x) + 1R∆Rk(y). Hence, φ¯R,r(x, y) − ¯φRk,r(x, y) ≤ 2φR∆Rk,r(x, y) ≤ 2φV(∂Rk,η)(x, y),

and therefore, following the same arguments,

µ ⊗2( ¯φ R,r) − µ⊗2( ¯φRk,r) ≤ 2µ ⊗2 V(∂Rk,η)) ≤ C1r dη/ρ,

for a constant C1depending only on M; and also,

Un φ¯R,r − Un φ¯Rk,r  ≤ 2Un  φV(∂Rk,η)  . Hence P      sup R∈Rρ |νn(R) − E [νn(R)]| ≥ ε       ≤ K max 1≥k≥K P UnφV(∂R k,η)  ≥ γdr d+1εM − C1 rdη 2ρ ! + K max 1≥k≥K P Un( ¯φRk,r) − µ ⊗2 ( ¯φRk,r) ≥ γdrd+1ε 2τM ! .

Take η = ρ min γdrε/(4C1τM), 1. For the first term, for any 1 ≤ k ≤ K, we have,

P UnφV(∂R k,η)  ≥ γdr d+1εM − C1 rdη 2ρ ! ≤ P Un  φV(∂Rk,η)  − EhUn  φV(∂Rk,η) i ≥ γdr d+1εM ! ≤ exp − n(γdr d+1ε/8τ M)2 5(C1rdη/ρ) + 3(γdrd+1ε/8τM) ! = exp − nr d+1ε2 C2(1 + ε) ! ,

for some constant C2 > 0 depending only on M. For the second term, since by Lemma 10, when r ≤ ρ/2,

Var ¯φRk,r ≤ C3rd+1/ρ,

for a constant C3depending only on M, we have

P Un( ¯φRk,r) − µ ⊗2( ¯φ Rk,r) ≥ γdrd+1ε 2τM ! ≤ exp −ndr d+1ε/2τ M)2 5(C3rd+1/ρ) + 3(γ drd+1ε/2τM) ! ≤ exp − nr d+1ρε2 C4(1 + ρε) ! ,

(22)

for a constant C4 > 0 depending only on M. Finally, with the choice of η as above, the cardinal K

of the covering is such that log(K) ≤ C5(rρε)−d, for C5depending only on M.

Then K max 1≥k≥K P UnφV(∂R k,η)  ≥ γdr d+1εM − C1 rdη ρ ) ! ≤ exp − nr d+1ε2 C6(1 + ε) ! , if nr2d+1ρdεd+2 > C 6, and K max 1≥k≥K P Un( ¯φRk,r) − µ ⊗2( ¯φ Rk,r) ≥ γdrd+1ε 2τM ! ≤ exp − nr d+1ρε2 C7(1 + ρε) ! , if nr2d+1ρd+1εd+2 > C

7, where C6 and C7 depend on M only. Combining these inequalities, we

conclude. 

3.4

A uniform control on h

n

(A)

As we argued earlier, the boundary of M makes a uniform convergence of the perimeters of sets in

Rnimpossible. Our way around that is to compare the discrete perimeter of a set R with its

perime-ter inside Mrn, thus avoiding the boundary of M, i.e., Vold−1(∂R ∩ Mrn), leading to a comparison

between hn(R) and h(R; Mrn). We relate the latter to h(R; M) in Section 3.5.

Lemma 17. Under the conditions of Theorem 2, with probability one, we have:

lim inf

n→∞ R∈Rninf hn(R) − h(R; Mrn) ≥ 0. (3.5)

Proof. Take R ∈ Rnand define

λn(R) = min(µn(R), µn(Rc)), λ ∗ n(R) = 1 τM min(Vold(R ∩ Mrn), Vold(Rc ∩ Mrn)), as well as ν∗n(R) = 1 τM Vold−1(∂R ∩ Mrn). Then hn(R) − h(R; Mrn) = 1 λn(R)n(R) − νn(R)) + ν∗ n(R) λn(R)λn(R) (λ∗n(R) − λn(R)) =: ζn(R) + ξn(R).

Define the event

n = ( 1 2 ≤ λn(R) λ∗ n(R) ≤ 3 2, ∀R ∈ Rn ) .

(23)

Bounding ζn(R). By definition of Rn, the sets R and Rc contain each a ball of radius ρn, and by

Lemma 6, the volume of the intersection of this ball with Mrn is bounded from below by Cdn, for

a constant C1depending only on M. Hence,

λ∗n(R) ≥ Cdn. (3.6)

Also, on Ωn, λn(R) ≥ λn(R)/2. These last two inequalities being valid for all R ∈ Rn, for ε > 0 we

have I1 := P "" inf R∈Rnζn(R) < −ε # ∩ Ωn # ≤ P " inf R∈Rn νn(R) − νn(R) < −C2ερdn # ≤ P " inf R∈Rn νn(R) − E[νn(R)] + infR∈Rn En(R)] − νn(R) < −C2ερdn # ,

for a constant C2 = C1/2 > 0. Using the bias bounds of Lemma 10 together with the perimeter

bound in Lemma 14(ii), we have inf R∈Rn En(R)] − νn(R) ≥ −C3 rn ρ2 n .

Hence, since rn = o(ραn) for any α > 0, for ε fixed and n large enough, we have by assumption, for

all n large enough,

I1 ≤ P " inf R∈Rnn(R) − E[νn(R)]) < −C2ερ d n/2 # ≤ P      sup R∈Rρnn(R) − E[νn(R)]| > C2ερdn/2      ,

where the second inequality comes from the fact that Rn⊂ Rρn. By the fact that nr

2d+1

n ραn → ∞ for

any α > 0, the conditions of Proposition 16 are satisfied, so that

I1 ≤ C4exp −

nrnd+1ρ2d+1n ε2

C4(1 + ε)

! ,

for some constant C4 > 0 and all n large enough. At last, we have

nrd+1 n ρ2d+1n log(n) = nr 2d+1 n ρ2d+1 n r−dn log(n) → +∞,

since rn = o(ραn) for any α > 0 and rn → 0 polynomially in n, we deduce that, for all ε > 0,

X n P "" inf R∈Rnζn(R) < −ε # ∩ Ωn # < ∞. (3.7)

Bounding ξn(R). (We reset the constants, except for C1.) By the perimeter bound of Lemma 14,

we have ν∗n(R) ≤ Vold−1(∂R) τM ≤ dVold(R) τMρn = C2/ρn,

(24)

for a constant C2 > 0 depending only on M. So, together with (3.6) and the fact that, on Ωn, λn(R) ≥ λn(R)/2, ν∗ n(R) λn(R)λn(R) ≤ C−2d−1n ,

for all R in Rn. It follows that

I2 := P "" inf R∈Rnξn(R) < −ε # ∩ Ωn # ≤ P sup R∈Rn λn(R) − λn(R) > ρ2d+1 n ε C3 ! . (3.8) Define µ∗n(R) = Vold(R ∩ Mrn) τM . Then λn(R) − λn(R) ≤ µn(R) − µn(R) + µn(R c ) − µ∗n(Rc) ≤ |µn(R) − µ(R)| + |µn(Rc) − µ(Rc)| + 2µ(Mrnc),

with µ(Mrnc) ≤ C4rnby (2.4). For ε > 0 fixed and n large enough, 2C4rn ≤ ρ2d+1n ε/C3, again by the

fact that ρn→ 0 sub-polynomially in rn. We therefore obtain that

I2 ≤ 2P      sup R∈Rρnn(R) − µ(R)| > ρ2d+1 n ε 4C3      ,

where we used the fact that Rc ∈ Rn when R ∈ Rn, together with Rn ⊂ Rρn. We then apply

Proposition 15, whose conditions are satisfied for ε > 0 fixed and n large enough, again because

ρn → 0 very slowly, arriving at

I2≤ C4exp −

nrdnρ4d+2n ε2

C4(1 + ε)

! ,

for some constant C4 > 0 and all n large enough. As before, when ε is fixed, the exponent is a

positive power of n, so that

X n P "" inf R∈Rnξn(R) < −ε # ∩ Ωn # < ∞. (3.9) Bounding PΩc n. Since λ ∗

n(R) > Cρdn for some C uniformly over R ∈ Rn (see (3.6) above), we

have P c n = P sup R∈Rnn(R) − λn(R)| λ∗ n(R) > 1 2 ! ≤ P sup R∈Rn λn(R) − λn(R) > Cρ d n ! .

We then proceed as in bounding (3.8), obtaining

X

n

(25)

Conclusion. We have P " inf R∈Rn hn(R) − h(R; Mrn) < −2ε # ≤ P "" inf R∈Rnζn(R) < −ε # ∩ Ωn # +P "" inf R∈Rnξn(R) < −ε # ∩ Ωn # +PΩc n ,

so that the left-hand side is summable. Therefore, we conclude by applying the Borel-Cantelli

lemma. 

3.5

Some continuity of the Cheeger constant

Our proof of Theorem 2 relies on continuity properties of the normalized cut and of the Cheeger constant. Lemma 18 below compares the conductance function on M and on a bi-Lipschitz defor-mation of M. For a Lipschitz map f , let k f kLipdenote its Lipschitz constant. If f is bi-Lipschitz,

we define its condition number by cond( f ) := k f kLipk f−1kLip. Lemma 19 below states that Mr is a

bi-Lipschitz deformation of M, hence Lemma 18 yields the continuity property of Proposition 20.

Lemma 18. Let f be a bi-Lipschitz on M. Then for any A ⊂ M measurable,

max( h( f (A); f (M)) h(A; M) , h(A; M) h( f (A); f (M)) ) ≤ cond( f )d.

Proof. For any A ⊂ M, ∂ f (A) = f (∂A) and f (A)c∩ f (M) = f (Ac∩ M), and if A is measurable, for

k = 1, . . . , d,

k f−1k−kLip Volk(A) ≤ Volk( f (A)) ≤ k f kkLip Volk(A).

Therefore,

h( f (A); f (M)) = Vold−1( f (∂A ∩ M))

min{Vold( f (A)), Vold( f (Ac∩ M))}

k f k

d−1

Lip Vold−1(∂A ∩ M)

k f−1k−d

Lipmin{Vold(A), Vold(Ac ∩ M)}

≤ cond( f )dh(A; M).

And vice-versa. 

Lemma 19. Fix r < s ≤ reach(∂M). Then there is a bi-Lipschitz map between Mr and M that

leaves Msunchanged, and with condition number at most (1 + 2r/(s − r))2.

Proof. For x in M such that δ(x) := dist(x, ∂M) < s, let ξ(x) ∈ M be its metric projection onto ∂M

and ux be the unit normal vector of M at ξ(x) pointing outwards. We define the map

fr : Mr7→ M, fr(x) = x +

r(s − δ(x))+

s − r ux,

where a+denotes the positive part of a ∈ R. By construction, f is one-to-one, with inverse

fr−1: M 7→ Mr, fr−1(x) = x −

r(s − δ(x))+

(26)

By [20, Thm. 4.8(1)], δ is Lipschitz with constant at most 1, therefore so is x 7→ (s − δ(x))+; and

since the reach bounds the radius of curvature from below [20, Thm. 4.18], x 7→ ux is Lipschitz

with constant at most 1/ reach(∂M). Therefore, using the fact that (s − δ(x))+ ≤ s and kuxk = 1, fr

and fr−1are Lipschitz with constants at most 1 + 2r/(s − r) and 1 + 2r/s respectively. 

Proposition 20. We have

H(Mr) = (1 + O(r)) H(M), r → 0.

Proof. From Lemmas 18 and 19, we deduce that

max( H(Mr) H(M), H(M) H(Mr) ) ≤ (1 + 2r/(ρM− r))2d,

for any r < ρM := reach(∂M), which immediately yields the desired result. 

3.6

L

1

-metric on Borel sets

We will use the L1-metric on Borel subsets of Rd, defined by Vold(A∆B) =

R

|1A(x) − 1B(x)| dx.

This metric comes from the bijection between Borel sets A and their indicator functions 1A,

en-dowed with the L1-topology. Strictly speaking, this is a semi-metric on Borel subsets of Rd since Vold(A∆B) = 0 if and only if A∆B is a null set.

The following propositions are adapted from [23, Thm. 2.3.10] and [23, Prop. 2.3.6] respec-tively. Proposition 21 is a compactness criterion, and Proposition 22 results from lower semi-continuity of the perimeter measure with respect to L1-metric.

Proposition 21. Let (En) be a sequence of measurable subsets of M. Suppose that

lim sup

n→∞

Vold−1(∂En∩ M) < ∞.

Then (En) admits a subsequence converging for the L1-metric.

Proposition 22. Let Enand E be bounded measurable subsets of M such that En → E in L1, and

h(E; M) < ∞. Then

lim inf

n h(En; M) ≥ h(E; M).

3.7

Proof of (i) in Theorem 2

Lower bound. For each n, let Rn ∈ Rnbe such that

hn(Rn) = min R∈Rnhn(R). Then hn(Rn) − H(M) = h hn(Rn) − h(Rn; Mrn) i +h(Rn; Mrn) − H(Mrn) + H(Mrn) − H(M) ≥ inf R∈Rn hn(R) − h(R; Mrn) + H(Mrn) − H(M) ,

(27)

sinceh(Rn; Mrn) − H(Mrn) ≥ 0 by definition of H(Mrn). On the last line, by Lemma 17, the first

term has a non-negative inferior limit, and by Proposition 20, the second term tends to zero. Hence, lim inf

n→∞ minR∈Rnh

n(R) ≥ H(M) a.s. (3.11)

Upper bound. To obtain the matching upper bound, fix a subset A ⊂ M with smooth relative boundary and such that 0 < Vold(A) ≤ Vold(M\A) < Vold(M). Then, for n large enough, there

exists Rnin Rnsuch that Rn∩ M = A, implying that

min

R∈Rnh

n(R) ≤ hn(A).

By Theorem 1, hn(A) → h(A; M) almost surely, so that

lim sup n→∞ min R∈Rnhn(R) ≤ h(A; M) a.s.

By minimizing over A, we obtain lim sup n→∞ min R∈Rnhn(R) ≤ H(M) a.s. (3.12)

Combining the lower and upper bounds, (3.11) and (3.12), we conclude that lim

n→∞minR∈Rnh

n(R) = H(M) a.s. (3.13)

3.8

Proof of (ii) in Theorem 2

Let Rn be a sequence in Rnsatisfying

hn(Rn) = min R∈Rnh

n(R),

and set An = Rn∩ M. Fix a subset A0 ⊂ M with smooth relative boundary and such that h(A0) < ∞.

Then for n large enough, there exists R in Rn such that A0 = R ∩ M. Hence hn(An) ≤ hn(A0) and

since hn(A0) → h(A0) by Theorem 1, we have

lim sup

n→∞

Vold−1(An) ≤ lim sup n→∞

h(An) min{Vold(An), Vold(Acn∩ M)} ≤ h(A

0) Vol

d(M)/2.

Therefore by compactness of the class of sets with bounded perimeters (Proposition 21), with prob-ability one, {An} admits a subsequence converging in the L1-metric.

On the one hand,

Références

Documents relatifs

The next multicolumns exhibit the results obtained on instances 1.8 and 3.5: “P rimal best ” presents the average value of the best integer solution found in the search tree; “Gap

Based on this new data, we apply state-of-the-art tools in order to partition the graph, on the one hand in terms of community structure, and on the other hand according to the

At Monument Hill the site faces the empty Point State Park and the Fort Duquesne Bridge whereas at Mount Washing- ton the site faces the more visually interesting downtown

In this paper, we give a formal statement which expresses that a simple and local test performed on every node before its construction permits to avoid the construction of useless

By drastically reducing the number of nodes in the graph, these heuristics greatly increase the speed of graph cuts and reduce the memory usage.. Never- theless, the performances

How- ever assuming a prevalence of 7% and an odds ratio of 1.9 for myocardial infarctions as well as an odds ratio of 3.0 for cerebral vascular insults hyperhomocysteine- mia for

n -dof systems, lasting contact phases necessitate a purely inelastic impact law of the form Pu C D 0, leading to a loss of kinetic energy incompatible with the conservative

• Watertight model handling missing data: A 3D space partitioning into volumetric cells is created by using de- tected wall directions.. By limiting the possible results with the