Scalar field analysis with aberrant noise
Mickaël Buchet
joint work with F. Chazal, T. Dey, F. Fan, S. Oudot and Y. Wang
What is scalar field analysis?
How many peaks do you see?
What is scalar field analysis?
Persistence diagram
What is scalar field analysis?
Comparison between persistence diagrams
Bottleneck distance:
d_B(D, E) = inf_{b ∈ B} max_{x ∈ D} ||x − b(x)||_∞
where B is the set of bijections between D and E (points may also be matched to the diagonal).
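As an illustration of this definition, here is a brute-force bottleneck distance for tiny diagrams; the function names and the explicit handling of diagonal matches are my own sketch, and real computations should use dedicated software (e.g. GUDHI).

```python
# Brute-force bottleneck distance between two small persistence diagrams.
# A point may be matched to the diagonal; in the sup-norm the distance
# from (b, d) to the diagonal is (d - b) / 2. Exponential in the diagram
# size: illustrative only.
from itertools import permutations

def diag_cost(p):
    """Sup-norm distance from point p = (birth, death) to the diagonal."""
    return (p[1] - p[0]) / 2.0

def point_cost(p, q):
    """Sup-norm distance between two diagram points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def bottleneck(D, E):
    # Augment each diagram with one diagonal "slot" per point of the
    # other, so every point can be matched to a point or to the diagonal.
    left = list(D) + [None] * len(E)   # None stands for the diagonal
    right = list(E) + [None] * len(D)
    best = float("inf")
    for perm in permutations(range(len(right))):
        cost = 0.0
        for i, j in enumerate(perm):
            p, q = left[i], right[j]
            if p is None and q is None:
                c = 0.0
            elif p is None:
                c = diag_cost(q)
            elif q is None:
                c = diag_cost(p)
            else:
                c = point_cost(p, q)
            cost = max(cost, c)
        best = min(best, cost)
    return best

D = [(0.0, 2.0)]
E = [(0.0, 2.1), (1.0, 1.2)]
print(bottleneck(D, E))  # matches (0,2)-(0,2.1), sends (1,1.2) to the diagonal
```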
What is scalar field analysis?
Higher dimensions
What is scalar field analysis?
Applicative setting
The ground truth:
- a sub-manifold M of R^d
- a c-Lipschitz function f : M → R
What we really know:
- a set of points P ⊂ R^d
- a function f̃ : P → R
Goal: approximate the persistence diagram D of f by a diagram D̂.
What is scalar field analysis?
Previous work
From Chazal, Guibas, Oudot and Skraba (DCG'11). Under some conditions on ε and the geometry of M:
Theorem
If P is an ε-Riemannian sample of M and ||f|_P − f̃||_∞ ≤ ξ, then:
d_B(D, D̂) ≤ 4cε + ξ
Theorem
If P is an ε-Riemannian sample of M and the pairwise distances between points of P are known with precision ν, then:
d_B(D, D̂) ≤ (4ε + 2ν)c
What is scalar field analysis?
A bad example
Computing the persistence of a scalar field
Filtered simplicial complex
Classical algorithms to compute a persistence diagram work on a filtered simplicial complex.
A simplicial complex is a set X of simplices such that for any simplex σ ∈ X, all faces of σ are also in X.
We say that X is filtered when there exists a family of simplicial complexes {X_α}_{α ∈ R} such that for all α < β, X_α ⊂ X_β ⊂ X.
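This definition can be sketched in code: store a filtration as a map from simplices to filtration values, with X_α as a sub-level set. The encoding and helper names below are illustrative, not from the talk.

```python
# A minimal encoding of a filtered simplicial complex: each simplex
# (a frozenset of vertex ids) gets a filtration value, and X_alpha is
# the set of simplices with value <= alpha. For every X_alpha to be a
# simplicial complex, no face may enter later than a simplex containing it.
from itertools import combinations

def is_valid_filtration(filt):
    """filt: dict mapping frozenset(vertices) -> filtration value."""
    for simplex, value in filt.items():
        for k in range(1, len(simplex)):
            for face in combinations(simplex, k):
                f = frozenset(face)
                if f not in filt or filt[f] > value:
                    return False
    return True

def sublevel_complex(filt, alpha):
    """The simplicial complex X_alpha of the filtration."""
    return {s for s, v in filt.items() if v <= alpha}

filt = {
    frozenset({0}): 0.0,
    frozenset({1}): 0.0,
    frozenset({2}): 0.5,
    frozenset({0, 1}): 1.0,
    frozenset({1, 2}): 1.5,
}
print(is_valid_filtration(filt))
print(sorted(len(s) for s in sublevel_complex(filt, 1.0)))
```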
Computing the persistence of a scalar field
The Rips complex
The topologies of {f^{-1}((−∞, α])} and {f̃^{-1}((−∞, α])} are completely different.
Give thickness to the point set by building a Rips complex:
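A minimal sketch of the Rips construction at scale δ: a simplex is included whenever all pairwise distances among its vertices are at most δ. The function and parameter names are mine, and only simplices up to a fixed dimension are generated, as is usual in practice.

```python
# Vietoris-Rips complex at scale delta over a finite point set.
from itertools import combinations
import math

def rips_complex(points, delta, max_dim=2):
    n = len(points)
    simplices = [frozenset({i}) for i in range(n)]
    for k in range(2, max_dim + 2):  # simplices with k vertices
        for combo in combinations(range(n), k):
            if all(math.dist(points[i], points[j]) <= delta
                   for i, j in combinations(combo, 2)):
                simplices.append(frozenset(combo))
    return simplices

pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8)]
# At scale 1.0 every pairwise distance is <= 1, so we get
# 3 vertices + 3 edges + 1 triangle.
print(len(rips_complex(pts, 1.0)))
```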
Computing the persistence of a scalar field
Nested Rips complexes
Sometimes there is no parameter such that the Rips complex captures the topology of M.
Solution: take two different parameters δ < δ′ and look at the image of H_*(R_δ) under the morphism induced by the inclusion R_δ ↪ R_δ′.
This has the effect of "cleaning" the persistence diagram.
Computing the persistence of a scalar field
Filtration by functional values
We study the scalar field f, not the manifold M.
We fix parameters δ and δ′ and use the values of f̃ to build the filtration:
P_α = f̃^{-1}((−∞, α])
{R_δ(P_α) ↪ R_δ′(P_α)}_{α ∈ R}
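The functional-value filtration can be sketched as follows; this is a toy illustration with my own names, where `rips` is a simplified Rips-complex helper and we only report the sizes of the nested pair at each level α.

```python
# For each alpha, form the sub-level set P_alpha = {p : f~(p) <= alpha}
# and build the nested pair R_delta(P_alpha) -> R_delta2(P_alpha).
from itertools import combinations
import math

def rips(points, delta, max_dim=1):
    n = len(points)
    out = [frozenset({i}) for i in range(n)]
    for k in range(2, max_dim + 2):
        for c in combinations(range(n), k):
            if all(math.dist(points[i], points[j]) <= delta
                   for i, j in combinations(c, 2)):
                out.append(frozenset(c))
    return out

def functional_filtration(points, f_values, delta, delta2, levels):
    """For each alpha in levels, the pair R_delta(P_alpha), R_delta2(P_alpha)."""
    pairs = []
    for alpha in levels:
        P_alpha = [p for p, v in zip(points, f_values) if v <= alpha]
        pairs.append((alpha, rips(P_alpha, delta), rips(P_alpha, delta2)))
    return pairs

pts = [(0.0, 0.0), (0.4, 0.0), (1.0, 0.0)]
vals = [0.0, 0.5, 1.0]
for alpha, small, big in functional_filtration(pts, vals, 0.5, 1.1, [0.0, 0.5, 1.0]):
    print(alpha, len(small), len(big))
```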
Computing the persistence of a scalar field
Theoretical guarantees
From Chazal, Guibas, Oudot and Skraba (DCG'11). Let ρ(M) be the strong convexity radius of M.
Theorem
If P is an ε-Riemannian sample of M, ||f|_P − f̃||_∞ ≤ ξ and ε < (1/4) ρ(M), then:
∀δ ∈ [2ε, (1/2) ρ(M)),  d_B(Dgm(f), Dgm(R_δ(P_α) ↪ R_2δ(P_α))) ≤ 2cδ + ξ
Theorem
If P is an ε-Riemannian sample of M and the distances between points of P are given by a function d̃ such that d_M(x, y)/λ ≤ d̃(x, y) ≤ ν + µ d_M(x, y)/λ, then:
∀δ ≥ (ν + 2µε)/λ, ∀δ′ ∈ [ν + 2µδ, (1/λ) ρ(M)),  d_B(Dgm(f), Dgm(R_δ(P_α) ↪ R_δ′(P_α))) ≤ cλδ′
Functional denoising
Sources of noise
Functional denoising
Discrepancy
We compute new functional values for the points.
For every point p in P:
1. Build the set NN_k(p) of its k nearest neighbors.
2. Find the subset Y of k′ values in f̃(NN_k(p)) with the smallest variance.
3. Set the new function value f̂(p) to the barycenter of Y.
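The three steps above can be sketched with NumPy; this is an illustrative implementation with my own names, and it relies on the fact that for scalar values a minimum-variance subset of fixed size is always contiguous once the values are sorted.

```python
# Discrepancy denoising: for each point, take its k nearest neighbors,
# find the window of k' consecutive sorted neighbor values with the
# smallest variance, and replace the value by that window's mean
# (the barycenter of the chosen subset).
import numpy as np

def discrepancy_denoise(points, values, k, k_prime):
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(points)
    denoised = np.empty(n)
    for i in range(n):
        # k nearest neighbors of point i (including the point itself)
        d = np.linalg.norm(points - points[i], axis=1)
        nn = np.argsort(d)[:k]
        v = np.sort(values[nn])
        # minimum-variance window of size k' among the sorted values
        windows = np.lib.stride_tricks.sliding_window_view(v, k_prime)
        best = windows[np.argmin(windows.var(axis=1))]
        denoised[i] = best.mean()
    return denoised

pts = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,)]
vals = [1.0, 1.1, 9.0, 1.2, 1.05]   # one aberrant value
print(discrepancy_denoise(pts, vals, k=5, k_prime=3))
```

Note how the aberrant value 9.0 is excluded from every minimum-variance window, so it never contaminates the output.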
Functional denoising
A variant using the median
We compute new functional values for the points.
For every point p in P:
1. Build the set NN_k(p) of its k nearest neighbors.
2. Set the new function value f̂(p) to the median of f̃(NN_k(p)).
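A sketch of the median variant, again with NumPy and illustrative names: the variance-minimization step is replaced by a plain median over the k neighbor values, which is cheaper and robust as long as fewer than half of the neighbor values are aberrant.

```python
# Median denoising: replace each value by the median of the values of
# the k nearest neighbors (including the point itself).
import numpy as np

def median_denoise(points, values, k):
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    denoised = np.empty(len(points))
    for i in range(len(points)):
        d = np.linalg.norm(points - points[i], axis=1)
        nn = np.argsort(d)[:k]
        denoised[i] = np.median(values[nn])
    return denoised

pts = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,)]
vals = [1.0, 1.1, 9.0, 1.2, 1.05]   # one aberrant value
print(median_denoise(pts, vals, k=5))
```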
Functional denoising
Asymptotic behaviour
When k → ∞ and k/n → 0:
the probability distribution of the estimate concentrates around the correct value.
Experimental illustration
Bone
       Noisy input  k-NN regression  Discrepancy
Max    16.23        3.18             0.37
Mean   0.349        0.204            0.097
Experimental illustration
Persistence of topographic map
[Figures: topographic map and noisy topographic map; persistence diagrams: original, noisy, k-NN, discrepancy]
Experimental illustration
Images
[Figures: image with no noise and with 40% outliers; kNN, discrepancy, and median results]
Experimental illustration
Lena’s diagrams
Experimental illustration
Chessboard
[Figures: chessboard with no noise and with 30% outliers; kNN, discrepancy, and median results]
Geometric denoising
Building a complex with good properties
We need a complex that has the correct geometric structure to analyze the scalar field.
- Use the super-level sets of a density estimator.
- Use the sub-level sets of a distance-like function.
Geometric denoising
Distance to measure in a nutshell
The distance to a measure is a distance-like function designed to cope with outliers.
d_{µ,m}(p) = sqrt( (1/k) Σ_{x ∈ NN_k(p)} ||p − x||² )
- Easy to compute pointwise.
- Guarantees of geometric inference [Chazal, Cohen-Steiner, Mérigot, 2011]
- Properties of a density estimator [Biau, Chazal, Cohen-Steiner, Devroye, Rodrigues, 2011]
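A quick empirical sketch of the formula above (names are illustrative; the mass parameter m corresponds to k/n): the root mean squared distance to the k nearest sample points. A single far outlier barely moves it, unlike the plain distance to the point set (the k = 1 case).

```python
# Empirical distance to a measure: RMS distance to the k nearest samples.
import numpy as np

def distance_to_measure(query, sample, k):
    sample = np.asarray(sample, dtype=float)
    d = np.linalg.norm(sample - np.asarray(query, dtype=float), axis=1)
    knn = np.sort(d)[:k]
    return np.sqrt(np.mean(knn ** 2))

# Points on a segment plus one far outlier.
sample = [(x / 10.0, 0.0) for x in range(11)] + [(0.5, 5.0)]
q = (0.5, 0.0)
print(distance_to_measure(q, sample, k=1))   # plain distance to the point set
print(distance_to_measure(q, sample, k=4))   # averaged over 4 neighbors
```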
Geometric denoising
A complete noise model
Three conditions:
1. Dense sampling:
   ∀x ∈ M, d_{µ,m}(x) ≤ ε
2. No cluster of noise:
   r = sup{l ∈ R | ∀x, d_{µ,m}(x) < l ⇒ d(x, M) ≤ d_{µ,m}(x) + ε}
3. For any point close to M, most of the neighboring values are good:
   ∀p ∈ d_{µ,m}^{-1}((−∞, η]), |{q ∈ NN_k(p) : |f̃(q) − f(π(p))| ≤ s}| ≥ k′
Geometric denoising
Theoretical results
Theorem
If P is a set verifying the previous conditions and f is a c-Lipschitz function, then:
∀δ ∈ [2η + 6ε, ρ(M)/2], ∀δ′ ∈ [2η + 2ε + (2R_M / (R_M − (η + ε))) δ, ((R_M − (η + ε)) / R_M) ρ(M)],
d_B(Dgm(f), D̂) ≤ c R_M δ′ / (R_M − (η + ε)) + ξs
with ξ = 1 for the median and ξ = 1 + 2 sqrt((k − k′)/(2k′ − k)) for the discrepancy.
Take home
- A versatile and model-free algorithm for functional denoising
- Scalar field analysis with noise in both the geometry and the functional values
But...
- The algorithm needs some parameters.
- Heuristics exist, but there is no general method to choose their values.