Digital photography: noise in images

(1)

Le débruitage d’images par patchs : point de vue statistique

Antoine Houdard IMB, Université de Bordeaux

Module Image 2019

Montpellier, 18 avril

(2)

Digital photography: noise in images

Different ISO settings with constant exposure – 25600 ISO

(3)

Digital photography: noise in images

Different ISO settings with constant exposure – 200 ISO

(4)

Digital photography

Le monde des pixels

(5)

Digital photography: general process

(6)

Digital photography: general process

(7)

Digital photography: general process

(8)

Digital photography: general process

(9)

Digital photography: general process

(10)

Digital photography: general process

(11)

Noise modeling and denoising problem

(12)

Rappels : variables aléatoires gaussiennes

V „Npµ, σ²qvariable aléatoiregaussienne, sadensitéest donnée par fVpxq “ 1

?2πσexp ˆ

´px´µq² 2σ²

˙

si V „Np0,1qalorsaV `b„Npb, a²q

si V „Npµ, σqetW „Npν, τqsontindépendantes alorsV `W „Npµ`ν, σ`τq

V „Npµ, σq, alorsErVs “µetvarrVs “σ²

Ñ dans notre cas V

i

„ N pu

i

, σ

²

q

(13)

Rappels : variables aléatoires gaussiennes

Loi des grands nombres

V₁, . . . , V_n variables aléatoires variables indépendantes et identique- ment distribuées(iid) de moyenneµet varianceσ²alors

E

„V₁` ¨ ¨ ¨ `V_n n



“µ et

var

„V₁` ¨ ¨ ¨ `V_n n



“ σ² n

Ñ en moyennant n pixels, on réduit la variance de n !

(14)

Moyenner pour débruiter !

image bruitée à 20%i.e. σ²“0.2 pour une image à valeur dansr0,1s)

(15)

Moyenner pour débruiter !

moyenne d’un stack de10 imagesbruitées indépendamment

(16)

Moyenner pour débruiter !

(17)

Moyenner pour débruiter !

(18)

Malheureusement...

...on n’a généralement accès qu’àune seule image, donc une seule réalisation de bruit !

Une première idée ?

(19)

Malheureusement...

...on n’a généralement accès qu’àune seule image, donc une seule réalisation de bruit !

Une première idée ?

(20)

Moyennes locales

L’œil humain“sait” débruiter : il est capable de faire desmoyennes locales

Ñon peut copier ce phénomène par unfiltrage gaussien:

ˆ u_i“ÿ

j

w_ijv_j

oùwij “e^´^dpi,jq^2h

(21)

Moyennes locales

L’œil humain“sait” débruiter : il est capable de faire desmoyennes locales Ñon peut copier ce phénomène par unfiltrage gaussien:

ˆ ui“ÿ

j

wijvj

où wij “e^´^dpi,jq^2h

(22)

Moyennes locales

image bruitée à 20%i.e. σ²“0.2 pour une image à valeur dansr0,1s)

(23)

Moyennes locales

filtrage gaussienh = 1

(24)

Moyennes locales

(25)

Moyennes locales

(26)

Moyennes locales

filtrage gaussien h = 10

(27)

Moyennes locales

Moyennes locales “ destruction de la structure !

(28)

Moyennes selon les pixels qui se ressemblent

Dans la littérature : filtre deYaroslavsky ou filtrebilatéral L’idée est de moyenner les pixels qui se ressemblent :

ˆ ui“ÿ

j

wijvj

avecw_ij“e^´

}vi´vj}

2hI (Yaroslavsky) ouwij “e^´^dpi,jq^2hs e^´

}vi´vj}

2hI (Bilatéral)

(29)

Moyennes selon les pixels qui se ressemblent

image bruitée à 5% i.e. σ²“0.05

(30)

Moyennes selon les pixels qui se ressemblent

filtrage bilatéralh_s“3eth_I “σ

(31)

Moyennes selon les pixels qui se ressemblent

image bruitée à 10%i.e. σ²“0.1

(32)

Moyennes selon les pixels qui se ressemblent

(33)

Moyennes selon les pixels qui se ressemblent

(34)

Moyennes selon les pixels qui se ressemblent

(35)

Moyennes selon les pixels qui se ressemblent

Moyennes pixeliques “ Peu robuste au bruit !

(36)

Moyennes non-locales

Non-local means[Buades, Coll, Morel, 2005]

Tenir compte du contextede chaque pixel Ñintroduction despatchs

ˆ ui“ÿ

j

wijvj avec wij“e^´

}Pi´Pj} 2h

http://demo.ipol.im/demo/bcm_non_local_means_denoising/

(37)

Moyennes selon les pixels qui se ressemblent

image bruitée à 5% i.e. σ²“0.05

(38)

Moyennes selon les pixels qui se ressemblent

Non-local means

(39)

Moyennes selon les pixels qui se ressemblent

(40)

Moyennes selon les pixels qui se ressemblent

Non-local means

(41)

Moyennes selon les pixels qui se ressemblent

(42)

Moyennes selon les pixels qui se ressemblent

Non-local means

(43)

Moyennes selon les pixels qui se ressemblent

Non-local means “ avènement du patch !

(44)

Patch-based image denoising

Many denoising methods rely on the description of the image by patches:

‹ NL-meansBuades, Coll, Morel (2005),

‹ BM3DDabov, Foi, Katkovnik (2007),

‹ PLEYu, Sapiro, Mallat (2012),

‹ NL-BayesLebrun, Buades, Morel (2012),

‹ LDMMShi, Osher, Zhu (2017),

‹ and many others...

(45)

Avant d’aller plus loin...

Vecteur gaussienUn vecteur aléatoireV “ pV1, . . . , Vnqestgaussien si, et seulement si,toute combinaison linéaire de ses composantes est une variable aléatoire gaussienne

Un vecteur gaussienV “ pV₁, . . . , V_nqest entièrement déterminé par sa moyenne et sa matrice de covariance :

Σ“

¨

˚

˝

VarpV1q CovpV1, V2q ¨ ¨ ¨ CovpV1, Xnq CovpV1, V2q . ..

...

CovpV1, Vnq ¨ ¨ ¨ VarpVnq

˛

‹

‚

etµ“

¨

˚

˝ EpV1q EpV2q

... EpVnq

˛

‹

‚

(46)

Avant d’aller plus loin...

V

_i

“ u

_i

` E

_i

Le vecteurE “ pE1, . . . ,Enqde bruit sur est unvecteur gaussien

V “ u ` E „ N p0, σ

²

I

_n

q

(47)

Patch-based image denoising

(48)

Patch-based image denoising

Y

_i

“ x

_i

` N

_i

Ni„Np0, σ²InqHypothèse: Ni indépendants

Pourquoi ne pas travailler directement dans l’espace des patchs ?

(49)

un NL-means dans l’espace des patchs

Un filtre Non-local légèrement modifié :

ˆ xi “

ř

jw_ijy_j ř

jwij

avec wij“1_}P_i_´P_j_}ď

Chaque patch est débruité par la moyenne empiriqued’un petit échantillon autour de lui

(50)

un NL-means dans l’espace des patchs

ˆ xi “

ř

jw_ijy_j ř

jwij

(51)

un NL-means dans l’espace des patchs

ˆ xi “

ř

jw_ijy_j ř

jwij

(52)

Modéliser l’espace des patchs

The Bayesian paradigm

‹ We consider each clean patch xas a realization of a random vectorX withpriordistributionPX.

Ñ The Gaussian white noise model rewrites:

, then Bayes’ theorem yields the posterior distribution:

P_X|Ypx|yq “P_Y_|Xpy|xqPXpxq P_Ypyq .

(53)

Débruiter sachant un modèle a priori

L’espérance conditionnelle

Meilleur estimateur du patchxen termed’erreur quadratique moyenne sachant le modèleX et l’observationy

ˆ

x

M M SE

:“ ErX |Y “ ys “ ż

xP

_X|Y

px|yqdx

Nécessite une connaissance explicite deP_X|Ypx|yq!

(54)

Débruiter sachant un modèle a priori Le maximum a posteriori

ˆ

x_{M AP} “arg max

x

P_X|Ypx|yq

“arg max

x

P_Y_|Xpy|xqP_Xpxq P_Ypyqindépendant dex

“arg min

x

´log`

P_Y_|Xpy|xq˘

´logpP_Xpxqq

“arg min

x

}x´y}²

σ² ´logpPXpxqq

On retrouve un problème variationnel avec un termed’attache aux données et un terme derégularisation

Peut être loin du MMSE !

(55)

Débruiter sachant un modèle a priori Le meilleur estimateur linéaire

L’espérance conditionnelle est un vecteur aléatoireErX|Ys “gpYqoù g“arg min

f

E“

}X´gpYq}²‰

Le meilleur estimateur linéaireLMMSE est une approximation linéairegpYq ˆ

x_{LM M SE} “Dyˆ `αˆ avec D,ˆ αˆ “arg min

D,α

E“

}X´DY ´α}²‰

siX etY admettent desmoments d’ordre 1 et 2 µ_Y, µ_Y,Σ_X,Σ_Y pour notre problème de débruitage on trouve

ˆ

x_{LM M SE} “µ_X`Σ_X`

Σ_X`σ²I˘´1

py´µ_Xq aussi appeléWiener estimator

(56)

Débruiter sachant un modèle a priori

Récapitulatif

xp“ErX|Y “ystheminimum mean square error (MMSE) estimator

xp“Dy`αs.t. D andαminimize Er}DY `α´X}²swhich is the linear MMSE also calledWiener estimator

xp“arg maxppx|yqthemaximuma posteriori (MAP)

(57)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(58)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(59)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(60)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(61)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(62)

Modéliser les patchs sans bruit X _i

(63)

Choix du modèle

Dans la littérature

Modèles gaussiens locaux

‹ patch-based PCADeledalle, Salmon, Dalalyan (2011),

‹ NL-bayesLebrun, Buades, Morel (2012),

‹ ...

Modèles de mélange de gaussiennes

‹ EPLLZoran, Weiss (2011),

‹ PLEYu, Sapiro, Mallat (2012),

‹ Single-frame Image DenoisingTeodoro, Almeida, Figueiredo (2015).

‹ ...

Why Gaussian models are so widely used?

(64)

Gaussian is convenient

SiX„NpµX,ΣXqalors tousles estimateurs introduit précédemment coïncident

x p

_MMSE

“ x p

_Wiener

“ p x

_MAP

“ µ

_X

` Σ

_X

p Σ

_X

` σ

²

I q

^´1

p y ´ µ

_X

q

la distribution deX conditionnellement àY est unvecteur gaussiende moyennemet de covarianceS

m“µX`ΣXpΣX`σ²Iq^´1py´µXq

S“Σ_X´Σ_XpΣ_X`σ²Iq^´1Σ_X

(65)

What do Gaussian models encode?

The covariance matrixin Gaussian models and GMM encodes geometric structuresup to some contrast change:

sˆs

s

Covariance matrixΣ. Patches generated fromNpm,Σq.

(66)

What do Gaussian models encode?

The covariance matrixin Gaussian models and GMM encodes geometric structuresup to some contrast change:

sˆs

s

(67)

What do Gaussian models encode?

A covariance matrixcannot encode multipletranslated versions of a structure:

A set of10000 patches representing edges with random grey levels and random translations.

(68)

What do Gaussian models encode?

A covariance matrixcannot encode multipletranslated versions of a structure:

sˆs

s

(69)

Restore with the right model

covariance matrix clean patch noisy patch denoised

(70)

Conclusion sur les modèles gaussiens

Modéliser les patchs par un modèle gaussien

permet d’avoir uneformule explicitepour l’estimateur de chaque patch

permet de représenter unestructure géométriquespécifique à l’échelle du patch

Nécessité de modèles locaux : comment grouper les patchs ?

(71)

Comment grouper les patchs ?

Non-local Bayes [Lebrun, Buades, Morel (2013)]

http://demo.ipol.im/demo/16/

(72)

Comment grouper les patchs ?

Non-local Bayes [Lebrun, Buades, Morel (2013)]

http://demo.ipol.im/demo/16/

(73)

Résultats Non-Local Bayes

(74)

Résultats Non-Local Bayes

Non-local Bayes

(75)

Résultats Non-Local Bayes

(76)

Résultats Non-Local Bayes

Non-local Bayes

(77)

Résultats Non-Local Bayes

(78)

Résultats Non-Local Bayes

Non-local Bayes

(79)

Comment grouper les patchs ?

Pourquoi aller chercher un voisinage dans une boule ?

Utiliser la géométrie de l’espace des patchs

(80)

Comment grouper les patchs ?

Pourquoi aller chercher un voisinage dans une boule ?

Utiliser la géométrie de l’espace des patchs

(81)

Comment grouper les patchs ?

On chercher àregrouper les patchs représentant la même structure

la norme} ¨ }2Ñn’est pas robuste au fort bruit

Gaussian Mixture Models (GMM) naturally provide a (more robust) grouping!

(82)

Les modèles de mélange de gaussiennes

modéliserl’espace des patchs par unmodèle de mélange de gaussiennes

X„

K

ÿ

k“1

πkNpµk,Σkq

xMMSE“

K

ÿ

PpZ “k|Y “yq“

µk`ΣkpΣk`σ²Iq^´1py´µkq‰

(83)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(84)

Comment inférer les paramètres ?

(85)

Parameters inference

Gaussian model case: X „ N p µ

_X

, Σ

_X

q

observed dataty₁, . . . , y_nusampled fromY “X`N „Npµ_Y,Σ_Yq.

The maximization of the likelihood

Lpy;θq “ 1 2

n

ÿ

i“1

py´µYq^TΣY´1

py´µYq, yields theMaximum Likelihood estimators (MLE)

µpY “ 1 n

n

ÿ

i“1

yi, ΣpY “ 1 n

n

ÿ

i“1

pyi´µpYq^Tpyi´µpYq.

SinceΣY “ΣX`σ²Ip, it yields

pµX“µpY, ΣpX“ΣpY ´σ²Ip.

(86)

Parameters inference

Gaussian Mixture Model case: X „ ř

π

_k

N pµ

_k

, Σ

_k

q

This implies a GMM on the noisy patchesY „ř

πkNpµk, Skq EM algorithm: maximize the conditional expectation of the complete log-likelihood:

K

ÿ

k“1 n

ÿ

i“1

tiklogpπkgpyi;θkqq,

wheretik“ErZ “k|yi, θ^˚sandθ^˚a given set of parameters.

E-step estimation oft_ikknowing the current parameters

M-stepcompute maximum likelihood estimators (MLE) for parameters:

pπk “nk

n, µp_k “ 1 n_k

ÿ

i

t_iky_i, Sp_k“ 1 n_k

ÿ

i

t_ikpy_i´µ_kqpy_i´µ_kq^T, withn_k “ř

it_ik.

(87)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. apprendre un modèle GMM pour les patchs sans bruit

3. Débruiter chaque patch avec le MMSE

4. Reconstruire l’image à partir de ses patchs débruités

(88)

NaN

(89)

The curse of dimensionality

Parameter estimation for Gaussian models or GMMs suffers from thecurse of dimensionality

The number of samples needed for the estimation of a parameter grows exponentially with the dimension

(90)

The curse of dimensionality

(91)

The curse of dimensionality

Consider data uniformly sampled inr0,1s^p. An hypercube of size s^1{p is needed to capture a neighborhood of a point representings% of the total volume.

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

fraction of volume 0.0

0.2 0.4 0.6 0.8 1.0

distance

p=1p=2 p=3p=10

Tocapture10% of your datain order to compute a local average, you must cover80% of the range of each input variable!

Neighborhoods are no more local!

(92)

The curse of dimensionality

Consider data uniformly sampled inr0,1s^p. An hypercube of size s^1{p is needed to capture a neighborhood of a point representings% of the total volume.

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

fraction of volume 0.0

0.2 0.4 0.6 0.8 1.0

distance

p=1p=2 p=3p=10

Tocapture10% of your datain order to compute a local average, you must cover80% of the range of each input variable!

Neighborhoods are no more local!

(93)

The curse of dimensionality

The volume of an hypercube of sizer“0.1is0.1^p so when the dimensionp grows, the probability that it contains a point of the dataset goes to zero..

In High-dimensional spaces, data are isolated

To tackle this, we need the number of data to exponentially increase withp...

(94)

The curse of dimensionality

The volume of an hypercube of sizer“0.1is0.1^p so when the dimensionp grows, the probability that it contains a point of the dataset goes to zero..

In High-dimensional spaces, data are isolated

To tackle this, we need the number of data to exponentially increase withp...

(95)

The curse of dimensionality in patches space

We consider patches of sizep“10ˆ10ÑHigh dimension.

Ñthe estimation of sample covariance matrices is difficult: ill conditioned, singular...

In the literature, this issue is generally worked around by

the use of small patches (3ˆ3or5ˆ5)NL-Bayes [Lebrun, Buades, Morel]

adding εIto singular covariance matricesPLE [Yu, Sapiro, Mallat]

fixing a lower dimension for covariance matrices S-PLE [Wang, Morel]

But, there isno reason to be afraidof this curse!

(96)

The curse of dimensionality in patches space

We consider patches of sizep“10ˆ10ÑHigh dimension.

Ñthe estimation of sample covariance matrices is difficult: ill conditioned, singular...

In the literature, this issue is generally worked around by

the use of small patches (3ˆ3or5ˆ5)NL-Bayes [Lebrun, Buades, Morel]

adding εIto singular covariance matricesPLE [Yu, Sapiro, Mallat]

fixing a lower dimension for covariance matrices S-PLE [Wang, Morel]

But, there isno reason to be afraidof this curse!

(97)

The bless of dimensionality?

In high-dimensional spaces, it is easier to separate data:

Many patchesrepresent structures that live locally in a low dimensional space: using this latent lower dimension allows to group the patches in a more robust way.

This “bless” is used in clustering algorithms designed for high-dimension High-Dimensional Data Clustering [Bouveyron, Girard, Schmid] 2007

(98)

The bless of dimensionality?

An illustration in the context of patches:

an image made of vertical stripes of width>2 pixelswith random grey levels.

(99)

The bless of dimensionality?

view 1 view 2

(100)

The bless of dimensionality?

view 1 of the first 3 pixels view 2 of the first 3 pixels The algorithm is now able to separate these classes!

(101)

Esquisse d’un algorithme

1. Extraire les patchs de l’image avec les opérateurs P

i

2. Définir et apprendre un modèle a priori avec une réduction de dimension X sur les patchs sans bruit

3. Débruiter chaque patch avec un des estimateurs précédent

4. Reconstruire l’image à partir de ses patchs débruités

(102)

High-Dimensional Mixture Models for

Image Denoising

(103)

HDMI: presentation of the model

model the clean patches X

` Z latent random variable indicating group membership

` X lives in alow-dimensionalsubspace which isspecific to its latent group:

X|Z“k „Npµk, UkΛkU_k^Tq

where Uk is a pˆdk orthogonal matrix andΛk “diagpλ^k₁,. . ., λ^k_d

kqa diagonal matrix ofsize dkˆdk.

(104)

HDMI: induced model

Induced model on the noisy patches Y

The model onX implies thatY follows a full rank GMM

ppyq “

K

ÿ

k“1

π_kgpy;µ_k,Σ_kq

whereU_kΣ_kU_k^t has the specific structure:

¨

˚

˝

ak1 0

. ..

0 akd

0

σ² 0

. ..

0 σ²

˛

‹

‚ , . -

d_k

, // . // -

pp´dkq

wherea “λ^k`σ² anda ąσ², forj“1, . . . , d .

(105)

Denoising with the HDMI model

The HDMI model being known, each patch is denoised with theMMSE

xpi“ErX|Y “yis “

K

ÿ

k“1

tikψkpyiq

wheretik is the posterior probability for the patchyi to belong in thek-th group and

ψkpyiq “µk`Uk

¨

˚

˝

ak1´σ²

a_k1 0

. ..

0 ^a^kdk^´σ

2

a_kdk

˛

‹

‚

U_k^Tpyi´µkq.

(106)

Model inference

with anEM algorithm, the parameters are updated during theM-step:

Up_k is formed by thed_k first eigenvectors of the sample covariance matrix

pakj is thej-th eigenvalue of the sample covariance matrix

The hyper-parametersK andd1, . . . , dK cannot be determined by maximizing the log-likelihoodsince they control the model complexity. ÑEach set ofK andd1, . . . , dK corresponds to a different model.

(107)

Model inference

with anEM algorithm, the parameters are updated during theM-step:

Up_k is formed by thed_k first eigenvectors of the sample covariance matrix

pakj is thej-th eigenvalue of the sample covariance matrix

The hyper-parametersKandd1, . . . , dK cannot be determined by maximizing the log-likelihoodsince they control the model complexity.

ÑEach set ofK andd1, . . . , dK corresponds to a different model.

(108)

Model inference

We propose tosetK at a given valueand tochoose the intrinsic dimensions d_k:

using anheuristicthat linksdk with the noise varianceσ² when known;

using a model selection toolin order to select the best varianceσ² when unknown.

(109)

Estimation of intrinsic dimensions – known variance

Withd_k begin fixed, theMLEfor the noise variance in thekth group is

pσ²_|k“ 1 p´dk

p

ÿ

j“dk`1

pa_kj.

When the noise varianceσis known, this gives us the following heuristic:

Heuristic. Given a value ofσ²and for k“1, ..., K, we estimate the dimensiondk by

xdk“argmin_d ˇ ˇ ˇ ˇ ˇ

1 p´d

p

ÿ

j“d`1

pakj´σ² ˇ ˇ ˇ ˇ ˇ .

(110)

Estimation of intrinsic dimensions – convergence

By re-evaluating the dimensions, we change the model at each M-step!

Question: is the convergence ensured?

dimensions

groups

the dimensions stabilizeÑthere exists an iteration where the algorithm becomes a classic EM.

(111)

Estimation of intrinsic dimensions – convergence

By re-evaluating the dimensions, we change the model at each M-step!

Question: is the convergence ensured?

dimensions

groups

the dimensions stabilizeÑthere exists an iteration where the algorithm

(112)

Estimation of intrinsic dimensions – unknown variance

Each value ofσyields a different model, we propose to select the one with the better BIC (Bayesian Information Criterion)

BICpMq “`pθq ´ˆ ξpMq 2 logpnq, whereξpMqis the complexity of the model.

Why BIC is well-adapted for the selection ofσ?

Ifσis too small, the likelihood is good but the complexity explodes;

ifσis too high, the complexity is low but the likelihood is bad.

(113)

Estimation of intrinsic dimensions – unknown variance

∆k“

¨

˚

˝

ak1 0

. ..

0 a_kd

0

σ² 0

. ..

0 σ²

˛

‹

‚ , . -

dk

, // . // -

pp´dkq

Why BIC is well-adapted for the selection ofσ?

Ifσis too small, the likelihood is good but the complexity explodes;

ifσis too high, the complexity is low but the likelihood is bad.

(114)

Summary: the HDMI algorithm

We presented the HDMI model for image denoising:

which models the full process of the generation of the noisy patches;

a fully statistical modeling without the usual “denoising cuisine”;

can be used in a “blind” way thanks to BIC selection;

attains state-of-the-art performances!

(115)

Numerical Experiments

Clean image

(116)

Numerical Experiments

Noisy imageσ“50

(117)

Numerical Experiments

Denoised with BM3D,Foi et al. 2007, psnr = 27.17dB

(118)

Numerical Experiments

Denoised with FFDNet,Zhang et al. 2018, psnr = 27.58dB

(119)

Numerical Experiments

Denoised with HDMIK“50, psnr = 27.28dB

(120)

Numerical Experiments – zooms

Clean image

(121)

Numerical Experiments – zooms

Noisy imageσ“50

(122)

Numerical Experiments – zooms

Denoised with BM3D,Foi et al. 2007, psnr = 27.17dB

(123)

Numerical Experiments – zooms

(124)

Numerical Experiments – zooms

(125)

Numerical Experiments

Clean image

(126)

Numerical Experiments

Noisy imageσ“50

(127)

Numerical Experiments

Denoised with BM3D,Foi et al. 2007, psnr = 26.55.dB

(128)

Numerical Experiments

(129)

Numerical Experiments

(130)

Numerical Experiments – zooms

Clean image

(131)

Numerical Experiments – zooms

Noisy imageσ“50

(132)

Numerical Experiments – zooms

Denoised with BM3D,Foi et al. 2007, psnr = 26.55.dB

(133)

Numerical Experiments – zooms

(134)

Numerical Experiments – zooms

(135)

Limitations of denoising in the

patch-space

(136)

The lower bound for patch-based image denoising

“Is denoising dead” [Chatterjee, Milanfar] 2010proposed a lower bound for patch-based image denoising.

In this context, denotingm_k the number of patches in thek-th group and N the total number of patches,the bound for HDMI is

E“

}u´up_HDMI}²‰ ě 1

N

K

ÿ

k“1

m_kTrpΣ_kqσ² p`σ² ,

ěC σ² Npp`σ²q

K

ÿ

k“1

mk

“C σ²

p`σ² independent of N.

even if the number of samples increases by stretching the image size to infinity, the noise variance cannot be reduced more than a factorp.

(137)

The lower bound for patch-based image denoising

HDMI (patches3ˆ10)- PSNR = 30.12

L2 grouping (patches3ˆ10)- PSNR = 25.03

(138)

The lower bound for patch-based image denoising

HDMI (patches3ˆ10)- PSNR = 30.27

L2 grouping (patches3ˆ10)- PSNR = 30.84

cropped: actual images height is 500 pixels.

(139)

The low frequency noise

Denoised with HDMIK“50, psnr = 36.47 dB

(140)

Removing low frequency noise by denoising the DC component (preprint)

Define the centered observed random variable Y_i^c“Yi´Ysi1p, where

Ys_i“ 1 p

p

ÿ

j“1

Y_ipjq,

is the DC component of the patch.

The noise model can then be divided into the two following problems Ys_i“Xs_i`Ns_iPR, (1) Y_i^c“X_i^c`N_i^cPR^p. (2)

(141)

Modélisation du bruit centré

N_i^c “N_i´NĎ_i1_p est unvecteur gaussienetN_i^c„Np0,Σ_N^c

iqavec

ΣN_i^c“ σ² p

¨

˚

˝

p´1 ´1 ¨ ¨ ¨ ´1

´1 p´1 ...

... . .. ´1

´1 ¨ ¨ ¨ ´1 p´1

˛

‹

‚ .

Il existe une matrice orthogonaleQtelle que ΣN_i^c “Q

ˆ σ²Ip´1 0

0 0

˙ Q^T.

Q^TN_i^c„Np0,diagpσ²Ip´1,0qqbruit blanc gaussienréduit d’une dimension !

(142)

Modélisation de la composante moyenne

The DC component can be reshaped as an image

Ñ

Extract patches from this image yields additive Gaussian noise problem withcolored noise

A change of basis brings us back to an additive white Gaussian noise

(143)

Débruitage des deux problème séparément

On se retrouve avec deuxproblèmes de bruit blanc gaussien

Ñutilisation de son algorithme de débruitage des patchs préféréHDMI Pour chaque patch i

XxĎi estimateur de sa moyenne

Xx_i^c estimateur du patch centré

Xpi “Xx_i^c`XxĎi1p.

(144)

Results

Noisy withσ“50

(145)

Results

Denoised with HDMIK“50, psnr = 36.47 dB

(146)

Results

+corrected DC component (HDMIK“30), psnr = 36.90 dB

(147)

Un petit résumé de ce qu’on a abordé

différentes méthodes de débruitage issus de la modélisation du bruit

une étude des modèles gaussiens pour le débruitage par patchs

une modélisation statistique de l’espace des patchs

un combat contre les mystères de la grande dimension

les limitations de ce type d’approche

(148)

Merci de votre attention !

des questions ?

Retrouvez l’article HDMI et le preprint sur ma page

(149)

Aggregation problem

Each pixel belongs inppatches:

In all the experiments here: uniform aggregation.

In the literature: there exist different aggregation methods

Ñable to improve visual resultsbutin many cases, the final pixel is still obtained from a fixed number of realizations.

(150)

Regularizing effect of the dimension reduction

(153)

The HDMI algorithm

Inputunoisy image,ppatch size,Knumber of groups,tσ1, . . . , σmulist of standard deviation.

Outputuˆdenoised image.

Extractty1, . . . , ynupatches fromu;

for σ“σ1, . . . , σm do

Initializationfew iteration of k-means.

dlÐ 8.

whiledlądo

M-stepupdate parameters and dimensionsdk

E-stepcomputetik.

update the log-likelihoodland compute the relative errordl“ |l´lex|{|l|.

lexÐl.

end while

compute the BIC for the model associated withσ end for

select the model with the better BIC.

compute denoised patchestx1, . . . , xnuwith conditional expectation;

aggregate patchesxiin order to recover the denoised imagev.

(154)

Learning on a sub-sample of the patches

Figure:Effect of the subsampling on the computing time and the denoising performance with HDMI.Left: PSNR versus sampling size. Right: Computation time versus same sampling size. Dotted-lines: 20% subsampling.

(155)

Influence of the number of group K

Figure:Denoising results (PSNR) with regard toK(left) and choice ofK with BIC (right).

(156)

Selection of σ

²