Bayesian analysis in image processing
Florence Tupin
Télécom ParisTech - LTCI
Image classification
◦ Introduction
◦ Bayesian classification
• Image modeling
• Mono-spectral case
• Multi-spectral case
• Punctual / Contextual
◦ Kmeans classification
• Unsupervised case
• Algorithm
Introduction
◦ Classification objectives
• identification of the different classes in the image
• preliminary step of pattern recognition methods (object detection)
◦ Hypotheses
• grey-level images
• classes = peaks in the histogram
• low grey-level variations in the same class
• punctual classification : each pixel is classified separately
• supervised learning : samples of each class are available
◦ Possible extensions
• multi-channel images
Image modeling
◦ Probabilistic model
S set of sites (pixels = localization (i, j) in the image)
grey-level ys ∈ {0, ..., 255} = E → class xs ∈ {1, ..., K} = Λ
Ys random variable of grey-level, Xs random variable of label
Example : brain image
Mono-spectral case (grey-level image) (1)
◦ Maximum A Posteriori criterion
you know a grey-level ys for pixel s
⇒ take the “best” class xs knowing ys
⇒ find the i which maximizes P(Xs = i|Ys = ys) ∀i ∈ Λ
◦ Can we compute P(Xs = i|Ys = ys)?
Bayes rule
P(Xs = i|Ys = ys) = P(Ys = ys|Xs = i) P(Xs = i) / P(Ys = ys)
xs = argmax_{i∈{1,...,K}} P(Ys = ys|Xs = i) P(Xs = i) (the denominator does not depend on i)
Mono-spectral case (grey-level image) (2)
P(Xs = i|Ys = ys) = P(Ys = ys|Xs = i) P(Xs = i) / P(Ys = ys)
◦ Example of brain image
Λ = {0 = ∅; 1 = skin; 2 = bone; 3 = GrayMatter; 4 = WhiteMatter; 5 = CSF (cerebrospinal fluid)}
• P(Xs = i) = prior probability of occurrence of class i
• P(Ys = ys|Xs = i) = grey-level distribution knowing that the pixel belongs to class i
Mono-spectral case (grey-level image) (3)
How can we learn these probabilities?
◦ Learning of P(Xs = i)
• frequency of occurrence of each class
• no knowledge : uniform distribution (P(Xs = i) = 1/Card(Λ)) ⇒ Maximum Likelihood criterion
◦ Learning of P(Ys = ys|Xs = i) : grey-level histogram of the samples of class i
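As an illustrative sketch (not from the slides), the supervised learning step — class priors as frequencies of occurrence, conditional distributions as normalized grey-level histograms — could look like this in NumPy; `learn_class_models` is a hypothetical helper name:

```python
import numpy as np

def learn_class_models(y, labels, n_classes, n_levels=256):
    """Estimate P(Xs = i) and P(Ys = ys | Xs = i) from a labeled image.

    y      : integer array of grey levels in {0, ..., n_levels - 1}
    labels : same-shape array of class indices in {0, ..., n_classes - 1}
    """
    priors = np.zeros(n_classes)
    cond = np.zeros((n_classes, n_levels))
    for i in range(n_classes):
        samples = y[labels == i]
        priors[i] = samples.size / y.size            # frequency of occurrence of class i
        hist = np.bincount(samples.ravel(), minlength=n_levels)
        cond[i] = hist / max(hist.sum(), 1)          # normalized grey-level histogram
    return priors, cond
```

In practice the per-class histograms would be smoothed (the "histogram filtering" step mentioned below) before being used as likelihoods.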
Example (brain image)
Mono-spectral case (grey-level image) (4)
◦ Supervised learning
• samples selection in an image
• histogram computation
• histogram filtering
◦ Parametric case
If a parametric model of the grey-level distribution exists, estimate its parameters!
Ex :
• Gaussian distribution : mean, standard deviation
Mono-spectral case (grey-level image) (5)
◦ Case of a Gaussian distribution
each class i ∈ Λ is characterized by (µi, σi)
• the conditional probability is:
P(Ys = ys|Xs = i) = (1/(√(2π) σi)) exp(−(ys − µi)²/(2σi²))
• if the classes are equiprobable:
P(Ys = ys|Xs = i) maximum ⇔ (ys − µi)²/(2σi²) + ln(σi) minimum
• if the classes are equiprobable and have the same standard deviation (Gaussian noise):
P(Ys = ys|Xs = i) maximum ⇔ (ys − µi)² minimum
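A minimal sketch of the resulting per-pixel decision rule — pick the class minimizing (ys − µi)²/(2σi²) + ln(σi) — assuming NumPy and hypothetical names:

```python
import numpy as np

def classify_gaussian(y, mu, sigma):
    """Per-pixel ML classification under class-wise Gaussian models.

    Minimizing (ys - mu_i)^2 / (2 sigma_i^2) + ln(sigma_i) is equivalent
    to maximizing the Gaussian likelihood for equiprobable classes.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    flat = y.ravel()[:, None]                                   # (n_pixels, 1)
    cost = (flat - mu) ** 2 / (2 * sigma ** 2) + np.log(sigma)  # (n_pixels, K)
    return np.argmin(cost, axis=1).reshape(y.shape)
```

With equal standard deviations the ln(σi) terms cancel and this reduces to nearest-mean classification.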
Multi-spectral case
◦ Vectorial observations
grey-level vector ys ∈ {0, ..., 255}^d → class xs ∈ {1, ..., K}
Ys random vector of grey-levels, Xs random variable of label
◦ Bayes rule
P(Xs = i|Ys = ys) = P(Ys = ys|Xs = i) P(Xs = i) / P(Ys = ys)
xs = argmax_{i∈{1,...,K}} P(Ys = ys|Xs = i) P(Xs = i)
P(Ys = ys|Xs = i) : multidimensional histogram for class i
Multi-spectral case
◦ Multi-variate Gaussian distribution
each class i ∈ Λ is characterized by (µi,Σi) (mean vector and variance-covariance matrix)
• the conditional probability is:
P(Ys = ys|Xs = i) = (1/((2π)^{d/2} √(det(Σi)))) exp(−(1/2)(ys − µi)^t Σi^{−1} (ys − µi))
• if the classes are equiprobable with the same covariance matrix:
P(Ys = ys|Xs = i) maximum ⇔ (ys − µi)^t Σ^{−1} (ys − µi) minimum
⇒ linear classifier
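The shared-covariance decision rule (minimum Mahalanobis distance) could be sketched as follows; `classify_mahalanobis` is a hypothetical name and the vectorized NumPy form is an assumption:

```python
import numpy as np

def classify_mahalanobis(y, mus, cov):
    """Classify d-dimensional observations by minimum Mahalanobis distance
    (equiprobable classes sharing one covariance matrix)."""
    y = np.atleast_2d(y).astype(float)                  # (n, d)
    inv = np.linalg.inv(cov)
    # squared Mahalanobis distance to each class mean
    d2 = [np.einsum('nd,de,ne->n', y - m, inv, y - m) for m in mus]
    return np.argmin(np.stack(d2), axis=0)
```

Because the quadratic term y^t Σ^{−1} y is the same for all classes, the decision boundaries between classes are hyperplanes, hence the linear classifier.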
Contextual / punctual classification
◦ Global classification
y = {ys}s∈S (observed image), Y = {Ys}s∈S (random field)
x = {xs}s∈S (searched classification), X = {Xs}s∈S (random field)
P(X = x|Y = y) = P(Y = y|X = x) P(X = x) / P(Y = y)
◦ Independence assumption for P(Y = y|X = x)
P(Y = y|X = x) = ∏_{s∈S} P(Ys = ys|Xs = xs)
◦ Independence assumption for P(X = x)
P(X = x) = ∏_{s∈S} P(Xs = xs)
Contextual / punctual classification
◦ Prior knowledge on P(X)
• the independence assumption does not hold in practice: images have strong spatial coherency (image description = smooth areas)
• BUT the coherency is at a local scale ⇒ introduction of contextual knowledge
Markov random fields ⇒ smoothness of the solution, local spatial coherency in the result!
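The slides only state that Markov random fields bring local coherency; as one common way to exploit such a smoothness prior, a single ICM (iterated conditional modes) sweep with a Potts-style neighborhood penalty could be sketched as below — the energy form, the 4-neighborhood, and the parameter beta are illustrative assumptions, not the course's prescribed method:

```python
import numpy as np

def icm_step(labels, y, mu, sigma, beta):
    """One ICM sweep: at each pixel, pick the label minimizing
    (Gaussian data term) + beta * (number of disagreeing 4-neighbors)."""
    h, w = labels.shape
    new = labels.copy()
    for s in range(h):
        for t in range(w):
            best, best_cost = new[s, t], np.inf
            for i in range(len(mu)):
                data = (y[s, t] - mu[i]) ** 2 / (2 * sigma[i] ** 2) + np.log(sigma[i])
                nb = [(s - 1, t), (s + 1, t), (s, t - 1), (s, t + 1)]
                # Potts prior: penalize neighbors carrying a different label
                prior = beta * sum(1 for (a, b) in nb
                                   if 0 <= a < h and 0 <= b < w and new[a, b] != i)
                if data + prior < best_cost:
                    best, best_cost = i, data + prior
            new[s, t] = best
    return new
```

Iterating such sweeps regularizes the punctual classification toward locally coherent label maps.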
Unsupervised case
◦ Bayesian case with Gaussian distributions and the same standard deviation
P(Ys = ys|Xs = i) maximum ⇔ ||ys − µi||² minimum
◦ Unsupervised classification
µi unknown ⇒
• classify the pixels using some initial means µ_i^0
• compute new means µ_i^1 as the empirical mean of the pixels classified in each class
• iterate until no more modification of the µ_i^k
⇒ a kind of Bayesian classification with changing means
Unsupervised case
◦ K-means algorithm
• choose µ_1^0, µ_2^0, ..., µ_K^0
At iteration k:
• ∀s ∈ S, l_s = argmin_{i∈Λ} ||ys − µ_i^k||²
• ∀i ∈ {1, ..., K}, µ_i^{k+1} = (1/card(R_i)) Σ_{s : l_s = i} ys, with R_i = {s, l_s = i}
• if µ_i^k ≠ µ_i^{k+1} for some i, iterate
◦ Drawbacks
• no proof of convergence to the optimal solution
• influence of the initial means
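The iteration above can be sketched as a minimal NumPy version (names and the convergence test are assumptions), for grey-level observations:

```python
import numpy as np

def kmeans_grey(y, mu0, max_iter=100):
    """K-means on grey levels: assign each pixel to the nearest mean,
    then recompute each mean as the empirical average of its pixels."""
    y = np.asarray(y, dtype=float).ravel()
    mu = np.asarray(mu0, dtype=float).copy()
    for _ in range(max_iter):
        labels = np.argmin(np.abs(y[:, None] - mu[None, :]), axis=1)
        # empirical mean of each class (keep the old mean if a class is empty)
        new_mu = np.array([y[labels == i].mean() if np.any(labels == i) else mu[i]
                           for i in range(len(mu))])
        if np.allclose(new_mu, mu):
            break
        mu = new_mu
    return labels, mu
```

The result depends on the initial means µ_i^0, which is exactly the drawback listed above.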
Unsupervised case
◦ K-means algorithm for grey-level images: in 1D, everything can be done with the histogram!
f_n : frequency of grey-level n in the image
Example for 2 classes:
• initial values of the centers: µ_1^0 = 10, µ_2^0 = 120
• initial classification: thresholding of the histogram at t = 65
• computation of the new means:
µ_1^1 = (Σ_{n=0}^{t} n f_n) / (Σ_{n=0}^{t} f_n)
µ_2^1 = (Σ_{n=t+1}^{N} n f_n) / (Σ_{n=t+1}^{N} f_n)
• multi-channel images: in d dimensions, storing the histogram is too costly ⇒ update a classified image instead
• application to high-dimensional spaces: dimensionality reduction with PCA
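The 1D histogram trick can be sketched as below for two classes; the threshold is taken halfway between the current means, which for K = 2 is equivalent to the nearest-mean assignment (function name and convergence test are assumptions):

```python
import numpy as np

def kmeans_from_histogram(f, m1, m2, max_iter=100):
    """Two-class K-means driven entirely by the grey-level histogram f,
    where f[n] is the frequency of level n: no pass over the image is needed."""
    levels = np.arange(len(f), dtype=float)
    for _ in range(max_iter):
        t = (m1 + m2) / 2.0                       # nearest-mean threshold
        low, high = f[levels <= t], f[levels > t]
        n1 = (levels[levels <= t] * low).sum() / max(low.sum(), 1)
        n2 = (levels[levels > t] * high).sum() / max(high.sum(), 1)
        if np.isclose(n1, m1) and np.isclose(n2, m2):
            break
        m1, m2 = n1, n2
    return m1, m2
```

Each update costs O(number of grey levels) instead of O(number of pixels), which is the point of working on the histogram in 1D.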