Classification
Ana Karina Fermin
M2 ISEFAR
http://fermin.perso.math.cnrs.fr/
Methods:
1. k-Nearest Neighbors
2. Generative Modeling (Naive Bayes, LDA, QDA)
3. Logistic Modeling
4. SVM
5. Tree-Based Methods (M. Zetlaoui course, Apprentissage)
Binary Classification
Framework
Input measurement X = (X^(1), X^(2), ..., X^(d)) ∈ 𝒳; output measurement Y ∈ 𝒴.
(X, Y) ∼ P with P unknown.
Training data: Dₙ = {(X₁, Y₁), ..., (Xₙ, Yₙ)} (i.i.d. ∼ P). Often X ∈ ℝᵈ and Y ∈ {−1, 1}.
A classifier is a function in F = {f : 𝒳 → 𝒴 measurable}.
Goal
Construct a good classifier f̂ from the training data.
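The framework above can be sketched numerically. This is a minimal illustration with an assumed distribution P (not from the course): X uniform on [0, 1]², labels drawn with P{Y = +1 | X} depending on the first coordinate, and a simple threshold rule as an example of a classifier f : 𝒳 → {−1, +1}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative distribution P (an assumption for this sketch):
# X uniform on [0,1]^2, and P{Y = +1 | X} = X^(1).
n, d = 200, 2
X = rng.uniform(size=(n, d))                 # inputs X_i in R^d
p_plus = X[:, 0]                             # eta(X) = P{Y = +1 | X}
Y = np.where(rng.uniform(size=n) < p_plus, 1, -1)  # labels in {-1, +1}

# Training data D_n = {(X_1, Y_1), ..., (X_n, Y_n)}
D_n = list(zip(X, Y))

# A classifier is any measurable map X -> {-1, +1}, for instance:
def f(x):
    return 1 if x[0] >= 0.5 else -1

predictions = np.array([f(x) for x in X])
```

The learning problem is then to build such an f from Dₙ alone, without knowing P.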
Loss and Probabilistic Framework
Loss function
Loss function: ℓ(Y, f(X)) measures how well f(X) “predicts” Y. In classification, the 0–1 loss:
ℓ(Y, f(X)) = 1_{Y ≠ f(X)}
Risk of a generic classifier
Risk measured as the average loss for a new couple:
R(f) = E[ℓ(Y, f(X))] = P{Y ≠ f(X)}
Beware: as f̂ depends on Dₙ, R(f̂) is a random variable!
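For a fixed classifier f, the risk R(f) = P{Y ≠ f(X)} can be estimated by the average 0–1 loss on a fresh sample. A minimal sketch, with an illustrative distribution assumed (X ∼ U[0, 1], P{Y = +1 | X} = X):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy distribution: X ~ U[0,1], P{Y = +1 | X = x} = x
n = 1000
X = rng.uniform(size=n)
Y = np.where(rng.uniform(size=n) < X, 1, -1)

def f(x):                                    # a fixed classifier
    return np.where(x >= 0.5, 1, -1)

# Empirical estimate of R(f) = P{Y != f(X)}: mean of the 0-1 loss
risk_hat = np.mean(Y != f(X))
```

Here the true risk is E[min(X, 1 − X)] = 1/4, so `risk_hat` concentrates around 0.25 as n grows.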
Goal
Learn a rule to construct a classifier f̂ ∈ F from the training data Dₙ such that the risk R(f̂) is small on average or with high probability with respect to Dₙ.
Best Solution
The best solution f* (which is independent of Dₙ) is
f* = argmin_{f ∈ F} R(f) = argmin_{f ∈ F} E[ℓ(Y, f(X))] = argmin_{f ∈ F} E_X[ E_{Y|X}[ℓ(Y, f(X))] ]
Bayes Classifier (explicit solution)
In binary classification with the 0–1 loss:
f*(X) = +1 if P{Y = +1 | X} ≥ P{Y = −1 | X}  (⇔ P{Y = +1 | X} ≥ 1/2)
        −1 otherwise
Issue: the explicit solution requires knowing E[Y | X] for all values of X!
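When η(x) = P{Y = +1 | X = x} is known, the Bayes classifier is directly computable and its risk is minimal among all classifiers. A Monte-Carlo sketch under an assumed toy model (X ∼ U[0, 1], η(x) = x), comparing the Bayes rule to a suboptimal threshold:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed toy model where eta(x) = P{Y = +1 | X = x} is known exactly
def eta(x):
    return x

def f_star(x):                 # Bayes classifier: +1 iff eta(x) >= 1/2
    return np.where(eta(x) >= 0.5, 1, -1)

def f_other(x):                # an arbitrary suboptimal competitor
    return np.where(x >= 0.8, 1, -1)

# Monte-Carlo estimates of both risks on a large sample from P
n = 200_000
X = rng.uniform(size=n)
Y = np.where(rng.uniform(size=n) < eta(X), 1, -1)

r_star = np.mean(Y != f_star(X))    # approx. the Bayes risk, 1/4 here
r_other = np.mean(Y != f_other(X))  # approx. 0.34 for this threshold
```

In practice η is unknown, which is exactly why the plug-in and empirical-risk-minimization methods listed at the start of the course are needed.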