Classification
Ana Karina Fermin
M2 ISEFAR
http://fermin.perso.math.cnrs.fr/
Methods:
1. k-Nearest Neighbors
2. Generative Modeling (Naive Bayes, LDA, QDA)
3. Logistic Modeling
4. SVM
5. Tree-Based Methods (M. Zetlaoui course, Apprentissage)
Binary Classification
Framework
Input measurement X = (X^(1), X^(2), ..., X^(d)) ∈ 𝒳; output measurement Y ∈ 𝒴.
(X, Y) ∼ P with P unknown.
Training data: Dₙ = {(X₁, Y₁), ..., (Xₙ, Yₙ)} (i.i.d. ∼ P). Often X ∈ ℝᵈ and Y ∈ {−1, 1}.
A classifier is a function in F = {f : 𝒳 → 𝒴 measurable}.
Goal
Construct a good classifier f̂ from the training data.
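The framework above can be sketched numerically. This is a minimal illustration with an assumed distribution P (not from the course): X uniform on [0, 1]², labels drawn with P{Y = +1 | X} depending on the first coordinate, and a simple threshold rule as an example of a classifier f : 𝒳 → {−1, +1}.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative distribution P (an assumption for this sketch):
# X uniform on [0,1]^2, and P{Y = +1 | X} = X^(1).
n, d = 200, 2
X = rng.uniform(size=(n, d))                 # inputs X_i in R^d
p_plus = X[:, 0]                             # eta(X) = P{Y = +1 | X}
Y = np.where(rng.uniform(size=n) < p_plus, 1, -1)  # labels in {-1, +1}

# Training data D_n = {(X_1, Y_1), ..., (X_n, Y_n)}
D_n = list(zip(X, Y))

# A classifier is any measurable map X -> {-1, +1}, for instance:
def f(x):
    return 1 if x[0] >= 0.5 else -1

predictions = np.array([f(x) for x in X])
```

The learning problem is then to build such an f from Dₙ alone, without knowing P.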
Loss and Probabilistic Framework
Loss function
Loss function: ℓ(Y, f(X)) measures how well f(X) “predicts” Y. In classification, the 0–1 loss:
ℓ(Y, f(X)) = 1_{Y ≠ f(X)}
Risk of a generic classifier
Risk measured as the average loss for a new couple:
R(f) = E[ℓ(Y, f(X))] = P{Y ≠ f(X)}
Beware: as f̂ depends on Dₙ, R(f̂) is a random variable!
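For a fixed classifier f, the risk R(f) = P{Y ≠ f(X)} can be estimated by the average 0–1 loss on a fresh sample. A minimal sketch, with an illustrative distribution assumed (X ∼ U[0, 1], P{Y = +1 | X} = X):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy distribution: X ~ U[0,1], P{Y = +1 | X = x} = x
n = 1000
X = rng.uniform(size=n)
Y = np.where(rng.uniform(size=n) < X, 1, -1)

def f(x):                                    # a fixed classifier
    return np.where(x >= 0.5, 1, -1)

# Empirical estimate of R(f) = P{Y != f(X)}: mean of the 0-1 loss
risk_hat = np.mean(Y != f(X))
```

Here the true risk is E[min(X, 1 − X)] = 1/4, so `risk_hat` concentrates around 0.25 as n grows.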
Goal
Learn a rule to construct a classifier f̂ ∈ F from the training data Dₙ such that the risk R(f̂) is small on average or with high probability with respect to Dₙ.
Best Solution
The best solution f* (which is independent of Dₙ) is
f* = argmin_{f ∈ F} R(f) = argmin_{f ∈ F} E[ℓ(Y, f(X))] = argmin_{f ∈ F} E_X[ E_{Y|X}[ℓ(Y, f(X))] ]
Bayes Classifier (explicit solution)
In binary classification with the 0–1 loss:
f*(X) = +1 if P{Y = +1 | X} ≥ P{Y = −1 | X}  (⇔ P{Y = +1 | X} ≥ 1/2)
        −1 otherwise
Issue: the explicit solution requires knowing E[Y | X] for all values of X!
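When η(x) = P{Y = +1 | X = x} is known, the Bayes classifier is directly computable and its risk is minimal among all classifiers. A Monte-Carlo sketch under an assumed toy model (X ∼ U[0, 1], η(x) = x), comparing the Bayes rule to a suboptimal threshold:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed toy model where eta(x) = P{Y = +1 | X = x} is known exactly
def eta(x):
    return x

def f_star(x):                 # Bayes classifier: +1 iff eta(x) >= 1/2
    return np.where(eta(x) >= 0.5, 1, -1)

def f_other(x):                # an arbitrary suboptimal competitor
    return np.where(x >= 0.8, 1, -1)

# Monte-Carlo estimates of both risks on a large sample from P
n = 200_000
X = rng.uniform(size=n)
Y = np.where(rng.uniform(size=n) < eta(X), 1, -1)

r_star = np.mean(Y != f_star(X))    # approx. the Bayes risk, 1/4 here
r_other = np.mean(Y != f_other(X))  # approx. 0.34 for this threshold
```

In practice η is unknown, which is exactly why the plug-in and empirical-risk-minimization methods listed at the start of the course are needed.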