• Aucun résultat trouvé

Validation croisée

N/A
N/A
Protected

Academic year: 2022

Partager "Validation croisée"

Copied!
6
0
0

Texte intégral

(1)

Classification

Model Selection / Cross Validation

Ana Karina Fermin

M2 ISEFAR

http://fermin.perso.math.cnrs.fr/

(2)

Bias-Variance Dilemna

General setting:

F={measurable fonctionsX → Y}

Best solution: f= argminf∈FR(f) ClassS ⊂ F of functions

Ideal target inS: fS= argminf∈SR(f)

Estimate inS: bfS obtained with some procedure

Approximation error and estimation error (Bias/Variance) R(bfS)− R(f) =R(fS)− R(f)

| {z }

Approximation error

+R(fbS)− R(fS)

| {z }

Estimation error

Approx. error can be large if the modelS is not suitable.

Estimation error can be large if the model is complex.

Agnostic approach

No assumption (so far) on the law of (X,Y).

(3)

Under-fitting / Over-fitting Issue

Different behavior for different model complexity Low complexity modelare easily learned but the approximation error (“bias”) may be large (Under-fit).

High complexity model may contains a good ideal target but the estimation error (“variance”) can be large (Over-fit) Bias-variance trade-off ⇐⇒ avoidoverfitting andunderfitting

(4)

Model Selection

Models

How to design models? (Model/feature design) How to chose among several models? (Model/feature selection)

Key to obtain good performance!

Approximation error and estimation error (Bias/Variance)

R(bfS)− R(f) =R(fS)− R(f)

| {z }

Approximation error

+R(fbS)− R(fS)

| {z }

Estimation error

Approximation error can be large for not suitable model S!

Estimation error can be large if the model is complex!

Need to find the good balance automatically!

(5)

Model Selection

Empirical error biased toward complex models!

Selection criterion

Cross validation: Very efficient (and almost always used in practice!) but slightly biased as it target uses only a fraction of the data.

Penalization approach: use empirical loss criterion but penalize it by a term increasing with the complexity ofS

Rn(fbS)→Rn(fbS) + pen(S)

and choose the model with the smallest penalized risk.

Model mixing also possible...

(6)

Cross Validation

Very simple idea: use a second learning/verification set to compute a verification error.

Sufficient to avoid over-fitting!

Cross Validation

Use VV−1n observations to train and V1n to verify!

Validation for a learning set of size (1−V1n instead ofn!

Most classical variations:

Leave One Out, V-fold cross validation.

Accuracy/Speed tradeoff: V = 5 or V = 10!

Références

Documents relatifs

In this paper we focus on the problem of high-dimensional matrix estimation from noisy observations with unknown variance of the noise.. Our main interest is the high

Synthesis of the performance of LB in the case of the sphere.Convergence rate of EMNA with QR mutations (as in [14, 12]), which pro- vide the state of the art convergence rates for

Our analysis leads to new insights into stochastic approximation algorithms: (a) it gives a tighter bound on the allowed step-size; (b) the generalization error may be divided into

A partir du portrait de phase obtenu dans le sous-espace `a deux dimensions engendr´e par les deux premi`eres composantes principales de la dynamique, l’application angulaire de

Then, the defined oracle, whose relevance was empirically confirmed on realistic channels in a millimeter wave massive MIMO context, was compared to the classical LMMSE

 La durée de ces interdictions, les parties du territoire et les périodes de l'année auxquelles elles s'appliquent. Les modalités d'application de cet article

R´ esum´ e – Nous proposons une m´ethode permettant sous certaines hypoth`eses de diminuer la variance d’un estimateur optimal sans d´egrader son risque, en utilisant

ABSTRACT: Expressing the density of the determinant of a sample covariance matrix in terms of Meijer’s G-function , we provide a confidence interval for the determinant if