• Aucun résultat trouvé

Fréchet and robust statistics

N/A
N/A
Protected

Academic year: 2022

Partager "Fréchet and robust statistics"

Copied!
4
0
0

Texte intégral

(1)

J OURNAL DE LA SOCIÉTÉ FRANÇAISE DE STATISTIQUE

E LVEZIO R ONCHETTI

Fréchet and robust statistics

Journal de la société française de statistique, tome 147, no2 (2006), p. 73-75

<http://www.numdam.org/item?id=JSFS_2006__147_2_73_0>

© Société française de statistique, 2006, tous droits réservés.

L’accès aux archives de la revue « Journal de la société française de statis- tique » (http://publications-sfds.math.cnrs.fr/index.php/J-SFdS) implique l’accord avec les conditions générales d’utilisation (http://www.numdam.

org/conditions). Toute utilisation commerciale ou impression systéma- tique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit contenir la présente mention de copyright.

Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques

http://www.numdam.org/

(2)

FRECHET AND ROBUST STATISTICS

ElvezioRONCHETTI*

1. I n t r o d u c t i o n

It is a pleasure to discuss Préchet's paper and I would like to thank the Editor for giving me this opportunity. The paper originally published in 1940, contains several interesting points which are still valuable today. In my discussion I focus in particular on the aspects related to robust statistics.

More specifically, in Section 2 I comment on Préchet's view of data analysis and on his quest for alternatives to the mean and the standard déviation.

Section 3 will be devoted to discuss Préchet differentiability in statistics, a topic which is not covered in the présent paper but which plays an important rôle in the development of the theory of robust statistics and other fields.

2. D a t a Analysis a n d R o b u s t Statistics

In Section A Fréchet makes some comments on the status of mathematical statistics. The points he raises are interesting since they corne from a math- ematician who had been working in the past décade mostly in the area of analysis. Fréchet complains about some results where the assumptions are not clearly stated, about the abuse of indicators like the corrélation coeffi- cient (which incidentally reminds J.W. Tukey statement "Does anyone know when a corrélation coefficient is useful, as opposed to when it is used? " ) , and about the almost exclusive place of the mean and the standard déviation in statistical analysis.

In particular he puts into perspective the optimality of thèse two estimators by reminding the reader that this resuit is obtained under an important condition, namely the normality assumption on the distribution of the observations.

Even today this point is sometimes not well understood when the Gauss- Markov Theorem is invoked to justify the optimality of the least squares estimator (LS). It is true that LS is optimal even without the normality assumption, but in this case only among Hnear unbiased estimators of the parameter. Therefore, either LS is optimal in a gênerai class of estimators but under a very restrictive assumption (normality), or it is optimal under unrestricted distributional assumptions but only in a very restricted class of estimators (Hnear). Clearly, in both cases the situation is not satisfactory.

From an historical point of view, it is interesting to notice that Gauss in 1809 characterized the normal distribution by looking at the optimality of the mean

* Department of Econometrics, University of Geneva, Switzerland.

[email protected]

Journal de la Société Française de Statistique, tome 147, n 2, 2006

(3)

FRECHET AND ROBUST STATISTICS

from the opposite direction, i.e. assuming that the mean is the best (location) estimator, then the underlying distribution of the observations is normal.

More comments on the advantages and disadvantages of différent types of estimators are provided in the Discussion, when Fréchet answers the ques- tions raised by the audience. Incidentally the comparison between différent measures of variability reminds the dispute between Eddington and Fisher around 1920 about the relative merits of the mean absolute déviation and the standard déviation; cf. Huber (1981), Chapter 1.1.

In spite of thèse criticisms (some of them shared by Gini), Fréchet is optimistic about the future of statistics and ends Section A by saying that thèse criticisms will open up new directions in the development of statistics. Indeed he was right : thèse ideas contributed among others to the development of robust statistics and modem data analysis; cf. the séminal paper by Tukey (1962) on the future of data analysis and the fundamental papers by Tukey (1960), Huber (1964), and Hampel (1968) on robust statistics.

Section B is devoted to the quantitative investigation of the variability of the médian as measured by its variance but also by other measures of dispersion.

Fréchet dérives some bounds for thèse quantities and check the quality of thèse bounds by a simulation study( !). This is a modem approach to methodological research in statistics : the theoretical properties of some statistical procédure are first studied by means of asymptotic approximations and then the results are checked by simulation.

3. Fréchet Differentiability

In this section I briefly mention another (indirect) contribution of Fréchet to robust statistics. Hampel (1968,1974) recast the theory of robust statistics as the study of the stability properties of statistical functionals (such as an estimator, its variance, the level of a test etc.) and this opened the way to the use of différent concepts of derivatives in functional analysis ; cf. the books Huber (1981) and Hampel, Ronchetti, Rousseeuw, Stahel (1986). Three types of derivative are important for robust statistics (from the weakest to the strongest), the Gâteaux derivative (or influence function), the compact (or Hadamard) derivative, and the Fréchet derivative. A mathematical treatment of thèse concepts from a statistical perspective is provided by Fernholz (1983).

The Gâteaux derivative (or influence function) is a key tool in robust statistics.

It can be computed for most of the existing estimators and test statistics and can be used to construct new robust statistical procédures. Fréchet differentiability is a stronger concept which implies the existence of the influence function and additionally guarantees the asymptotic normality of the corresponding statistic in a shrinking neighborhood around the model ; cf.

Clarke (1986) and Bednarski (1993). This property is important to insure the stability (robustness) of the corresponding functional in the présence of small déviations from the assumed model. Clarke (1986) gives gênerai conditions which imply Fréchet differentiability for a large class of M-estimators. A key condition is the boundedness of the score function defining the M-estimator.

74

(4)

FRECHET AND ROBUST STATISTICS

Finally, notice t h a t t h e stability properties of a functional implied by Fréchet differentiability play an i m p o r t a n t rôle in other fields in statistics, including for instance t h e b o o t s t r a p .

Références

BEDNARSKI T. (1993), "Fréchet Differentiability of Statistical Functionals and Implications to Robust Statistics," in New Directions in Statistical Data Analysis and Robustness, S. Morgenthaler, E. Ronchetti, and W.A. Stahel eds., Basel : Birkhàuser, 26-34.

CLARKE B. R. (1986), "Nonsmooth Analysis and Fréchet Differentiability of M Functionals, " Probability Theory and Related Fields, 7 3 , 197-209.

FERNHOLZ L. T. (1983), von Mises Calculus for Statistical Functionals, Lectures Notes in Statistics 19, New York : Springer-Verlag.

HAMPEL F. R. (1968), "Contribution to the Theory of Robust Estimation,"Ph.D thesis, University of California, Berkeley.

HAMPEL F.R. (1974), "The Influence Curve and Its Rôle in Robust Estima- tion," Journal of the American Statistical Association, 69, 383-393.

H A M P E L F. R., R O N C H E T T I E. M., R O U S S E E U W P. J., and S T A H E L W. A. (1986),

Robust Statistics : The Approach Based on Influence Functions, New York : Wiley.

HUBER P.J. (1964), "Robust Estimation of a Location Parameter, "A nnals of Math- ematical Statistics, 35, 73-101.

HUBER P. J. (1981), Robust Statistics , New York : Wiley.

T U K E Y J.W. (1960), "A Survey of Sampling From Contaminated Distributions, "

In Contributions to Probability and Statistics, I. Olkin éd., Stanford (CA) : Stanford University Press, 448-485.

TUKEY J.W. (1962), "The Future of Data Analysis, " Annals of Mathematical Statistics, 3 3 , 1-67.

75

Références

Documents relatifs

Note that the number of layers is always smaller than the simple estimate 1/(2πR g ) (in the range 1–8) which reflects the ”roughness” of the ball. The system can be viewed as N

Section 3 will be devoted to discuss Fr´echet differentiability in statistics, a topic which is not covered in the present paper but which plays an important role in the development

Abbreviations: AcuNav, intracardiac ultrasound; AVR, aortic valve repla- cement; CBF, coronary blood flow; DAVR, direct access valve replacement; IVUS, intravascular ultrasound;

The implementation of a study and research path during three consecutive editions of a Statistics course for Business Administration sheds light on different

2014 The Dirichlet series of the multiplicative theory of numbers can be reinterpreted as partition functions of solvable quantum mechanical systems.. We

This note shows that the asymptotic variance of Chen’s [Econometrica, 70, 4 (2002), 1683–1697] two-step estimator of the link function in a linear transformation model depends on

In particular, many take the fact that (roughly) quantum statistics cannot tell apart hypothetical cases of haecceitistic difference involving indiscernible particles as

Diese betrug im Be- reich der Mundhöhle etwa 200–400 μm, während an der Stimmlippe eine Epithel- dicke von lediglich 50–100 μm gefunden wurde.. Im Gegensatz dazu war die Lami-