Discussion and comments. Approche graphique en analyse des données

JOURNAL DE LA SOCIÉTÉ FRANÇAISE DE STATISTIQUE

WILLIAM S. CLEVELAND

Discussion and comments. Approche graphique en analyse des données

Journal de la société française de statistique, vol. 141, no. 4 (2000), pp. 43-44

<http://www.numdam.org/item?id=JSFS_2000__141_4_43_0>

© Société française de statistique, 2000, all rights reserved.

Access to the archives of the journal "Journal de la société française de statistique" (http://publications-sfds.math.cnrs.fr/index.php/J-SFdS) implies agreement with the general terms of use (http://www.numdam.org/conditions). Any commercial use or systematic printing constitutes a criminal offence. Any copy or printout of this file must include this copyright notice.

Article digitized as part of the program Numérisation de documents anciens mathématiques

http://www.numdam.org/


DISCUSSION AND COMMENTS

Approche graphique en analyse des données

William S. CLEVELAND¹

Jean-Paul Valois has provided an exceptionally informed overview of fundamental currents of thinking about graphical data display. It goes well beyond a mere encyclopedia of ideas, and synthesizes new viewpoints in an interesting way. The following section titles correspond to those of his article.

SOME HISTORICAL MILESTONES

John Tukey had a profound effect on data display, partly by inventing compelling methods, such as boxplots, which have become standards, and partly by isolating and illustrating the role graphs play in the analysis of data. His influence enlisted many in the study of data display in the 1970s and 1980s, particularly statisticians, reversing the general lack of interest in developing new graphical methods.

Tukey had help, though: the computer. Electronic computation took a task that required a graphic designer for its ultimate execution as a communication device, and put it in the hands of the statistician. This replaced manuals reciting standards for graphic design, prepared by those who made pictures but did not analyze data, with writings about new ideas for display, prepared by those who faced the task of analyzing data. This greatly increased the invention of new methods.

THE ROLE OF GRAPHS AND CHARTS

When we study a graph, and draw conclusions about the data, we are building a model for the data. The model might not be described by explicit mathematical structures and explicit probability distributions. Instead, we often derive a fuzzy mathematical model, such as "Y appears to be an increasing convex function of X plus variation that does not depend on X." Often the model is incomplete, not describing all of the variation in the data. Graphs are powerful tools for such model building, either explicit mathematical models or fuzzy models, because graphs provide us with an ability to compare an extremely wide range of alternative models, far wider than we could consider by formulating an explicit large super-model, and using formal hypothesis tests to pick a sub-model.

1. Bell Labs, Murray Hill; e-mail: [email protected]
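As a rough illustration of the fuzzy model quoted in this section ("Y appears to be an increasing convex function of X"), the sketch below mimics numerically what the eye does on a scatterplot: it summarizes the points by bin medians and checks that the slopes between successive bins are positive and growing. This is only a minimal sketch under stated assumptions, not a method from the article. The function names, the number of bins, and the simulated data are all hypothetical, and the part of the fuzzy model about variation not depending on X is not checked.

```python
import random
from statistics import median


def binned_medians(xs, ys, n_bins=5):
    """Split the points into equal-count bins by x; return (median x, median y) per bin."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    summary = []
    for b in range(n_bins):
        # The last bin absorbs any leftover points.
        chunk = pairs[b * size:] if b == n_bins - 1 else pairs[b * size:(b + 1) * size]
        summary.append((median(p[0] for p in chunk), median(p[1] for p in chunk)))
    return summary


def slopes(summary):
    """Slopes of the segments joining successive bin summaries."""
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(summary, summary[1:])]


def looks_increasing_convex(xs, ys, n_bins=5):
    """Crude check: all segment slopes are positive (increasing) and growing (convex)."""
    s = slopes(binned_medians(xs, ys, n_bins))
    increasing = all(v > 0 for v in s)
    convex = all(b > a for a, b in zip(s, s[1:]))
    return increasing and convex


# Hypothetical data: a quadratic trend plus noise whose spread does not depend on x.
rng = random.Random(0)
demo_x = [rng.uniform(0, 10) for _ in range(500)]
demo_y = [x ** 2 + rng.gauss(0, 2) for x in demo_x]
print(looks_increasing_convex(demo_x, demo_y))
```

A check like this tests one pre-specified shape, whereas a graph lets the viewer entertain many alternative shapes at once, which is the point the paragraph makes.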

TYPOGRAPHY

A wholly new aspect of data, the ubiquity of very large databases, together with vast increases in the capabilities of computer graphics, have fundamentally changed both the needs and the possibilities for data display. An effective way to display a very large database is to expand the virtual screen real estate by allowing the graph to cover multiple pages. Today, we can routinely construct displays with hundreds of pages and study them with a document viewer. And each page can consist of tens of panels so that altogether we can have thousands of sub-displays. The thought of so many panels and pages might seem intimidating at first, but with the structure of trellis (lattice) display, which organizes the components of the display, and a well-designed viewer, much can be learned. Multi-page display is typically far more effective for very large databases than trying to force an immense amount of information into a single page.
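The bookkeeping behind a multi-page, multi-panel display can be sketched in a few lines. The example below assigns each panel of a hypothetical display to a page, row, and column. The function names, the grid size, and the top-left, row-by-row fill order are assumptions made for illustration, and actual trellis software may order panels differently.

```python
from math import ceil


def paginate_panels(labels, rows, cols):
    """Assign each panel label a (page, row, col) slot, filling each page row by row."""
    per_page = rows * cols
    layout = []
    for i, label in enumerate(labels):
        page, slot = divmod(i, per_page)   # which page, and which slot on that page
        row, col = divmod(slot, cols)      # position of the slot inside the page grid
        layout.append((label, page, row, col))
    return layout


def page_count(n_panels, rows, cols):
    """Number of pages needed to hold n_panels on a rows x cols grid."""
    return ceil(n_panels / (rows * cols))


# Hypothetical example: 2500 sub-displays, 5 x 4 panels per page.
labels = [f"group-{k}" for k in range(2500)]
layout = paginate_panels(labels, rows=5, cols=4)
print(page_count(len(labels), 5, 4), layout[-1])
```

With tens of panels per page, a few thousand sub-displays fit in a document of about a hundred pages, the scale the paragraph describes.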

THE IMPACT OF A GRAPH

Why do certain informative patterns emerge in some displays and not others? More study of graphical perception would surely lead to more insight and better methods. But in carrying out such study, we would likely do better to study the elementary particles first, and then later, the large molecules. A whole graph is an immensely large molecule because the mental processing we employ to reach conclusions about the data is so intricate and poorly understood; it involves an elaborate set of perceptual and cognitive tasks that interact in a complex way. In contrast, the judgment of the orientations of two or more line segments, which we carry out to visually decode slope, is more akin to an elementary particle.

An acute intuition about issues of graphical perception is a prerequisite to effective study. And acute intuition is most likely to arise from extensive empirical experimentation with many sets of data, trying alternative methods of display. The intuition as a theory, even a vague undeveloped one, is vital as a guide to determine important aspects to study, to hypothesize, and to design informative experiments about the hypotheses. As in many fields, good experiments arise from good theory.
