
Three Dimensional Plots

From the document Contributions Statistics (pages 55-59)

Exploratory Statistical Techniques

2.8 Three Dimensional Plots

FIGURE 2.23. 3D-plot of a bivariate density estimate of the variables log10(FA) and log10(FP) (Y axis: log10(FP); Z axis: density). As in Figure 2.19 we can see the peaks.

Aim.

Just as a scatterplot is a tool for analyzing the relationship between two variables, a 3D-scatterplot is a tool for analyzing the relationship between three variables.

Since our output device is still a flat screen, we need additional techniques to give the eye the impression that we are looking at something three dimensional.


FIGURE 2.24. 3D-plot of a trivariate kernel density estimate of the variables log10(FA), log10(FP) and FR (bandwidth h = (0.23, 0.32, 0.75)).

As expected, we have a clear relationship between these variables. If they were uncorrelated, the density estimate would look more like unit balls. Instead of different colours we have used their gray-scale representation.

Spinning.

The first attempt was spinning, which means that we rotate the dataset about an axis parallel to one of the screen coordinate axes (see e.g. MacSpin). Many problems have arisen in this context. The first one is the internal representation of a three dimensional dataset such that a rotation will appear to the eye as a rotation and not as a set of blinking pictures. Models have been developed to represent the continuous data internally on an integer grid and to execute the rotation on this grid. Fortunately the numerical and graphical power of computers has improved so much that this is no longer a problem.
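A minimal sketch of spinning, assuming that the viewer looks along the z-axis, that the vertical screen axis is the y-axis, and that an orthographic projection is used. The names `rotate_y`, `project` and the sample data are illustrative and not taken from any particular package.

```python
import math

def rotate_y(points, angle):
    """Rotate 3D points about the y-axis (vertical screen axis) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def project(points):
    """Orthographic projection onto the screen: drop the depth coordinate z."""
    return [(x, y) for x, y, _ in points]

# Hypothetical data: three points on the coordinate axes.
data = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# One frame per 5-degree step; small steps make the motion look continuous
# rather than like a set of blinking pictures.
frames = [project(rotate_y(data, math.radians(step))) for step in range(0, 360, 5)]
```

Each element of `frames` is one screen picture; displaying them in sequence gives the impression of a rotating point cloud.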

Sizing.

Another possibility to get a three dimensional effect is to draw the datapoints which are closer to the observer thicker than those far away. If we rotate this dataset, we need additional computational effort to compute the distance from the observer. We also lose the possibility of giving a datapoint a symbol of arbitrary size.
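A minimal sketch of sizing, assuming a linear mapping from the distance to the observer to the marker size. The function `marker_sizes`, its size limits and the observer position are hypothetical choices for illustration.

```python
import math

def marker_sizes(points, observer, min_size=2.0, max_size=10.0):
    """Map each point's distance from the observer to a marker size.

    The closest point gets max_size, the farthest gets min_size.
    """
    dists = [math.dist(p, observer) for p in points]
    near, far = min(dists), max(dists)
    if far == near:  # all points are equally distant
        return [max_size] * len(points)
    scale = (max_size - min_size) / (far - near)
    return [max_size - (d - near) * scale for d in dists]

# Hypothetical data: three points at increasing depth.
data = [(0.0, 0.0, 1.0), (0.0, 0.0, 3.0), (0.0, 0.0, 5.0)]
sizes = marker_sizes(data, observer=(0.0, 0.0, 0.0))
```

After every rotation step the distances, and hence the sizes, have to be recomputed, which is the additional effort mentioned above.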

Stereoplot.

Another approach is to split a datapoint into two datapoints which have a small distance from each other. If we colour one datapoint red and the other one green and if we use red-green glasses, we will get a three dimensional picture of our dataset. The disadvantage is of course that we have to double the number of plotted points and that we always need red-green glasses.
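A minimal sketch of the splitting step, assuming that the horizontal offset between the red and the green copy grows linearly with the depth z and that z is scaled to [0, 1]. Both the function `stereo_pairs` and the parameter `max_shift` are hypothetical; the text does not specify how the offset is chosen.

```python
def stereo_pairs(points, max_shift=0.05):
    """Split each 3D point into a red and a green screen point.

    Assumption: the horizontal shift between the two copies is
    max_shift * z, with the depth z scaled to the interval [0, 1].
    """
    red, green = [], []
    for x, y, z in points:
        shift = max_shift * z
        red.append((x - shift / 2, y))
        green.append((x + shift / 2, y))
    return red, green

# Hypothetical data: the same screen position at two different depths.
data = [(0.5, 0.5, 0.0), (0.5, 0.5, 1.0)]
red, green = stereo_pairs(data)
```

The two returned lists together contain twice as many points as the dataset, which is the doubling mentioned above.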

Rocking.

A much more interesting technique seems to be the "rocking" of a dataset.

If we look at a three dimensional scatterplot, the picture does not stand still but moves between two positions by rotation. Since more distant datapoints move by greater distances than closer ones, we are able to recognize how far away an observation is compared to the other datapoints.

The advantage is that we only have to compute two different positions at the moment when we stop the rotation. The computational effort is modest and the routines for the rotation are already available.
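A minimal sketch of rocking, assuming a rotation about the vertical screen axis and an orthographic projection. The function names and the amplitude of 5 degrees are illustrative assumptions. In this setup the horizontal movement of a point grows with its depth relative to the rotation axis.

```python
import math

def rotate_y(points, angle):
    """Rotate 3D points about the y-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def rocking_frames(points, amplitude_deg=5.0):
    """Return the two screen pictures between which the display alternates."""
    a = math.radians(amplitude_deg)
    left = [(x, y) for x, y, _ in rotate_y(points, -a)]
    right = [(x, y) for x, y, _ in rotate_y(points, a)]
    return left, right

# Hypothetical data: two points at different depths behind the rotation axis.
data = [(0.0, 0.0, 0.5), (0.0, 0.0, 2.0)]
left, right = rocking_frames(data)

# Horizontal movement of each point between the two pictures.
movement = [abs(r[0] - l[0]) for l, r in zip(left, right)]
```

Only the two pictures `left` and `right` have to be computed; the display then switches back and forth between them.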

Surface.

3D-scatterplots are not only used to show datapoints. They are also used to show different kinds of surfaces (see Figure 2.23 and Figure 2.24).

For the trivariate kernel density estimate in Figure 2.24 an interactive choice is necessary of the levels c_red, c_green and c_blue to plot the contours of

Colour models.

Since we are using colours we have to choose between different colour models.

As each model uses a different basis to compose a colour, each has its own advantages and disadvantages:

• RGB

RGB (red-green-blue) is the most commonly used colour model. Our TV screens use this model. Every colour is composed of a mixture of red, green and blue. A lot of knowledge is available about the eye's sensitivity to RGB-triplets. Every window system provides RGB-triplets for composing a colour. A problem appears if someone has to compose a colour by hand, as some experience is needed.

FIGURE 2.25. Representation of colour models (panels: RGB-Model, CMY-Model, HSV-Model, HLS-Model).

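• YIQ

The YIQ model was designed for transmission efficiency in colour broadcast TV. It can be calculated from the RGB-model by

\[
\begin{pmatrix} Y \\ I \\ Q \end{pmatrix} =
\begin{pmatrix}
0.30 & 0.59 & 0.11 \\
0.60 & -0.28 & -0.32 \\
0.21 & -0.52 & 0.31
\end{pmatrix}
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
\]

• CMY

The CMY (cyan-magenta-yellow) model is widely used for colour printing devices. It can be calculated easily from the RGB-model by

\[
\begin{pmatrix} C \\ M \\ Y \end{pmatrix} =
\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} -
\begin{pmatrix} R \\ G \\ B \end{pmatrix}
\]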


• HSV/HLS

The HSV- (hue-saturation-value) and the HLS-model (hue-lightness-saturation) are designed for a user-friendly composition of colours. The hue distinguishes between different colours like red, yellow, green, cyan, blue and magenta. The saturation describes how little the colour is diluted with white, e.g. pink and red, sky blue and royal blue ... The lightness or value describes the intensity of a colour. The two systems can be represented by a single hexcone (HSV) and a double hexcone (HLS), respectively.

• Munsell system

None of the described models considers the sensitivity of the human eye.

In computer systems an RGB colour is composed of three integers, each ranging from 0 to 255 (256^3 ≈ 16.7 million colours). But can a human eye really distinguish the colour (0, 0, 0) from (1, 1, 1)? So Munsell built up a scale such that we have equally perceived distances in colour space.

This scale is subjective, but it is based upon the evaluation of many observers.
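The linear conversions from RGB to YIQ and CMY described in the list above can be sketched as follows. The sketch assumes that R, G and B are scaled to [0, 1], uses the YIQ coefficients quoted in the text, and uses the usual complement rule for CMY. The function names are illustrative.

```python
# Rows give the weights of (R, G, B) for Y, I and Q respectively.
YIQ_MATRIX = (
    (0.30, 0.59, 0.11),
    (0.60, -0.28, -0.32),
    (0.21, -0.52, 0.31),
)

def rgb_to_yiq(r, g, b):
    """Convert an RGB triplet (components in [0, 1]) to YIQ."""
    return tuple(w_r * r + w_g * g + w_b * b for w_r, w_g, w_b in YIQ_MATRIX)

def rgb_to_cmy(r, g, b):
    """Convert an RGB triplet (components in [0, 1]) to CMY by complementing."""
    return (1.0 - r, 1.0 - g, 1.0 - b)

yiq_white = rgb_to_yiq(1.0, 1.0, 1.0)  # white: full luminance Y, no chrominance
cmy_red = rgb_to_cmy(1.0, 0.0, 0.0)    # red is printed with magenta and yellow
```

Python's standard `colorsys.rgb_to_yiq` performs the same kind of conversion but with slightly different coefficients, so the matrix is written out explicitly here.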

The statistical importance of colour scales appears in contour plots in the three dimensional case (Scott 1992) and in image plots in the two dimensional case.

The three most interesting colour-models, RGB, HSV and HLS can easily be implemented in statistical software. The Munsell system is based on huge tables so that an implementation is only done if necessary.
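As an illustration of how little code such an implementation can need, the following sketch uses Python's standard `colorsys` module, which works with components in [0, 1]. The helper `describe` and the example colours are assumptions chosen to mirror the pink/red example of saturation.

```python
import colorsys

def describe(rgb):
    """Return the HSV and HLS representations of an RGB triplet in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    # Note the component order of colorsys: hue, lightness, saturation.
    hue, lightness, sat = colorsys.rgb_to_hls(*rgb)
    return {"hsv": (h, s, v), "hls": (hue, lightness, sat)}

red = describe((1.0, 0.0, 0.0))
pink = describe((1.0, 0.5, 0.5))  # red diluted with white
```

Both colours share the same hue. In HSV the dilution with white shows up as a lower saturation, while in HLS it shows up as a higher lightness.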

A problem that often arises in (statistical) programs is how to transfer the background colour of the screen (mostly black) to the background colour of the printer (mostly white). A simple exchange of black and white is not possible, because if someone uses a gray scale starting with white and ending with black, e.g. in a contour plot or in an image plot, the exchange would destroy the whole palette. To solve such a problem the HLS system can be used. The RGB-colour is translated to an HLS-colour and the lightness l is set to 1-l. That ensures that a colour with l = 0.5 keeps its RGB-colour. Additionally the light colours, which are a strong contrast to a dark background, become dark colours on the printer, which are also a strong contrast on the paper.
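A minimal sketch of such a screen-to-printer translation with Python's standard `colorsys` module. It inverts the lightness component of the HLS representation, which is an assumption about the intended transformation: lightness is the component that carries the light/dark contrast described above. The function name `to_printer_colour` is hypothetical.

```python
import colorsys

def to_printer_colour(r, g, b):
    """Invert the lightness of an RGB colour (components in [0, 1]) via HLS."""
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    return colorsys.hls_to_rgb(h, 1.0 - l, s)

black_on_paper = to_printer_colour(0.0, 0.0, 0.0)   # screen background
gray_on_paper = to_printer_colour(0.25, 0.25, 0.25)  # one step of a gray scale
red_on_paper = to_printer_colour(1.0, 0.0, 0.0)      # lightness 0.5
```

With this mapping a gray scale is reversed as a whole instead of being destroyed, the black background becomes white, and a colour of lightness 0.5 such as pure red is left unchanged.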
