• Aucun résultat trouvé

3.5 Topology Representation

3.6.2 Swiss Roll Data Set

The Swiss roll data set (Fig.3.16, Appendix A.6.5) is a typical example of the non-linearly embedded manifolds. In this example the number of the representatives in all cases was chosen to be n = 200. Linear mapping algorithms, such Principal Component Analysis do not come to the proper result (see Fig.3.17a), because of the 2-dimensional nonlinear embedding. As can be seen in Fig. 3.17b and c CCA and OVI-NG methods are also unable to uncover the real structure of the data, as they utilise Euclidean distances to calculate the pairwise object dissimilarities.

Figure3.18shows the Isotop, CDA and GNLP-NG visualisations of the Swiss roll data set. As CDA and Isotop methods can be based on different vector quantisation methods, both methods were calculated based on the results of k-means and neural gas vector quantisation methods as well. Isotop and CDA require to build a graph to calculate geodesic distances. In these cases graphs were built based on the k-neighbouring approach, and parameter k was set to be k = 3. It can be seen that

0 5 10 15 20 25 0 10 20 30 0

5 10 15 20 25 30

Fig. 3.16 Swiss roll data set

−0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8

−0.8

−0.6

−0.4

−0.2 0 0.2 0.4 0.6 0.8

−1 −0.5 0 0.5 1 1.5 2

−1

−0.5 0 0.5 1 1.5

−1 −0.5 0 0.5 1 1.5

−0.5 0 0.5 1 1.5 2

(a) (b)

(c)

Fig. 3.17 PCA, CCA and OVI-NG projections of Swiss roll data set. a 2-dimensional PCA pro-jection. b 2-dimensional CCA propro-jection. c 2-dimensional OVI-NG projection

−4 −3 −2 −1 0 1 2 3 4

Fig. 3.18 Isotop, CDA and GNLP-NG projections of Swiss roll data set. a 2-dimensional Isotop projection with k-means VQ. b 2-dimensional Isotop projection with NG VQ. c 2-dimensional CDA projection with k-means VQ. d 2-dimensional CDA projection with NG VQ. e 2-dimensional GNLP-NG projection

0 0.2 0.4 0.6 0.8 1 0

Fig. 3.19 Topology Representing Network and metric MDS based TRNMap visualisation of the Swiss roll data set. a TRN. b DP_TRNMap

Isotop, CDA and GNLP-NG algorithms can uncover the structure of data in essence, but the Isotop method shows the manifold with some distortions.

In the following let us have a closer look at the results of Topology Representing Network Map algorithm. As TRNMap is based on the creation of the Topology Rep-resenting Network, Fig.3.19a shows the TRN of the Swiss roll data set. In Fig.3.19b the 2-dimensional metric MDS based TRNMap visualisation of the Swiss roll data set is shown (DP_TRNMap, DP from distance preservation). As the metric MDS and the non-metric MDS based mappings of the resulted TRN in this case give very similar results in the mapped prototypes, the resulted TRNMap visualisations are not distinguishable by human eyes. Thereby Fig.3.19b can be seen as the result of the non-metric MDS based TRNMap algorithm as well. In Fig.3.19b it can be seen that the TRNMap methods are able to uncover the embedded 2-dimensional manifold without any distortion.

Visualisation of the Topology Representing Network Map also includes the con-struction of the component planes. The component planes arising from the metric MDS based TRNMap are shown in Fig.3.20. The largest value of the attributes of the representatives corresponds to the black and the smallest value to the white dot surrounded by a grey circle. Figure 3.20a shows that alongside the manifold the value of the first attribute (first dimension) initially grows to the highest value, then it decreases to the smallest value, after that it grows, and finally it decreases a little.

The second attribute is invariable alongside the manifold, but across the manifold it changes uniformly. The third component starts from the highest value, then it falls to the smallest value, following this it increases to a middle value, and finally it decreases a little.

Table3.2shows the error values of distance preservation of different mappings.

The notation DP_TRNMap yields the metric MDS based TRNMap algorithm, and the notation NP_TRNMap yields the non-metric MDS based TRNMap algorithm (DP comes from distance preservation and NP from neighbourhood preservation).

Table3.2shows that GNLP-NG and TRNMap methods outperform the Isotop and

−2.5 −2 −1.5 −1 −0.5 0 0.5 1 1.5 2

Fig. 3.20 Component planes of the metric MSD based Topology Representing Network Map of the Swiss roll data set. a Dimension 1. b Dimension 2. c Dimension 3

Table 3.2 Values of Sammon stress, metric MDS stress and residual variance of Isotop, CDA, GNLP-NG and TRNMap algorithms on the Swiss roll data set

Algorithm Sammon stress Metric MDS stress Residual variance

kmeans_Isotop 0.54040 0.57870 0.41947

CDA methods. Although, GNLP-NG and TRNMap methods show similar per-formances in distance preservation, the TRNMap methods show somewhat better performances.

Figure3.21shows the neighbourhood preservation mapping qualities of the meth-ods to be analysed. It can be seen that different variations of Isotop and CDA show lower performance in neighbourhood preservation than GNLP-NG and TRN-Map methods. The continuity and the trustworthiness of GNLP-NG and TRNTRN-Map mappings do not show a substantive difference, the qualitative indicators move

0 5 10 15 20 25 30

Fig. 3.21 Trustworthiness and continuity as a function of the number of neighbours k, for the Swiss roll data set

Fig. 3.22 Trustworthiness and continuity of GNLP-NG and TRNMap methods as a function of the number of neighbours k, for the Swiss roll data set

within 0.2 %. Nevertheless, the GNLP-NG method shows better performance in the local area than the TRNMap mappings. For better visibility Fig.3.22focuses on the GNLP-NG, metric MDS based TRNMap and non-metric MDS based TRNMap mappings. In this example it has been shown both GNLP-NG and TRNMap methods are able to uncover non-linearly embedded manifolds, the TRNMap methods show good performance both in topology and distance preservation, and the component planes provide useful facilities to unfold the relations among the features.