


3 The Application

3.2 The Analysis and the Result

The analysis was done by pre-specifying the model which has one cluster at layer 0, two clusters at layer 1 (clusters A and B), and four clusters at layer 2, where each cluster at layer 1 consists of two clusters at layer 2 (cluster A at layer 1 consists of clusters A1 and A2 at layer 2, and cluster B at layer 1 consists of clusters B1 and B2 at layer 2). The pre-specified cluster structure is shown in Fig. 2.
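To make the pre-specified structure concrete, it can be encoded as a small tree specification. The following is a minimal sketch in Python; the names A, B, A1, A2, B1, and B2 follow the text, while everything else is illustrative and not from the paper.

```python
# Pre-specified cluster structure of Fig. 2: one cluster at layer 0
# (all objects), two clusters at layer 1 (A and B), and four clusters
# at layer 2, each layer-1 cluster consisting of two layer-2 clusters.
structure = {"A": ["A1", "A2"], "B": ["B1", "B2"]}  # layer 1 -> layer 2

# Parent lookup: assigning an object to a layer-2 cluster also fixes
# its layer-1 cluster.
parent = {c2: c1 for c1, children in structure.items() for c2 in children}
# parent == {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
```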

The data were analyzed without any normalization. The resulting clusters at layers 1 and 2 are shown in Table 1: the third and ninth columns show the cluster at layer 1, and the fourth and tenth columns show the cluster at layer 2.

Table 1 also shows the result of the k-means cluster analysis (MacQueen 1967) in the fifth, sixth, eleventh, and twelfth columns, which will be discussed later.

From Table 2 we can see that there is a perfect correspondence between the four clusters at layer 2 and the types of the whisky malts. At layer 2, cluster A1 corresponds to type E, cluster A2 corresponds to type D, cluster B1 corresponds to type G, and cluster B2 corresponds to type F. At layer 1 the correspondence is almost perfect: cluster A corresponds to types D and E, and cluster B corresponds to types F and G. The hierarchical relationships between the clusters (A and B) at layer 1 and the clusters (A1, A2, B1, and B2) at layer 2 coincide with the relationships among these types suggested in Wishart (2002).

4 Discussion

A model of cluster analysis pre-specifying a cluster structure which has multiple layers, and an associated algorithm to fit the model, were introduced. While the algorithm of the cluster analysis is not hierarchical, the resulting tree diagram of clusters is hierarchical. The model was applied to the data on whisky malts using the structure which has two clusters at layer 1, each of which consists of two clusters at layer 2.

The present cluster analysis can be regarded as one sort of constrained classification (Everitt et al. 2011; Gordon 1996; Murtagh 1985), where the constraint is given not on the contiguity among objects but on the tree diagram. The constraint on the tree diagram seems different from those suggested in past studies, but it assumes an a priori hierarchical structure (Arabie and Hubert 1994). The present cluster analysis can also be regarded as the counterpart in cluster analysis of the external analysis of multidimensional scaling (Borg and Groenen 2005; Coombs 1964), where the externally given cluster structure is utilized in allocating the objects to the clusters. In the case of multidimensional scaling, the external analysis is done to facilitate the interpretation of the resulting configuration, for example by embedding the ideal point or the ideal vector, or to derive a configuration of sources based on an externally given configuration of objects (the external analysis of INDSCAL by Carroll and Chang 1970). In the present cluster analysis, the pre-specified cluster structure can make it easier to interpret the resulting tree diagram.

An advantageous aspect of the present cluster analysis is that a cluster structure having multiple layers seems easy to interpret: layer 1 provides the coarsest classification of the objects, and each lower layer provides the next finer classification, in which each cluster at layer 2 is formed by dividing a cluster at layer 1, and so on. One possible usage is to find the primary concern of a market, e.g., whether it is brand primary or form primary (Arabie and Hubert 1994), by assuming specific cluster structures and comparing the goodness of fit of the results.

The algorithm described as steps 1 through 9 in Sect. 2 is similar to the k-means algorithm or to the simple hill-climbing algorithm of Ball and Hall (1967), in that objects are relocated simultaneously rather than singly. But it is not simply a collection of k-means algorithms (one k-means algorithm at each layer except layer 0): the algorithm deals with multiple layers, which causes some discrepancies from the k-means algorithm. At the lower layers (not layer 0), objects are relocated across all clusters, including those belonging to different clusters at the upper layer. A simulated annealing algorithm might be useful to improve the algorithm (cf. Everitt et al. 2011, p. 123; Gordon 1999, p. 43). While the algorithm determined the initial centers randomly, in the case of examining the brand primacy of a market it is not difficult to select the initial centers based on the supposed structure reflecting the brand primacy of the market.
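Since steps 1 through 9 of Sect. 2 are not reproduced in this excerpt, the following is only a hedged sketch of one such simultaneous relocation pass, assuming squared Euclidean distances and the criterion SSW1 + SSW2 discussed in the comparison with k-means below; the function and variable names are illustrative, not the authors' code.

```python
import numpy as np

def relocation_pass(X, labels2, parent):
    """One simultaneous relocation sweep over all objects.

    Each object is reassigned to the layer-2 cluster minimizing its squared
    distance to that cluster's centroid plus its squared distance to the
    centroid of the cluster's parent at layer 1, in the spirit of lowering
    SSW1 + SSW2.  Centroids are held fixed during the sweep, so the objects
    are relocated simultaneously rather than singly.
    """
    clusters2 = sorted(parent)                       # e.g. A1, A2, B1, B2
    cent2 = {c: X[labels2 == c].mean(axis=0) for c in clusters2}
    cent1 = {p: X[np.isin(labels2, [c for c in clusters2 if parent[c] == p])].mean(axis=0)
             for p in set(parent.values())}
    new = labels2.copy()
    for i, x in enumerate(X):
        cost = {c: ((x - cent2[c]) ** 2).sum() + ((x - cent1[parent[c]]) ** 2).sum()
                for c in clusters2}
        new[i] = min(cost, key=cost.get)
    return new
```

A full run would alternate such sweeps with centroid updates until no object changes its cluster.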

The data were analyzed by k-means cluster analysis (MacQueen 1967) for k = 2 and k = 4 (two and four clusters). The resulting classification when the number of clusters is two (k = 2) is shown in the fifth and eleventh columns of Table 1.

The classification when the number of clusters is four (k = 4) is shown in the sixth and twelfth columns of Table 1. Table 1 shows that the classification at layer 2 given by the present cluster analysis and that given by the k-means cluster analysis when k = 4 coincide completely.
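As a hedged illustration, the two k-means runs could be reproduced with scikit-learn, used here as a stand-in for MacQueen's original algorithm; the file name whisky_malts.csv and all parameters are assumptions, not from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.loadtxt("whisky_malts.csv", delimiter=",")   # hypothetical data file

labels_k2 = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
labels_k4 = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
```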

Table 1 shows that cluster 1 of the k-means cluster analysis when k = 4 corresponds with cluster A2 and with type D, cluster 2 corresponds with cluster B2 and with type F, cluster 3 corresponds with cluster A1 and with type E, and cluster 4 corresponds with cluster B1 and with type G. Cluster 1 derived by the k-means cluster analysis when k = 2 corresponds to types F and G (four anomalies), and cluster 2 corresponds to types D and E (four anomalies). While layers 1 (clusters A and B) and 2 (clusters A1, A2, B1, and B2) derived by the present cluster analysis are inherently hierarchical, clusters 1 and 2 derived by the k-means cluster analysis when k = 2 and clusters 1, 2, 3, and 4 derived by the k-means cluster analysis when k = 4 are not hierarchical, as shown in Table 3.


Table 3 The correspondence between clusters when k = 2 and k = 4 of the k-means cluster analysis

                k = 4
    k = 2     1     2     3     4
      1       0    10     2     4
      2      10     0     7     2
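A table such as Table 3 is a plain cross-tabulation of the two partitions; with the labels from the sketch above, it could be produced as follows (pandas assumed).

```python
import pandas as pd

# Cross-tabulate the k = 2 and k = 4 partitions.  If the partitions were
# strictly hierarchical, every k = 4 column would be non-zero in exactly
# one k = 2 row; columns 3 and 4 of Table 3 violate this.
print(pd.crosstab(labels_k2, labels_k4, rownames=["k = 2"], colnames=["k = 4"]))
```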

The sum of the within-cluster sums of squared deviations from the cluster centroids at layer 2 is SSW2 = 104.3 (22.9 (cluster A1) + 25.0 (A2) + 17.7 (B1) + 38.7 (B2)). The corresponding sum for the clusters derived by the k-means cluster analysis when k = 4 was also 104.3, because layer 2 of the present cluster analysis and the k-means cluster analysis when k = 4 resulted in exactly the same clusters. The sum of the within-cluster sums of squared deviations from the cluster centroids at layer 1 is SSW1 = 135.7 (67.6 (cluster A) + 68.1 (B)). The corresponding sum for the clusters derived by the k-means cluster analysis when k = 2 was 132.3 (71.6 (cluster 1) + 60.7 (2)), which is smaller than the SSW1 given by the present cluster analysis. This is natural, because the two clusters given by the k-means cluster analysis when k = 2 were obtained so that the within-cluster sum of squared deviations for the two clusters alone was minimized, while the present cluster analysis derived the hierarchical structure where SSW1 + SSW2 was minimized. The hierarchical constraint, to which the latter is subject, worsens the goodness of fit but makes the interpretation of the resulting tree diagram easier.
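As a worked check of the arithmetic above, SSW can be computed directly from any labeling (a minimal sketch; labels2 and parent are the variables from the earlier sketches).

```python
import numpy as np

def ssw(X, labels):
    """Sum over clusters of the squared deviations from the cluster centroid."""
    return sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
               for c in np.unique(labels))

# Layer 2: ssw(X, labels2)                                  # 104.3 in the paper
# Layer 1: ssw(X, np.array([parent[c] for c in labels2]))   # 135.7 in the paper
```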

The algorithm was executed only on the simplest cluster structure having three layers. While in principle the algorithm can deal with cluster structures having more than three layers, whether it is effective in dealing with such structures has to be examined. The algorithm is valid only for cluster structures in which all clusters at a layer consist of the same number of clusters at the next lower layer; this is the reason for using the model shown in Fig. 2 in analyzing the data. This limitation restricts the application of the present cluster analysis. Relaxing the limitation so that clusters at a layer may consist of different numbers of clusters at the next lower layer is desirable, as sketched below, and would increase the versatility of the present cluster analysis.
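In the encoding sketched after the model description, the desired relaxation is only a matter of the specification's shape; an unbalanced structure that the current algorithm cannot yet handle might look like this (purely illustrative):

```python
# Cluster A splits into three layer-2 clusters, cluster B into two.
structure = {"A": ["A1", "A2", "A3"], "B": ["B1", "B2"]}
parent = {c2: c1 for c1, children in structure.items() for c2 in children}
```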

References

Arabie, P., & Hubert, L. (1994). Cluster analysis in marketing research. In R. P. Bagozzi (Ed.), Advanced methods of marketing research (pp. 160–189). Cambridge: Blackwell.

Ball, G. H., & Hall, D. J. (1967). A clustering technique for summarizing multivariate data. Behavioral Science, 12, 153–155.


Borg, I., & Groenen, P. J. F. (2005). Modern multidimensional scaling (2nd ed.). New York: Springer.

Carroll, J. D., & Chang, J. J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika, 35, 283–319.

Coombs, C. H. (1964). A theory of data. New York: Wiley.

Everitt, B. S., Landau, S., Leese, M., & Stahl, D. (2011). Cluster analysis (5th ed.). West Sussex: Wiley.

Gordon, A. D. (1996). A survey of constrained classification. Computational Statistics & Data Analysis, 21, 17–29.

Gordon, A. D. (1999). Classification (2nd ed.). Boca Raton: Chapman & Hall/CRC.

MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley symposium on mathematical statistics and probability, Berkeley (Vol. 2, pp. 281–297).

Mirkin, B. (1996). Mathematical classification and clustering. Dordrecht: Kluwer.

Murtagh, F. (1985). A survey of algorithms for contiguity-constrained clustering and related problems. The Computer Journal, 28, 82–88.

Okada, A., & Yokoyama, S. (2010). Multi-layer cluster analysis. In Proceedings of the 28th meeting of the Japanese classification society, Tokyo, Japan (pp. 11–12).

Wishart, D. (2002). Whisky classified: Choosing single malts by flavour. London: Pavilion.
