© Lavoisier – La photocopie non autorisée est un délit
ARTICLE ORIGINAL ORIGINAL PAPER
Modelling preferences of milk products for 639 consumers: a comparison of vector
and quadratic models
B. Heyd, M. Benslama
RÉSUMÉ
Modélisation des préférences en matière de produits laitiers pour 639 consommateurs : comparaison des modèles vectoriel et quadratique.
La cartographie des préférences utilise souvent une équation multi-linéaire pour modéliser la préférence des consommateurs en fonction de la description des produits. Cet article, qui étudie les préférences de 639 consommateurs pour 24 produits laitiers, montre que ce modèle n’est pas bien adapté, mais qu’il faut lui préférer un modèle quadratique. Les meilleurs résultats sont obtenus en segmentant les consommateurs en groupes ayant des comportements homogènes.
Mots clés
cartographie des préférences, modèle quadratique, segmentation des consommateurs.
SUMMARY
Preference mapping often uses a multilinear equation to model consumers’ preferences as a function of product description. This paper studies the preferences of 639 consumers for 24 milk products and shows that this model is not well suited to the task. A quadratic model gives better results, and the best results are obtained by segmenting the consumer population into groups with homogeneous behaviour.
Keywords
preference mapping, quadratic model, consumer segmentation.
1 – INTRODUCTION
The food industry aims to produce the best product, i.e. the product that maximises consumer acceptance. The variables that influence this acceptance are mainly the food ingredients and the food processing conditions, so consumer acceptance can be improved directly by optimising these variables; for examples see HUOR et al. (1980), LACROIX and CASTAIGNE (1985) and BARDOT et al. (1992). However, it is difficult to determine all the process characteristics of competitors’ products. It is nevertheless possible to describe each product by an expert sensory profile. This description can be considered as an implicit link between process and consumer acceptance. Taking product characteristics as input variables and consumer preferences as output variables, a model can be fitted for each consumer that gives a predicted hedonic score as a function of product characteristics.
1. UMR Genial – ENSIA – 1, avenue des Olympiades – 91744 Massy cedex – France.
In external preference mapping, the expert sensory profile of the products is used to model consumer preferences. Many authors use a vector model to describe consumer preferences as a function of the expert profile (JONES et al., 1989; SCHLICH and MCEWANS, 1992). This model is simple and easy to use, but its major disadvantage is that an optimum cannot be defined. In many cases, the consumer appreciates neither a product with a pronounced taste nor a tasteless one. For example, the function linking the acceptability of tea to the infusion time is clearly not linear: the consumer likes neither hot water nor tea that is too dark. This behaviour can be expressed mathematically by adding a quadratic term to the model. When there is more than one input variable, the product characteristics can also interact. If a consumer gives a score n1 to product P1 with characteristic c1, and n2 to product P2 with characteristic c2, it is not certain that a product with both characteristics c1 and c2 would get the score n = n1 + n2. TUCHOLSKY (1957) pointed out this problem: “Herring is good. Whipped cream is good. How good could herring be with whipped cream”. It is therefore useful to add interaction terms to the equation.
The quadratic model is often used in preference optimisation, where the aim is to model consumer preferences as a function of product variables (ingredients and/or process). For instance, MALUNDO (1993) and MOSKOWITZ (1997) used a quadratic model to relate product variables (ingredients) to consumer preferences. It could be interesting to use the quadratic model in external preference mapping as well (HEYD and DANZART, 1998; BENSLAMA et al., 1998). Its main drawback is that, with two sensory dimensions, it requires 6 parameters (and so uses up 6 degrees of freedom (df)), whereas the vector model requires only 3. In preference mapping, it is common to explore a small set of products: RISVIK et al. (1997) performed their analysis on 7 products, HELGESEN et al. (1997) on only 6. The vector model then leaves 4 df (resp. 3 df) to compute the statistics. In those cases a quadratic model is impractical, since only 1 df (resp. 0 df) would remain.
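The degrees-of-freedom argument above is simple arithmetic and can be checked with a short sketch. The function and constant names are illustrative and not taken from the original study; the parameter counts assume two sensory dimensions.

```python
def residual_df(n_products: int, n_parameters: int) -> int:
    """Residual degrees of freedom left after a least-squares fit."""
    return n_products - n_parameters


VECTOR_PARAMS = 3      # intercept + 2 linear terms
QUADRATIC_PARAMS = 6   # intercept + 2 linear + 2 squared + 1 interaction term

# 7 products (RISVIK et al.), 6 products (HELGESEN et al.), 24 products (this study)
for n_products in (7, 6, 24):
    print(n_products,
          residual_df(n_products, VECTOR_PARAMS),
          residual_df(n_products, QUADRATIC_PARAMS))
```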
The data studied in this paper consist of a set of 24 products. This number of products is large enough to leave residual degrees of freedom for statistical testing. It is therefore possible to use models requiring more parameters, such as quadratic ones.
2 – MATERIAL AND METHODS
2.1 Data
The data (BENSLAMA et al., 1998) consist of 24 milk products described by a panel of 16 assessors. Each assessor described each product twice with 31 sensory descriptors on a nine-point scale. The products were also evaluated by a group of 639 consumers on a 10-point scale. To study the expert data, we reduced the dimensionality of the problem using a principal component analysis (PCA). The consumers’ overall preference was analysed using a two-way analysis of variance followed by a Fisher PLSD test. To avoid scaling problems, we first centred and scaled (standardised) the scores of each consumer. We then performed a cluster analysis using the Ward method to determine groups of consumers with approximately the same behaviour.
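The standardisation and clustering steps can be sketched as follows. This is a naive illustration and not the software used in the study: the data are synthetic, the function names are hypothetical, and the Ward criterion is implemented directly as a greedy merge of the two clusters whose fusion least increases the within-cluster sum of squares.

```python
import numpy as np


def standardise_rows(scores):
    """Centre each consumer's scores (one row per consumer) and scale to unit variance.

    Assumes no consumer gave the same score to every product (non-zero variance).
    """
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean(axis=1, keepdims=True)
    std = scores.std(axis=1, ddof=1, keepdims=True)
    return (scores - mean) / std


def ward_clusters(X, n_clusters):
    """Naive agglomerative clustering with Ward's minimum-variance criterion."""
    X = np.asarray(X, dtype=float)
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                na, nb = len(clusters[a]), len(clusters[b])
                ca = X[clusters[a]].mean(axis=0)
                cb = X[clusters[b]].mean(axis=0)
                # Increase in within-cluster sum of squares if a and b are merged
                cost = na * nb / (na + nb) * np.sum((ca - cb) ** 2)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = np.empty(len(X), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels


# Synthetic example: 12 consumers scoring 24 products on a 10-point scale
rng = np.random.default_rng(0)
raw = rng.integers(1, 11, size=(12, 24)).astype(float)
z = standardise_rows(raw)
labels = ward_clusters(z, n_clusters=3)
```

The pairwise search is cubic in the number of consumers, so for a population of several hundred consumers a dedicated hierarchical clustering routine would be preferable.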
2.2 Model of consumer preferences
The basic model for predicting consumer preferences as a function of the expert profile is the vector model (equation (1)).
$$n_k = a_{0k} + a_{1k}\,PC_1 + a_{2k}\,PC_2 + \varepsilon \qquad (1)$$
where $PC_1$ and $PC_2$ are the factorial scores of a given product on the first and second principal components, $\varepsilon$ is the error term, and $n_k$ is the hedonic score given by consumer $k$ to this product. The $a_{ik}$ are the coefficients of the model for consumer $k$.
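As an illustration, the vector model can be fitted for one consumer by ordinary least squares. The sketch below uses synthetic, noise-free data, and the function name is hypothetical. It returns the coefficients and the coefficient of determination only; the significance level used later in the paper would additionally require the F distribution.

```python
import numpy as np


def fit_vector_model(pc1, pc2, scores):
    """Least-squares fit of n_k = a0 + a1*PC1 + a2*PC2 for one consumer."""
    pc1 = np.asarray(pc1, dtype=float)
    pc2 = np.asarray(pc2, dtype=float)
    scores = np.asarray(scores, dtype=float)
    X = np.column_stack([np.ones_like(pc1), pc1, pc2])
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    ss_res = np.sum((scores - X @ coef) ** 2)
    ss_tot = np.sum((scores - scores.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# Synthetic example: 24 products, noise-free linear preference
pc1 = np.linspace(-6.0, 6.0, 24)
pc2 = np.cos(pc1)                      # arbitrary second coordinate
scores = 5.0 + 0.5 * pc1 - 0.2 * pc2
coef, r2 = fit_vector_model(pc1, pc2, scores)
```

The pair (a1, a2) gives the preference direction of the consumer in the plane of the two principal components.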
GREENHOFF and MACFIE (1994) underlined the fact that consumers can base their preferences on criteria that are not considered essential by the assessors. They proposed choosing the PCs that best predict consumer preferences (equation (2)).
$$n_k = a_{0k} + a_{1k}\,PC_m + a_{2k}\,PC_n + \varepsilon \qquad (2)$$
where the indices m and n are chosen to minimise the error.
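The choice of m and n can be sketched as an exhaustive search over pairs of components. The error criterion used here, the residual sum of squares of the least-squares fit, is an assumption; the function name and the synthetic data are illustrative only.

```python
from itertools import combinations

import numpy as np


def best_pc_pair(pcs, scores):
    """Return (m, n, sse): the 1-based pair of PCs with the smallest residual sum of squares.

    pcs has one row per product and one column per principal component.
    """
    pcs = np.asarray(pcs, dtype=float)
    scores = np.asarray(scores, dtype=float)
    best = None
    for m, n in combinations(range(pcs.shape[1]), 2):
        X = np.column_stack([np.ones(len(scores)), pcs[:, m], pcs[:, n]])
        coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
        sse = float(np.sum((scores - X @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, m + 1, n + 1)
    return best[1], best[2], best[0]


# Synthetic example: preference driven by the 2nd and 4th components
rng = np.random.default_rng(0)
pcs = rng.normal(size=(24, 5))
scores = 1.0 + 2.0 * pcs[:, 1] - pcs[:, 3]
m, n, sse = best_pc_pair(pcs, scores)
```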
These models are often used because they have few parameters, which makes them usable when the number of products is limited. However, preferences do not always increase or decrease monotonically with the intensity of a stimulus, and stimuli can interact. It is therefore interesting to use a quadratic model (equation (3)).
$$n_k = a_{0k} + a_{1k}\,PC_m + a_{2k}\,PC_n + a_{3k}\,PC_m^2 + a_{4k}\,PC_n^2 + a_{5k}\,PC_m\,PC_n + \varepsilon \qquad (3)$$
In this study, with 24 products and 6 parameters, 18 degrees of freedom remain, which allows us to assess the accuracy of this model.
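A minimal sketch of the quadratic fit for one consumer is given below, assuming ordinary least squares. The function names, the simulated coefficients and the noise level are illustrative. The F statistic is returned without a p-value, which would require the F distribution with 5 and 18 degrees of freedom.

```python
import numpy as np


def quadratic_design(pcm, pcn):
    """Design matrix with columns 1, PCm, PCn, PCm^2, PCn^2, PCm*PCn."""
    pcm = np.asarray(pcm, dtype=float)
    pcn = np.asarray(pcn, dtype=float)
    return np.column_stack([np.ones_like(pcm), pcm, pcn, pcm**2, pcn**2, pcm * pcn])


def fit_quadratic_model(pcm, pcn, scores):
    """Least-squares fit of the quadratic model for one consumer."""
    scores = np.asarray(scores, dtype=float)
    X = quadratic_design(pcm, pcn)
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)
    resid = scores - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((scores - scores.mean()) ** 2))
    df_model = X.shape[1] - 1            # 5 terms besides the intercept
    df_resid = len(scores) - X.shape[1]  # 24 - 6 = 18 in this study
    r2 = 1.0 - ss_res / ss_tot
    f_stat = ((ss_tot - ss_res) / df_model) / (ss_res / df_resid)
    return coef, r2, f_stat, df_resid


# Synthetic example: 24 products, quadratic preference plus a little noise
rng = np.random.default_rng(0)
pcm = rng.uniform(-6.0, 6.0, size=24)
pcn = rng.uniform(-6.0, 6.0, size=24)
true = 6.0 + 0.3 * pcm + 0.1 * pcn - 0.05 * pcm**2 - 0.08 * pcn**2 + 0.02 * pcm * pcn
scores = true + rng.normal(0.0, 0.1, size=24)
coef, r2, f_stat, df_resid = fit_quadratic_model(pcm, pcn, scores)
```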
2.3 Representation of the population preference
The models presented above represent the preferences of a single consumer. Here we have to represent the preferences of the 639 consumers so as to show inter-individual differences. Modelling the mean of the consumers’ scores is simple but does not provide the essential information: it gives an average quadratic surface that only describes the preferences of a hypothetical average consumer. It is more interesting to choose a representation that shows clusters of consumers with different behaviour, with the preferences of all the consumers on the same graph. The vector model is well suited to this task: each consumer is represented by a vector pointing in his preference direction, whose norm is the correlation coefficient for this consumer (GREENHOFF and MACFIE, 1994). For the quadratic model it is more difficult. One can represent consumer preferences by showing the calculated optimum for each consumer, but this has a major drawback: the optimum is rarely inside the experimental domain, for several reasons.
• The response surface is often a saddle (minimax), so that there are two optima outside the domain.
• The response surface presents a minimum.
• The response surface has a maximum, but outside of the experimental domain.
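The three cases listed above can be told apart from the fitted coefficients: the stationary point of the quadratic surface is where its gradient vanishes, and its nature follows from the signs of the eigenvalues of the Hessian matrix. The sketch below assumes the coefficient order of equation (3); the function names are illustrative.

```python
import numpy as np


def stationary_point(coef):
    """Stationary point and type of a0 + a1*x + a2*y + a3*x^2 + a4*y^2 + a5*x*y.

    Raises numpy.linalg.LinAlgError when the Hessian is singular.
    """
    a0, a1, a2, a3, a4, a5 = coef
    hessian = np.array([[2.0 * a3, a5], [a5, 2.0 * a4]])
    gradient_at_origin = np.array([a1, a2])
    point = np.linalg.solve(hessian, -gradient_at_origin)
    eigenvalues = np.linalg.eigvalsh(hessian)
    if np.all(eigenvalues < 0):
        kind = "maximum"
    elif np.all(eigenvalues > 0):
        kind = "minimum"
    else:
        kind = "saddle"  # also covers the degenerate case of a zero eigenvalue
    return point, kind


def in_domain(point, x_range, y_range):
    """True when the point lies inside the rectangular experimental domain."""
    return bool(x_range[0] <= point[0] <= x_range[1]
                and y_range[0] <= point[1] <= y_range[1])


point, kind = stationary_point([0.0, 2.0, 0.0, -1.0, -1.0, 0.0])
```

Only a maximum located inside the experimental domain can be read as the optimum of the consumer.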
With this kind of representation, an essential piece of information is lost: the shape of the response surface. We therefore tried to represent the global preference of the population as a surface. We first have to determine, for each consumer, the domain where the products can be considered as “good”. Here we chose to consider as “good” the region of the response surface where the model gives a calculated preference score higher than the consumer’s average score. This threshold can be changed in order to improve the graphic representation. The region considered as good gets the score 1, and the rest of the domain the score 0. The binary maps of all the consumers are then summed. The result is, for each location on the first plane of the PCA, a score corresponding to the number (or the percentage) of consumers liking the products of this area. This can be represented as a surface plot or as a contour plot.
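The summation of binary maps can be sketched as follows, assuming that each consumer's quadratic coefficients are already available in the order of equation (3) and that the threshold is that consumer's average score. The function name, the grid and the two toy consumers are illustrative.

```python
import numpy as np


def liking_map(coefs, thresholds, x_grid, y_grid):
    """Percentage of consumers whose predicted score exceeds their threshold.

    coefs: (n_consumers, 6) quadratic coefficients, one row per consumer.
    thresholds: (n_consumers,) score above which an area counts as "good".
    Returns an array of shape (len(y_grid), len(x_grid)).
    """
    coefs = np.asarray(coefs, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    X, Y = np.meshgrid(np.asarray(x_grid, dtype=float), np.asarray(y_grid, dtype=float))
    terms = np.stack([np.ones_like(X), X, Y, X**2, Y**2, X * Y])  # (6, ny, nx)
    predicted = np.tensordot(coefs, terms, axes=([1], [0]))       # (n_consumers, ny, nx)
    binary = predicted > thresholds[:, None, None]                # 1 = "good", 0 = rest
    return 100.0 * binary.sum(axis=0) / len(coefs)


# Two toy consumers who both prefer high values of the first component
coefs = [[0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
         [0.0, 2.0, 0.0, 0.0, 0.0, 0.0]]
percent = liking_map(coefs, thresholds=[0.0, 0.0], x_grid=[-1.0, 1.0], y_grid=[0.0])
```

The resulting array can be passed directly to a contour or surface plotting routine.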
3 – RESULTS AND DISCUSSION
3.1 Consumer preferences
The two-way analysis of variance shows that the products differ (α = 5%) from a hedonic point of view. The Fisher PLSD (figure 1) shows that product K is preferred by the group, whereas product R is rejected (α = 5%).
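A minimal sketch of this analysis is given below, assuming a two-way layout (consumer × product) without replication. The function names are illustrative. The least significant difference is computed with a normal quantile, which is only a large-sample approximation of the Student t quantile normally used in Fisher's procedure, and the "protected" step (testing the product effect first) is left to the caller.

```python
from statistics import NormalDist

import numpy as np


def two_way_anova(scores):
    """F statistic of the product effect; scores has one row per consumer, one column per product."""
    scores = np.asarray(scores, dtype=float)
    n_cons, n_prod = scores.shape
    grand = scores.mean()
    ss_prod = n_cons * np.sum((scores.mean(axis=0) - grand) ** 2)
    ss_cons = n_prod * np.sum((scores.mean(axis=1) - grand) ** 2)
    ss_res = np.sum((scores - grand) ** 2) - ss_prod - ss_cons
    df_prod = n_prod - 1
    df_res = (n_prod - 1) * (n_cons - 1)
    ms_res = ss_res / df_res
    f_prod = (ss_prod / df_prod) / ms_res
    return float(f_prod), float(ms_res), df_res


def lsd_large_sample(ms_res, n_cons, alpha=0.05):
    """Least significant difference between two product means (normal approximation)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * (2.0 * ms_res / n_cons) ** 0.5


# Tiny hand-checkable example: 2 consumers x 2 products
f_prod, ms_res, df_res = two_way_anova([[1.0, 2.0], [3.0, 5.0]])
lsd = lsd_large_sample(ms_res, n_cons=2)
```

Two products whose mean scores differ by less than the least significant difference are the ones linked in a figure such as figure 1.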
This classical statistical method shows only the overall preferences. For a finer analysis, the consumer population has to be split into groups with the same behaviour. This was done using the Ward method. Figure 2 indicates that the population can be split into 3 clusters of 129, 98 and 412 consumers respectively.
An analysis of variance can then be carried out in each subgroup to determine which product is preferred. For each group, the analysis of variance shows that the product scores are different (α = 5%). A Fisher PLSD is then carried out in each subgroup. The results (figure 3) show that the preferences are not the same in each subgroup.
product     R   D   S   B   P   U   F   V   X   Q   O   A   L   J   M   C   G   E   W   T   N   H   I   K
mean score  2.6 4.0 4.1 4.8 5.0 5.1 5.1 5.2 5.4 5.6 5.6 5.8 6.0 6.0 6.0 6.0 6.3 6.2 6.3 6.4 6.4 6.5 6.6 7.1
Figure 1
Overall preference of the consumers (Fisher PLSD). Each product is represented by a letter. Linked products are not significantly different (α = 5%).
Figure 2
Dendrogram of the consumers’ preferences (three clusters: Group 1, Group 2 and Group 3; axis scaled from 0 to 1). It is possible to determine 3 groups of consumers with particular behaviour by maximizing the difference between within-group and intergroup variance.
For example, product D is one of the products preferred by group 1 (129 consumers), but is rejected by the rest of the population. Product K is significantly preferred by all the consumers except those belonging to group 2. Using this method, it is possible to determine groups of consumers with approximately the same behaviour. The information that product D is preferred by 129 consumers (about 20% of the population) is totally masked when considering the overall preference with a classical analysis of variance. This kind of procedure may be very useful in developing a new specific product adapted to a particular population.
3.2 Expert data
The performance of the experts was determined by comparing their two replicates. The twelve experts who had the smallest variance between replicates were selected. For each descriptor, the mean of the data over the 12 experts and 2 replicates was computed. A PCA of the resulting profile was conducted (figure 4). The first factorial plane represents 56% of the initial information. Product D seems to have a unique position, mainly due to its texture properties. Products P and R also have particular properties.
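The selection of the experts and the PCA can be sketched as follows. The data are simulated, the function names are illustrative, and two choices are assumptions of this sketch rather than facts from the study: repeatability is measured by the variance between the two replicates averaged over products and descriptors, and the PCA is performed on centred, unscaled descriptor means.

```python
import numpy as np


def select_repeatable_assessors(profiles, n_keep):
    """Indices of the n_keep assessors with the smallest variance between replicates.

    profiles has shape (assessors, replicates, products, descriptors).
    """
    replicate_variance = profiles.var(axis=1, ddof=1).mean(axis=(1, 2))
    return np.argsort(replicate_variance)[:n_keep]


def pca_scores(table, n_components=2):
    """Product coordinates on the first components and their share of variance."""
    centred = table - table.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


# Simulated panel: 16 assessors, 2 replicates, 24 products, 31 descriptors
rng = np.random.default_rng(1)
base = rng.uniform(1.0, 9.0, size=(24, 31))
profiles = base[None, None] + rng.normal(0.0, 0.3, size=(16, 2, 24, 31))
profiles[3] += rng.normal(0.0, 3.0, size=(2, 24, 31))  # one poorly repeatable assessor

keep = select_repeatable_assessors(profiles, n_keep=12)
mean_profile = profiles[keep].mean(axis=(0, 1))          # (24 products, 31 descriptors)
scores, explained = pca_scores(mean_profile)
```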
It is interesting to see that product D was identified as one of the most important for the segmentation of consumers, and that the expert profile points out the particular sensory attributes of this product. The profile is clearly linked with preferences.
3.3 Vector model
The aim of the basic preference mapping method is to describe consumer behaviour as a vector function of the expert profile. With this model, 335 of the 639 consumers are well described (p < 0.1). It is also possible to search for the two best principal components. Table 1 shows that the first two principal components lead to the largest number of valid models.
A model using the 3 best principal components (equation (4)) does not give better results (335 models with p < 0.1) and is much more difficult to interpret than the simple vector model.
$$n_k = a_{0k} + a_{1k}\,PC_1 + a_{2k}\,PC_2 + a_{3k}\,PC_3 + \varepsilon \qquad (4)$$
Group 1
product     X   V   B   R   U   F   P   W   N   Q   E   A   O   T   H   S   C   M   G   J   I   L   D   K
mean score  3.3 3.7 4.0 4.0 4.2 4.3 4.3 4.4 4.8 4.9 4.9 5.1 5.6 5.7 5.8 5.9 6.1 6.1 6.3 6.3 6.4 6.5 7.0 8.7
Group 2
product     D   R   S   J   L   K   G   P   U   O   F   B   N   E   Q   A   X   H   M   V   C   I   W   T
mean score  2.8 3.2 3.9 4.2 4.2 4.5 4.9 4.9 5.0 5.3 5.6 5.7 6.3 6.3 6.3 6.3 6.4 6.6 6.6 6.8 6.8 6.9 7.0 7.1
Group 3
product     R   D   S   B   V   F   P   U   Q   O   X   C   M   A   T   J   L   E   W   I   H   G   N   K
mean score  2.2 3.6 3.7 4.8 5.1 5.1 5.2 5.3 5.5 5.7 5.7 5.8 5.8 5.8 6.3 6.4 6.5 6.5 6.5 6.6 6.6 6.6 6.8 7.6
Figure 3
Fisher PLSD for each group of consumers. Each product is represented by a letter. Linked products are not significantly different (α = 5%).
Table 1
Number of vector models with p < 0.1 using PCm and PCn.
m \ n     1     2     3     4     5
1         0   335   272   291   243
2       335     0   220   209   165
3       272   220     0   100   101
4       291   209   100     0    78
5       243   165   101    78     0
3.4 Quadratic model
When using a quadratic representation, the number of valid models (p < 0.1) is 340, similar to the number obtained with the vector model (335), but the model fits the data much better: 149 quadratic models reach r² > 0.5 (50% of the data variance is explained by the model), whereas only 29 vector models can be considered fair by the same criterion. This good result can only be obtained with a large number of experiments, here products: the number of degrees of freedom left is sufficient to compute the error.
Table 2 gives the results according to the pair of principal components used. It appears that the first two PCs give the best results.
Table 2
Number of quadratic models with p < 0.1 using PCm and PCn.
m \ n     1     2     3     4     5
1         0   340   303   195   209
2       340     0   272   188   190
3       303   272     0   145   159
4       195   188   145     0    54
5       209   190   159    54     0
Figure 4
First plane of the experts’ PCA (1st PC: 36%; 2nd PC: 20%). Each product is represented by a letter.
3.5 Representation of population preference
A representation of the preferences of the 639 consumers is given in figure 5. For this figure, we binarised the scores of each consumer and then summed them. The results are given as a percentage of the population. It can be seen that 80% of the consumers like the sensory region around product H, and that product R is rejected (less than 20% of the consumer population like this zone).
The results agree with the study of the consumers’ scores shown in figure 1. For example, product R has the lowest score. Some differences are also present: products A and O, which are located in the best area of the map, have only average scores, and this is true for all subgroups of consumers. These two products have similar average scores and are located in the same area of the PCA, so the consumer data agree with the panel’s description. The ANOVA (figure 1) shows that the scores of these products are just above the average, and the map shows that 80% of consumers find this area better than average; there is no contradiction in this. The main interest of this map is that it includes information about the product description given by an expert panel. With this kind of representation it is possible to sum up the information given by the experts and by the consumer study.
4 – CONCLUSION
This study, carried out on 24 products, allowed us to use a large set of methods. We consider the Ward algorithm very useful for segmenting the consumer population: it is interesting to distinguish several groups of consumers having the same behaviour. This analysis, combined with a two-way analysis of variance, seems to be very robust. A global treatment of the consumers’ scores can lead to the rejection of some products with low average scores. A classification can detect products that have a low average score over the whole population but are preferred by a subgroup of the population. This is very important from a marketing point of view.
Figure 5
Contour plot of the consumers’ calculated preferences, representing the percentage of consumers liking each area as a function of the 2 PCs (1st PC: 36%; 2nd PC: 20%). Contour levels run from 20% to 90%; each product is represented by a letter.
Linking the consumers’ scores with the expert sensory profile is also important, but difficult. We encountered a common problem in modelling sensory data:
• the input variables (first two PCs of the expert profile) do not include all the sources of variation of the consumer scores;
• the output variables (consumer scores) are not very stable.
However, the quadratic model developed here fits the data better than the commonly used vector model and is more significant. The difficulty with this model is the representation of consumers’ preferences. We believe that summing the binarised response surfaces is a good way to handle this large amount of data. However, this kind of representation does not reveal the particular behaviour of some consumers as the Ward method does: by adding information about the products (expert profile) to the graph, we lose information about the consumers. We believe that the methods employed in this paper are complementary and that, used together, they can help in understanding consumer behaviour and its link with the sensory characteristics of products.
REFERENCES
BARDOT, I., HEYD, B., TRYSTRAM, G., HOSSENLOPP, J. and DANZART, M. 1992. Méthode automatisée de formulation sensorielle pour des boissons non gazeuses. Sciences des aliments, 12:19.
BENSLAMA, M., HEYD, B., DANZART, M. and DUCAUZE, J. 1998. Plans D-optimaux : une stratégie de réduction du nombre de produits en cartographie des préférences. Sciences des aliments, 18:471-483.
GREENHOFF and MACFIE. 1994. Measurement of Food Preferences, chapter Preference Mapping in practice. Blackie, Glasgow.
HELGESEN, H., SOLHEIM, R. and NAES, T. 1997. Consumer preference mapping of dry fermented lamb sausages. Food Quality and Preference, 8(2):97-108.
HEYD, B. and DANZART, M. 1998. Modelling of consumer preference, an example. Lebensmittel Wissenschaft und Technologie - Food Science and Technology, 31:607-611.
HUOR, S.S., AHMED, E.M., RAO, P.V. and CORNELL, J.A. 1980. Formulation and sensory evaluation of a fruit punch containing watermelon juice. Journal of food science, 45:809.
JONES, P.N., MACFIE, H.J.H. and BEILKEN, S.L. 1989. Uses of preference mapping to relate consumer preferences to sensory properties of a process meat product (tinned cat food). J. Sci food Agric, 48:113-123.
LACROIX, C. and CASTAIGNE, F. 1985. Influence des teneurs en protéines végétales, glycerol, sel et sucre sur la texture et la stabilité des émulsions cuites à base de viande de type frankfurter. Lebensmittel Wissenschaft und Technologie, 18:1.
MALUNDO, T.M.M. 1993. Optimization of liquid whitener from peanut extract. Lebensmittel Wissenschaft und Technologie, 26(6):552-557.
MOSKOWITZ, H.R. 1997. A commercial application of rsm for ready to eat cereal. Food Quality and Preference, 8(3):191-201.
RISVIK, E., MCEWAN, J.A. and RODBOTTEN, M. 1997. Evaluation of sensory profiling and projective mapping data. Food Quality and Preference, 8(1):63-71.
SCHLICH, P. and MCEWANS, J.A. 1992. Cartographie des préférences. Sciences des aliments, 10:339-355.
TUCHOLSKY, K. 1957. Man sollte mal. Büchergilde Gutenberg, Frankfurt/M.