
Chapter 5 Assessment of emotions elicited by visual stimuli


5.4.2 Arousal experiment

When the two arousal ground-truth classes were defined according to the IAPS judgments, the average accuracy of the Naïve Bayes classifier across participants exceeded the chance level only for EEG features (54% vs. 50%). The LDA classifier performed slightly better, with average accuracies of 55%, 53% and 54% for EEG, peripheral and fused features respectively. These relatively low accuracies are likely due to large differences between the IAPS values and the emotion actually felt by the participant (as detailed in Section 5.3.1.b). We concluded that, in our experimental setting, the IAPS arousal judgments could not be recovered from the physiological measurements, and that self-assessments had to be used as ground truth instead. However, with the LDA, the accuracy obtained from peripheral features is higher than the chance level. This confirms that the peripheral features computed in this study are more suitable for classifying the arousal dimension of emotions than the valence dimension.
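Whether an accuracy of 54% really exceeds a 50% chance level depends on how many patterns were classified. The sketch below illustrates this with an exact one-sided binomial test. Neither the test nor the trial counts come from this study: `binomial_tail` is a hypothetical helper, and the values 100 and 1000 are assumed purely for illustration.

```python
from math import comb


def binomial_tail(n_correct, n_trials, chance=0.5):
    """Return P(X >= n_correct) for X ~ Binomial(n_trials, chance).

    This is the probability that a classifier guessing at the chance
    level gets at least n_correct patterns right out of n_trials.
    """
    return sum(
        comb(n_trials, k) * chance ** k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )


# Hypothetical trial counts (NOT taken from the study): the same 54%
# accuracy is evaluated on 100 and on 1000 classified patterns.
p_small = binomial_tail(54, 100)
p_large = binomial_tail(540, 1000)

print(f"54% of  100 trials: p = {p_small:.4f}")
print(f"54% of 1000 trials: p = {p_large:.4f}")
```

With few trials, an accuracy a few points above 50% is compatible with random guessing, whereas the same accuracy measured on many trials is not. This is one reason to read small margins over the chance level with caution.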

Results with ground-truth classes obtained from self-evaluations are presented in Figure 5.5 and Figure 5.6. The percentage of correctly classified patterns is shown for each of the four participants, together with the average across participants. Compared to the accuracies obtained with the ground truth defined by the IAPS judgments, the accuracies obtained with the self-assessed ground truth are higher, especially for participants 2 and 3 (Figure 5.5). This tends to confirm that physiological signals correlate better with personalized self-assessments of emotion than with the generalized IAPS judgments. The best performance, 72%, is obtained with the EEG signals of participant 2 and a Naïve Bayes classifier. A similar result is obtained with the LDA (70%). For both classifiers, the average accuracy obtained with EEG signals is higher than the one obtained with peripheral signals. These results stress again the added value of EEG signals for emotion assessment.

Figure 5.5. Classifier accuracy with 2 classes constructed from self-assessment.

It is worth mentioning that unbalanced classes raise two problems when three arousal classes are assessed: (i) some of the classes are strongly undersampled and (ii) the accuracy measure is less reliable, since more samples belong to one of the classes than to the others. The first problem implies that the probability distributions of the undersampled classes cannot be correctly estimated, which results in a weak assessment of those classes. Concerning the second problem, since the number of samples in each class was equal for the two feature sets and similar across participants, we believe that comparing emotion assessment performance on the basis of this accuracy measure is still reliable.
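The effect of unbalanced classes on the accuracy measure can be illustrated with a small sketch. The labels below are hypothetical and are not data from this study. Balanced accuracy (the mean of per-class recalls) is shown only as one common alternative measure; it is an assumption of this example and was not used in the experiments.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of correctly classified patterns."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_true):
    """Accuracy reached by always predicting the most frequent class."""
    counts = Counter(y_true)
    return max(counts.values()) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class weighs equally."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical, strongly unbalanced three-class labels (illustration only).
y_true = ["calm"] * 2 + ["medium"] * 6 + ["excited"] * 2
# A degenerate classifier that ignores its input and always answers "medium".
y_pred = ["medium"] * 10

print("accuracy:          ", accuracy(y_true, y_pred))
print("majority baseline: ", majority_baseline(y_true))
print("balanced accuracy: ", balanced_accuracy(y_true, y_pred))
```

In this example the degenerate classifier reaches an accuracy of 0.6, well above the nominal three-class chance level of 1/3, although it never recognizes the two undersampled classes. Its accuracy equals the majority baseline, and its balanced accuracy falls back to 1/3.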

Figure 5.6 shows the results for the three-class problem. Again, the features extracted from the EEG of participant 2 yield the best result, with 58% of correctly classified patterns (compared to a chance level of 33%). Participant 4 still obtained the worst accuracy. This is likely due to the high number of eye blinks found in the EEG signals of this participant (approximately one blink per second). Participant 1 obtained better results with a Naïve Bayes classifier than with an LDA.

The particularly high results of participant 2 can be explained by a better understanding of the self-assessment procedure: he had good knowledge of emotions and was therefore likely to evaluate his feelings accurately.

The results obtained for fusion by concatenation differ depending on the participant, the classifier and the number of defined classes. For the Naïve Bayes classifier, the concatenation of peripheral and EEG features slightly increased the average accuracy for 3 arousal classes and decreased it for 2 arousal classes. For the LDA, concatenation of features increased the average accuracy by 5% for 3 arousal classes and did not affect it in the other case. Thus the LDA seems more appropriate for fusion at the feature level, which could be explained by the questionable assumption of conditional independence made by the Naïve Bayes classifier. Also, fusion provides more robust results, since some participants had better scores with peripheral signals than with EEG signals and vice versa.
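Fusion by concatenation and the conditional-independence assumption can be made concrete with a minimal sketch. It assumes a Gaussian Naïve Bayes model, which may differ from the exact classifier used in this study. All function names and feature values are hypothetical and chosen for illustration only.

```python
import math


def concatenate_features(eeg, peripheral):
    """Feature-level fusion: join the two feature vectors of each trial."""
    if len(eeg) != len(peripheral):
        raise ValueError("both feature sets must describe the same trials")
    return [list(e) + list(p) for e, p in zip(eeg, peripheral)]


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Estimate the prior and the per-feature mean and variance of each class."""
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor avoids a zero variance when a feature is constant.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[c] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for pattern x."""
    def log_posterior(c):
        prior, means, variances = model[c]
        # Per-feature log-likelihoods are simply summed: this sum is the
        # conditional-independence assumption, which ignores any
        # correlation between EEG and peripheral features.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
            for v, m, s in zip(x, means, variances)
        )
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)


# Hypothetical feature values for four trials (illustration only).
eeg = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]]
peripheral = [[5.0], [5.5], [7.0], [7.5]]
labels = ["calm", "calm", "excited", "excited"]

fused = concatenate_features(eeg, peripheral)
model = fit_gaussian_nb(fused, labels)
print(predict_gaussian_nb(model, [0.95, 0.9, 7.2]))
```

Because the log-likelihood is a plain sum over features, correlated features added by concatenation are counted as if they were independent evidence. The LDA estimates a covariance matrix shared by all classes and can therefore account for such correlations.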

Figure 5.6. Classifier accuracy with 3 classes constructed from self-assessment.

5.5 Conclusion

In this chapter, two categories of physiological signals, from the central and from the peripheral nervous systems, have been evaluated on the problem of assessing the arousal and valence dimensions of emotions elicited by IAPS images. These assessments were cast as classification problems, with ground-truth valence / arousal values provided either by the IAPS or by self-assessments of the emotion. Two classifiers were used: a Naïve Bayes classifier and an LDA.

Results showed the usability of EEG signals for both arousal and valence recognition, and the advantage of EEG features over peripheral features. Moreover, the fusion of EEG features with peripheral features improved the assessment performance. This improvement was larger with an LDA than with the Naïve Bayes classifier. Results also markedly improved when using classes generated from self-assessments of emotions. When trying to assess emotions, one should therefore ask for the user’s own feelings rather than rely on predefined labels. However, with self-assessment the generated classes were unbalanced, which gave rise to classification problems such as the undersampling of some classes. Moreover, using self-assessments as ground truth implies that the user is an expert in evaluating his / her feelings, which is not always the case, as discussed in Section 4.1.1.

Future work on arousal assessment will first aim at improving the current results by using other, non-linear classifiers such as Support Vector Machines. Feature selection and more sophisticated fusion strategies will also be examined, together with other features, such as the temporal characteristics of signals, that are known to be strongly involved in emotional processes.