
3 This experiment is a reproduction of the article: Palama, A., Malsert, J., & Gentaz, É. (submitted). The cross-modal transfer of emotional information (happy or angry) from voices to faces in 2, 4 and 6 month-old infants.

The cross-modal transfer of emotional information (happy or angry) from voices to faces in 2, 4 and 6 month-old infants

Amaya Palama1,2*, Jennifer Malsert1,2, Edouard Gentaz1,2,3

1 SensoriMotor, Affective and Social Development Laboratory, Faculty of Psychology and Educational Sciences, University of Geneva, Geneva, Switzerland

2 Swiss Center for Affective Sciences, Campus Biotech, University of Geneva, Geneva, Switzerland.

3 CNRS, France.

* Corresponding author

E-mail: amaya.palama@unige.ch (AP)

Postal address: Amaya Palama, 40 blvd du Pont-d’Arve, 1211 Geneva, Switzerland. Phone: +4122.379.91.48

Running title: Cross-modal transfer of emotions in infants

Abstract

The present study examined whether 2-, 4- and 6-month-old infants are able to extract and transfer amodal emotional information (i.e., independent of sensory modality) from voices to faces. Sequences of successive cross-modal transfers were shown individually to each infant. Each sequence presented a neutral or an emotional voice (angry or happy) alone, followed by the simultaneous presentation of two static emotional faces (angry and happy). The main result showed that only at 6 months, after listening to a happy voice, did infants look more at the incongruent angry face than at the happy face (above chance), confirming previous results observed in an eye-tracking study (Palama, Malsert, & Gentaz, 2018).

Moreover, the results suggested a looking-time and number-of-looks preference for the angry face after listening to emotional voices in all three age groups. These results suggest that the ability to recognize the happy emotion amodally emerges between 4 and 6 months.

Keywords: infancy; development; emotions; amodal

Résumé

This study examines the ability of 2-, 4- and 6-month-old infants to extract and transfer amodal emotional information (independent of sensory modality) from a voice to a face. To study such a transfer, neutral or emotional voices (angry or happy) were presented individually to infants, followed by the presentation of a pair of static emotional faces (angry and happy). The main result shows that only at 6 months, after listening to a happy voice, did infants look more at the incongruent angry face than at the happy face (above chance level), confirming the result of an eye-tracking study (Palama, Malsert, & Gentaz, 2018). Moreover, the results suggest that the angry face was looked at more after listening to the emotional voices at all three ages studied. These results suggest that the ability to perceive a happy expression amodally emerges between 4 and 6 months.

Keywords: infancy; development; emotion; amodal

Introduction

Faces and voices are important sources of information in parent-infant interaction. Through them, infants’ caregivers naturally express their emotions in order to communicate. Indeed, emotions are important for interaction and make it possible to convey one’s internal state and intentions to others (Sander & Scherer, 2014). The perception of emotional expressions is not trivial for infants, and the development of this ability depends on age, on the type of emotion expressed and on its mode of presentation (for reviews see: Bayet, Pascalis, & Gentaz, 2014; Leppänen & Nelson, 2009). The ability to discriminate emotions in unimodal (visual or auditory) or multimodal (audio-visual) conditions does not by itself allow us to determine whether it results from an amodal representation of the emotion or from a sensitivity to specific visual or auditory perceptual features.

Palama, Malsert and Gentaz (2018) overcame this difficulty by using a paradigm of successive cross-modal transfer from emotional voices to emotional faces. They examined the ability of 6-month-old infants to transfer emotional information (angry or happy) cross-modally from voices to faces. The aim of that experiment was to determine whether infants’ discrimination of emotion is based on physical features (visual or acoustic) or whether infants recognize emotion amodally, i.e. independently of the sensory modality. The experiment consisted of six sequences of cross-modal transfers displayed individually to each infant. Each sequence consisted of an auditory familiarization phase, in which a voice (neutral or emotional: happy or angry) was presented, followed by a visual test phase without any sound, in which the two emotional faces (happy and angry) were presented simultaneously, one familiar and the other novel with respect to the emotional voice. Eye movements in response to the visual stimuli were recorded with an eye-tracker. Results suggested no difference in infants’ looking time at the happy or angry face after listening to the neutral voice or the angry voice. However, after listening to the happy voice, infants looked longer at the incongruent angry face (the mouth area in particular) than at the congruent happy face. These results revealed that a cross-modal transfer (from the auditory to the visual modality) is possible for 6-month-olds, but only after the presentation of a happy voice, suggesting that they recognize this emotion amodally. These results are consistent with studies revealing categorical discrimination of happiness and several other emotions (surprise, sadness, fear) from 6-7 months (for reviews: Bayet et al., 2014; Leppänen & Nelson, 2009; Nelson, 1987).

The main goal of the present experiment is to examine the early development of the ability to transfer emotional information from voices to faces in infants aged 2, 4 and 6 months. Using the same experimental paradigm as Palama et al. (2018), but with a standard video camera to record looking time and number of looks, we expected this ability to be present in 6-month-old infants for happy expressions, and we analysed whether it could also be observed earlier. Three prior abilities are necessary, though not sufficient, before 6 months: the auditory ability to discriminate emotions, the visual ability to discriminate emotions, and a cross-modal transfer of general information from the auditory to the visual modality.

First, infants at birth seem able to discriminate between emotions presented in voices, such as fearful compared to happy or neutral voices (Cheng, Lee, Chen, Wang, & Decety, 2012), or happy compared to angry, sad or neutral speech (Mastropieri & Turkewitz, 1999). As early as 3 months, infants detect changes in vocal expression from sadness to happiness (Walker-Andrews & Grolnick, 1983; Walker-Andrews & Lennon, 1991).

Second, infants are able to discriminate visually between happiness and other expressions. This ability already seems possible in newborns under some conditions (Farroni et al., 2007; Field et al., 1982; Rigato et al., 2011). However, these results are not always replicated (Kaitz, Meschulach-Sarfaty, Auerbach, & Eidelman, 1988; Oostenbroek et al., 2016), and happiness seems to be the only facial expression efficiently perceived. Moreover, before 5 months, studies have found a preference for happy compared to neutral faces at 3 months (Kuchuk, Vibbert, & Bornstein, 1986a) or 4 months (LaBarbera, Izard, Vietze, & Parisi, 1976a), and a preference for happy compared to sad faces at 4 months (A. J. Caron, Caron, & MacLean, 1988; Montague & Walker-Andrews, 2002). Studies have also demonstrated discrimination between happiness and surprise (R. F. Caron, Caron, & Myers, 1982; Young-Browne, Rosenfeld, & Horowitz, 1977) or anger (Barrera & Maurer, 1981) at 3 months, sadness at 3-5 months (A. J. Caron et al., 1988; Montague & Walker-Andrews, 2002), and neutral (Bornstein, Arterberry, Mash, & Manian, 2011) and fearful expressions (Bornstein & Arterberry, 2003) at 5 months.

Third, studies have revealed that infants younger than 6 months can encode information in one modality (e.g. auditory) and then perceive this information in another modality (e.g. visual), as suggested by Gibson (1969) (for a review see Streri, 2012). Thus, there is some evidence that newborns can transfer audio-visual information, for example in number perception (Izard, Sann, Spelke, & Streri, 2009) or in the synchrony between speech and faces (Aldridge, Braga, Walton, & Bower, 1999; Guellaï, Coulon, & Streri, 2011). Moreover, studies investigating visual and auditory speech in intermodal matching tasks have observed that from 2 months of age, infants can match vowels (Kuhl & Meltzoff, 1984; Patterson & Werker, 2003).

On the basis of these three prior abilities, we investigated whether the ability to transfer emotional information from voices to faces for happy expressions would also be present in 2- and 4-month-old infants.

Method

6.1.4.1. Participants

The final sample was made up of sixty-one full-term (at least 37 weeks of gestation) infants aged 2, 4 and 6 months, broken down into age groups as follows: 14 2-month-old infants (6 females; mean age = 68.00 days ± 7.76, range = 57-80 days), 19 4-month-old infants (8 females; mean age = 130.89 days ± 10.46, range = 115-146 days) and 28 6-month-old infants (18 females; mean age = 184.50 days ± 9.33, range = 157-199 days). The descriptive characteristics of the sample are as follows: the mean age of the mothers was 34.14 (± 6.8) years and that of the fathers 35.99 (± 6.9) years. Most of the parents who participated in the study were married or living together (95%), while 5% were single mothers raising their child alone. The mothers of the infants tested reported not having been affected by perinatal depression. The family’s socioeconomic status (SES) was calculated using the Largo scale, based on paternal occupation and maternal education and ranging from 2 (the highest SES) to 12 (the lowest SES) (Largo et al., 1989). The mean SES of the families in the sample was 3.59 ± 2.16, range = 2-10. Fifty-one additional infants (14 at 2 months, 10 at 4 months and 27 at 6 months) were excluded due to infant behavior (N=14), failure in recording or coding the videos (N=10), side bias (N=15), i.e. looking to one side more than 95% of the time in at least 3 trials, not looking more than 50% of the time (N=3), or not looking during at least one trial (N=9). Approval for the study was given by the Ethics Committee of the Faculty of Psychology and Educational Sciences of the University of Geneva, and all parents gave written informed consent for the participation of their children in the experiment.

The experiment was performed in accordance with the relevant guidelines and regulations.

6.1.4.2. Stimuli

The auditory and visual stimuli were the same as those used by Palama et al. (2018).

The auditory stimuli were nonverbal emotional voices (happy, angry and neutral) of a woman (ref: SF60), extracted from the “Montreal Affective Voices” database (Belin, Fillion-Bilodeau, & Gosselin, 2008). They were expressive onomatopoeic vocalizations based on the emission of the vowel /a/. Each voice was repeated for 20 seconds, alternating one second of voice and one second of silence. The volume of the auditory stimuli did not exceed 60 dBA. The visual stimuli were a woman’s happy and angry emotional faces (ref: SF4), extracted from “The Karolinska Directed Emotional Faces - KDEF” database (Lundqvist, Flykt, & Öhman, 1998). In these pictures the hair was not visible; each face measured 9.1 x 9.1 cm and was presented in black and white on a medium gray background (RGB 100, 100, 100). Faces were presented in pairs, pseudo-randomized for left and right presentation.

6.1.4.3. Experimental procedure

The experimental procedure was the same as in Palama et al. (2018). Each infant was comfortably installed in a suitable seat, facing a computer screen 60 cm away. The stimulus display screen measured 47.5 x 30 cm with a spatial resolution of 1680 x 1050 pixels. Visual stimuli subtended 8.7° x 8.7° of visual angle. To focus the infant’s attention on the screen just before starting the experiment, we presented a cartoon extracted from “Le Monde des petits”. Gaze on the visual stimuli was recorded with a video camera (Sony HDR-CX220).
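For reference, the reported visual angle is consistent with the stimulus size (9.1 cm) and the viewing distance (60 cm) via the standard formula:

$\theta = 2\arctan\!\left(\frac{s}{2D}\right) = 2\arctan\!\left(\frac{9.1}{2 \times 60}\right) \approx 8.7^\circ$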

This experiment consisted of the presentation to each infant of 6 trials of audio-visual transfer sequences, lasting 3 minutes in total (cf. Fig 1). Each trial consisted of an auditory familiarization phase followed by a visual test phase. The familiarization phase consisted of a 20-second exposure to a voice (neutral, happy or angry prosody) while the display screen remained black. The test phase consisted of the presentation of a pair of emotional faces (happy and angry) for 10 seconds. The left-right position of the two emotional faces was reversed for each voice. The six trials were presented in a fixed order: in the first two, infants heard the neutral voice during the familiarization phase, to obtain a baseline of spontaneous visual preference for one of the emotional faces (angry or happy) without emotional triggering. In the next two trials, infants heard one of the emotional voices during the familiarization phase, first the happy and then the angry voice. Each voice was followed by the test phase presenting the pair of emotional faces (angry and happy), one novel and one familiar with respect to the emotional voice. The last two trials were the same as the two previous ones, but with the faces laterally counterbalanced in the test phase. The happy voice was always presented first, to avoid triggering a negative reaction to the negative stimulus (Geangu et al., 2010).
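For concreteness, the fixed trial order described above can be written out as a simple schedule. The sketch below is illustrative only: the `Trial` structure and the particular left/right face assignments are our assumptions, not the original stimulus-presentation scripts.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    voice: str        # 20 s familiarization: 1 s voice / 1 s silence, black screen
    left_face: str    # 10 s silent test phase: two static faces side by side
    right_face: str

# Fixed order: two neutral baseline trials first, then happy voice before
# angry voice, then the same two emotional trials with the faces laterally
# counterbalanced. The exact left/right assignments here are hypothetical.
TRIALS = [
    Trial("neutral", "happy", "angry"),
    Trial("neutral", "angry", "happy"),   # positions reversed for each voice
    Trial("happy",   "happy", "angry"),
    Trial("angry",   "angry", "happy"),
    Trial("happy",   "angry", "happy"),   # counterbalanced repetition
    Trial("angry",   "happy", "angry"),
]

for i, t in enumerate(TRIALS, 1):
    print(f"Trial {i}: voice={t.voice}, faces L={t.left_face} / R={t.right_face}")
```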

Fig 1. Schematic representation of the successive presentation of all stimuli.

6.1.4.4. Data analysis

The time spent looking at the left or right side of the screen was recorded by a camera. Looking times in response to the visual stimuli in each of the 6 test phases were coded offline with BORIS (Friard & Gamba, 2016) by two naïve observers, with a mean inter-observer agreement of r = .90 (Pearson). We used the mean of the two observations for the analyses.
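As an illustration, this kind of inter-observer agreement can be computed by correlating the two observers’ per-trial looking times and averaging their codings; the numbers below are hypothetical, not the study’s data.

```python
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-trial looking times (s) coded by two naive observers
obs1 = np.array([4.2, 3.1, 5.0, 2.8, 4.6, 3.9])
obs2 = np.array([4.0, 3.3, 4.8, 2.9, 4.7, 3.7])

r, _ = pearsonr(obs1, obs2)          # inter-observer agreement (Pearson's r)
looking_time = (obs1 + obs2) / 2     # mean of the two codings, used for analysis
print(f"Pearson r = {r:.2f}")
```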

We performed repeated-measures analyses of variance (ANOVA) on the total looking times and on the number of looks to each side of the screen (left or right) corresponding to the two emotional faces (happy and angry). The infant’s Proportion of Total Looking Time (PTLT) was also calculated as the difference between the proportions of looking time to the happy (>0%) and to the angry (<0%) face: [(looking time to happy / (looking time to happy + looking time to angry)) − (looking time to angry / (looking time to happy + looking time to angry))]. One-sample t-tests against chance (0%) were conducted on the PTLT to determine whether a looking preference for one emotional face was significantly greater than chance: above 0% for the happy face, below 0% for the angry face.
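A minimal sketch of the PTLT index and the test against chance, assuming per-infant looking times in seconds (the arrays are hypothetical):

```python
import numpy as np
from scipy.stats import ttest_1samp

# Hypothetical per-infant looking times (s) to each face in one voice condition
happy = np.array([3.2, 2.9, 4.1, 3.5, 2.7, 3.8, 3.0, 3.6])
angry = np.array([4.5, 3.8, 4.0, 4.9, 3.9, 4.2, 4.4, 3.7])

total = happy + angry
ptlt = happy / total - angry / total   # >0: happy preference, <0: angry preference

t, p = ttest_1samp(ptlt, 0.0)          # one-sample t-test against chance (0%)
print(f"mean PTLT = {ptlt.mean():+.2%}, t({len(ptlt) - 1}) = {t:.2f}, p = {p:.3f}")
```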

Because Palama et al. (2018) found no differences between male and female infants, we did not analyze gender effects in the present experiment. The significance threshold was .05, Bonferroni tests were performed to determine significant differences, and effect sizes are given as partial eta-squared (ηp2) for the ANOVAs. Statistical analyses were conducted using Statistica 13.
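The reported analyses were run in Statistica; purely as an illustration, an analogous mixed-design ANOVA (age as the between-subjects factor, face as a within-subjects factor) could be sketched in Python with the pingouin package. The data frame below is hypothetical, and the full design, which also crosses the voice factor, is omitted here for brevity.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per infant x face condition
df = pd.DataFrame({
    "infant":       [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "age_group":    ["2m", "2m", "2m", "2m", "4m", "4m",
                     "4m", "4m", "6m", "6m", "6m", "6m"],
    "face":         ["happy", "angry"] * 6,
    "looking_time": [3.1, 4.0, 2.8, 3.9, 3.4, 4.3, 3.0, 4.5, 3.6, 4.8, 3.2, 4.6],
})

# Mixed ANOVA: age between subjects, face within subjects;
# the output reports F, uncorrected p, and partial eta-squared (np2).
aov = pg.mixed_anova(data=df, dv="looking_time", within="face",
                     subject="infant", between="age_group")
print(aov[["Source", "F", "p-unc", "np2"]])
```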

Results

6.1.5.1. Baseline condition: analyses of the effect of the neutral voice on looking time and number of looks

First, we analyzed the baseline condition: the looking time toward the happy or angry face presented after the neutral voice, as a function of age group (2, 4 and 6 months). After the neutral voice, we found no significant difference in looking time toward the emotional faces, F(1, 58) = 1.04, p = .31, ηp2 = .02, no difference between age groups, F(2, 58) = 0.49, p = .61, ηp2 = .02, and no interaction between looking time at the emotional faces and age, F(2, 58) = 0.96, p = .39, ηp2 = .03.

We also analyzed the number of looks toward the happy or angry face presented after the neutral voice, as a function of age group (2, 4 and 6 months). After the neutral voice, the number of looks increased with age. Indeed, the age effect was significant (F(2, 58) = 11.76, p < .001, ηp2 = .29): the number of looks directed at each emotional face was higher in 6-month-old infants (2.65 ± 0.16) than in 4-month-olds (1.80 ± 0.19; p < .01) and 2-month-olds (1.41 ± 0.22; p < .001). All other factors and interactions were not significant (all p > .50).

6.1.5.2. Experimental conditions: analyses of the effect of emotional voices on looking time and number of looks

Second, we analyzed the looking time directed at the happy and angry faces presented after the emotional voices (happy or angry), as a function of age group (2, 4 and 6 months). The age effect was not significant (F(2, 58) = 1.90, p = .16, ηp2 = .06) and did not interact with the other factors (all p > .35). In particular, the three-way interaction among age, face and voice was not significant (F(2, 58) = 1.01, p = .37, ηp2 = .00). We found a main effect of face (F(1, 58) = 9.48, p < .01, ηp2 = .14): the angry face was looked at longer (4.41 ± 0.19 s) than the happy face (3.44 ± 0.17 s). The voice effect was not significant (F(1, 58) = 1.29, p = .26, ηp2 = .02).

The interaction between the emotional-voice familiarization condition and the emotional face was not significant (F(1, 58) = 0.11, p = .74, ηp2 = .00). Nevertheless, according to Iacobucci (2001), it is possible to examine the effect of a non-significant interaction under certain conditions: if a simple effect is significant, we can explore its effect on the second, non-significant, one. Under these circumstances, we can explore our a priori hypotheses, i.e. the effect of voices on the looking time directed at faces, as carried out in Palama et al. (2018). Pre-planned comparisons showed that, after hearing the happy voice, infants looked longer at the angry face (4.51 ± 0.22 s) than at the happy face (3.45 ± 0.23 s) (F(1, 58) = 6.54, p < .05). After hearing the angry voice, infants tended to look longer at the angry face (4.29 ± 0.26 s) than at the happy face (3.43 ± 0.23 s) (F(1, 58) = 3.83, p = .055) (cf. Fig 2).

Fig 2. Looking time at happy or angry faces. Infants’ mean looking time (s) as a function of the voice heard. After hearing the happy voice, infants looked longer at the angry face than at the happy face (F(1, 58) = 6.54, p < .05). The vertical bars represent positive standard errors, * p < .05.

We then analyzed the number of looks toward the happy or angry face presented after the emotional voices (happy or angry), as a function of age group (2, 4 and 6 months). The age effect was significant (F(2, 58) = 10.27, p < .001, ηp2 = .26). After Bonferroni correction, the number of looks directed at each emotional face was higher in 6-month-old infants (2.38 ± 0.11) than in 2-month-olds (1.21 ± 0.21; p < .001), with a tendency relative to 4-month-olds (1.80 ± 0.18; p = .055). The face effect was also significant (F(1, 58) = 5.72, p < .05, ηp2 = .29): infants directed their gaze more often to the angry face (1.86 ± 0.11) than to the happy face (1.73 ± 0.11). All other factors and interactions were not significant (all p > .50).

However, we performed pre-planned comparison analyses to explore our a priori hypotheses, i.e. the effect of voices on the number of looks directed at faces (Iacobucci, 2001). These analyses showed that after hearing the happy voice, infants tended to direct their gaze more often to the angry face (1.91 ± 0.12) than to the happy face (1.77 ± 0.12) (F(1, 58) = 3.02, p = .09), but not after the angry voice (F(1, 58) = 2.50, p = .12).

6.1.5.3. Baseline and experimental conditions: analyses of the infant’s Proportion of Total Looking Time (PTLT) to happy or angry faces as a function of age and voice

Third, to determine whether a preference for an emotional face was significantly greater than chance as a function of age group (2, 4 and 6 months) and voice condition (angry, happy or neutral), we conducted one-sample t-tests against chance (0%, no preference) on the PTLT to the happy face (>0%) and the PTLT to the angry face (<0%), separately by age and voice condition (cf. Fig 3). Results suggested that only the 6-month-old infants exposed to the happy voice showed a significant looking preference for the angry face (-12% ± 4%) (t(27) = -2.69, p < .05).

Fig 3. PTLT to happy (>0) or angry (<0) faces, as a function of voice (neutral, happy or angry) and age (2, 4 and 6 months). After hearing a happy voice, only 6-month-old infants looked longer at the angry face than at the happy face (t(27) = -2.69, p < .05). The vertical bars represent standard errors, * p < .05.

Discussion

The aim of this experiment was to determine whether 2- to 6-month-old infants are able to extract and transfer amodal components of emotional expressions (happy or angry) in a cross-modal transfer paradigm from the auditory to the visual modality. The present experiment suggests the presence of a cross-modal transfer from the happy voice to emotional faces at 6 months only.

On the one hand, after the neutral voice, both faces were looked at equally, in accordance with the previous study (Palama et al., 2018). We did not find a spontaneous preference for the happy face, as suggested by several studies (A. J. Caron et al., 1988; Kuchuk, Vibbert, & Bornstein, 1986b; LaBarbera, Izard, Vietze, & Parisi, 1976b; Montague & Walker-Andrews, 2002). With regard to the cross-modal transfer, the absence of a visual preference for one facial expression over the other was expected with the neutral voice: in this emotional cross-modal transfer, after a neutral voice both the angry and the happy face are novel emotions. These results are not consistent with the idea of a spontaneous visual preference for happiness.

On the other hand, after the emotional voices (angry and happy), the angry face was looked at more often and longer than the happy face; pre-planned comparisons showed that it is especially after the happy voice that the angry face was looked at more often and longer than the happy face. Moreover, a looking preference for the angry face significantly greater than chance was found after the happy voice. This result suggests the emergence of an emotional cross-modal transfer. However, some caution is warranted about the presence of such an ability. Indeed, as shown in a previous study (Palama et al., 2018), it is possible that the preference for the angry face is affected by the saliency of the mouth. Nevertheless, it is after listening to the emotional voices, and especially after the happy voice, that the angry face is looked at longer, suggesting that it is the novel, incongruent face that drives attention more. This is consistent with a previous eye-tracking study at 6 months (Palama et al., 2018) and with Montague and Walker-Andrews (2002), who demonstrated a preference for the incongruent happy or angry mother’s expression at 4 months. However, most intermodal matching studies with angry or happy expressions have found a preference for the congruent expression (Soken & Pick, 1992, 1999; Vaillant-Molina, Bahrick, & Flom, 2013; Walker, 1982; Walker-Andrews, 1986). This study showed the ability to transfer emotional information from voices to faces, suggesting an amodal comprehension of the happy emotion at 6 months only.

The primacy of a cross-modal transfer from the happy voice to the happy face may be