
FACS coding was performed by the author and a student earning credit toward a Master's degree in psychology at the University of Geneva in 2008. Reliability of facial action coding was assessed in several ways. First, the author took and passed the FACS final certification test, achieving a mean agreement ratio of 0.88 with the authors of FACS (the minimal requirement for passing is set at 0.70). The second coder did not attempt the final test but was trained in FACS by a certified FACS coder, Prof. Susanne Kaiser. Her training took place over one semester prior to the coding of the research videos. Besides using the FACS manual as a constant aid to scoring decisions, we established a scoring protocol to ensure adequate comparability of procedures. Both coders scored in our lab on a computer equipped with a 17-inch screen (resolution 1680 x 1050, refresh rate 60 Hz).

During the scoring phase, the author and the assistant coder worked independently and were blind to the judgment cluster to which each sequence belonged. Each sequence was viewed mute to avoid being influenced by speech content. The first pass was viewed at normal speed to get a realistic impression of the sequence. On successive passes, we viewed the record in slow motion from beginning to end, examining each individual AU independently, always starting with the upper face and finishing with the lower face region. When a scorable action was identified, « start » and « end » tags were placed at the onset and offset of the event. At this stage, the event was located on the timeline at frame-by-frame resolution. When it was difficult to determine whether an AU was involved, we reviewed the event a maximum of three times. If the uncertainty was not resolved after three reviews, we scored as though the suspected AU(s) had not occurred. Thus, only the most obvious aspects of the activity were scored.
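The frame-level onset/offset tagging described above can be represented minimally as in the sketch below. The class and field names are illustrative assumptions, not the actual annotation software used, and the 25 fps frame rate is inferred from the 1/25th-second resolution reported later.

```python
from dataclasses import dataclass

FPS = 25  # assumed video frame rate (1/25th-second frame resolution)

@dataclass
class AUEvent:
    """One scorable facial action, tagged at frame-level precision."""
    au: str            # FACS Action Unit label, e.g. "AU12"
    onset_frame: int   # frame index of the « start » tag
    offset_frame: int  # frame index of the « end » tag

    def onset_seconds(self) -> float:
        # convert the onset tag from frames to seconds
        return self.onset_frame / FPS

    def duration_frames(self) -> int:
        # event length between the « start » and « end » tags
        return self.offset_frame - self.onset_frame

# hypothetical event: a smile onset tagged at frame 150, offset at 275
event = AUEvent("AU12", onset_frame=150, offset_frame=275)
# onset at 6.0 s; duration 125 frames (5.0 s at 25 fps)
```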

Exhaustive FACS coding can be problematic when subjects are speaking because certain lower face AUs may be involved in speech articulation: 10, 14, 16, 17, 18, 20, 22, 23, 24, 25, 26, and 28. Initially, Ekman discouraged scoring AUs 17, 18, 22, 23, 24, 25, or 26 if they coincided with speech and recommended instead using Action Descriptor 50 to indicate that the person was talking. In the 2002 version of the FACS Investigator's manual, he revised this position, stating: « …we have found since that all these actions can be scored and we now only omit 25 and 26 when 50 is scored. For almost all of these AUs the amount of action required by talking is below what has been set as the criteria for the B intensity in the FACS manual. » (Ekman and Hager, 2002). In other words, coders should be sensitive to the intensity of an action when deciding whether or not to code during speech.

When actions are more intense than needed for mere articulation, they ought to be scored. Nevertheless, to remain cautious, we decided for this study not to score AU16, AU18, AU19, AU22, AU25, AU26, AU27, AD18, and AD19 while subjects were speaking. Miscellaneous and other non-FACS codes were scored as described above. The Master's student double-coded 80 video sequences (40% of the corpus) extracted randomly from the core dataset. Because we were unable to recruit another person to work on the non-FACS codes, we assessed the author's own intra-individual reliability for these categories by rescoring 30% of the dataset on these codes, with a one-year interval between the two sessions. In all cases, scoring agreement was quantified with Cohen's Kappa, a standard measure of observer agreement defined as Kappa = (p observed − p chance) / (1 − p chance). Kappa ranges from −1 to 1, with 0 indicating chance-level agreement (Cohen, 1960; Bakeman and Gottman, 1997). Coefficients from 0.40 to 0.60 indicate fair reliability; coefficients from 0.61 to about 0.75 are considered good; and 0.75 or higher indicates excellent reliability (Fleiss, 1981).
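As a concrete illustration of the Kappa formula above, the following sketch computes Cohen's Kappa from two coders' frame-by-frame presence/absence judgments for a single AU. The coder sequences are invented toy data, not values from the study.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Kappa = (p_observed - p_chance) / (1 - p_chance) for two label sequences."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # observed agreement: proportion of frames where the two codes match
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # chance agreement: sum over labels of the product of marginal proportions
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)

# toy frame-by-frame codes (1 = AU present, 0 = absent)
a = [0, 0, 1, 1, 1, 0, 0, 1, 0, 0]
b = [0, 0, 1, 1, 0, 0, 0, 1, 0, 0]
kappa = cohens_kappa(a, b)  # ≈ 0.78, "excellent" by the Fleiss (1981) benchmarks
```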

The reliability of FACS scoring was assessed at two levels of analysis: 1) agreement on the occurrence of individual AU scoring; and 2) temporal precision of individual AU scoring for onsets and offsets. In a seminal work on the scoring reliability of FACS codes for non-acted expressions, Sayette et al. (2001) showed that a precise frame-by-frame unit of measurement usually provides adequate Kappas, but that the coefficients improve significantly with a 1/6th-second tolerance window. For most purposes they consider a ½-second tolerance window acceptable. Since brief latencies are crucial to our hypotheses, we found it necessary to use smaller tolerance windows. In assessing the precision of scoring, we used tolerance windows of 0 and 5 frames, which correspond to a temporal precision of 1/25th and 1/6th of a second, respectively. Coders were considered to agree on the occurrence of an AU if they both identified it within the same time window. Results are reported in Tables 9, 10, 11 and 12.
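The tolerance-window criterion can be made concrete with a sketch like the one below: two coders' onset frames for the same AU count as an agreement when they fall within `tolerance` frames of each other (0 frames for the exact-frame criterion, 5 frames for the looser window). The function name and the frame numbers are invented for illustration.

```python
def match_onsets(onsets_a, onsets_b, tolerance):
    """Count onset agreements: each coder-A onset may match at most one
    coder-B onset lying within `tolerance` frames (greedy, in sorted order)."""
    unmatched_b = sorted(onsets_b)
    agreements = 0
    for frame in sorted(onsets_a):
        for candidate in unmatched_b:
            if abs(candidate - frame) <= tolerance:
                unmatched_b.remove(candidate)  # each onset matched only once
                agreements += 1
                break
    return agreements

# toy onset frames for one AU from two independent coders
a = [120, 348, 502, 910]
b = [121, 352, 640, 909]
exact = match_onsets(a, b, tolerance=0)  # 0 agreements at the exact-frame criterion
loose = match_onsets(a, b, tolerance=5)  # 3 agreements within the 5-frame window
```

Agreement counts obtained this way feed into the Kappa computation at each tolerance level.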

Table 9. Kappa's Coefficients for Single Upper Face Action Units

Upper Face                            Onset             Offset
Codes        Frames   Occurrence   1/25th   1/6th    1/25th   1/6th
AU1            1913      0.79       0.71     0.79     0.57     0.68
AU2             436      0.88       0.80     0.87     0.64     0.80
AU1+2         14787      0.82       0.74     0.80     0.65     0.78
AU4            2954      0.87       0.69     0.82     0.66     0.81
AU5            8314      0.92       0.67     0.91     0.60     0.90
AU6            5342      0.70       0.58     0.70     0.49     0.61
AU7            2882      0.68       0.49     0.67     0.51     0.65

Note. Onset and offset tolerance windows are expressed in seconds.

Table 10. Kappa's Coefficients for Single Lower Face Action Units

Lower Face                            Onset             Offset
Codes        Frames   Occurrence   1/25th   1/6th    1/25th   1/6th
AU9            2552      0.85       0.69     0.80     0.66     0.78
AU10           7026      0.91       0.79     0.91     0.70     0.80
AU11            380      0.38       0.27     0.35     0.22     0.28
AU12          12881      0.93       0.87     0.90     0.59     0.68
AU12A          1359      0.74       0.65     0.72     0.50     0.61
AU12U           458      0.81       0.63     0.81     0.60     0.76
AU13             62      0.35       0.30     0.32     0.25     0.28
AU14           2964      0.78       0.69     0.76     0.58     0.70
AU14A           114      0.67       0.48     0.60     0.45     0.62
AU14U          1469      0.72       0.61     0.72     0.59     0.70
AU15           5463      0.68       0.59     0.65     0.54     0.59
AU16           1584      0.64       0.54     0.64     0.58     0.62
AU17          10027      0.82       0.75     0.81     0.72     0.75
AU18            272      0.30       0.21     0.29     0.20     0.25
AU20           2260      0.78       0.70     0.75     0.65     0.73
AU22            173      0.32       0.29     0.32     0.28     0.30
AU23           1811      0.65       0.58     0.63     0.50     0.59
AU24           1901      0.69       0.54     0.67     0.61     0.63
AU25          13149      0.92       0.82     0.90     0.70     0.88
AU26           9044      0.93       0.75     0.89     0.72     0.89
AU27             76      1.00       0.80     0.96     0.70     0.76

Note. Onset and offset tolerance windows are expressed in seconds.

Table 11. Kappa's Coefficients for Miscellaneous FACS Codes

Miscellaneous                         Onset             Offset
Codes (AU/AD)  Frames  Occurrence  1/25th   1/6th    1/25th   1/6th
AU8(25)            0       --        --       --       --       --
AD19              74      0.31      0.26     0.27     0.21     0.28
AU21             607      0.49      0.43     0.48     0.40     0.45
AD29              28       --        --       --       --       --
AD30              22       --        --       --       --       --
AU31               0       --        --       --       --       --
AD32             179      0.66      0.52     0.65     0.51     0.60
AD33             343      0.70      0.63     0.70     0.62     0.68
AD34              40       --        --       --       --       --
AD35              30       --        --       --       --       --
AD36              18       --        --       --       --       --
AD37             351      0.82      0.75     0.80     0.70     0.76
AU38             101      0.36      0.23     0.32     0.20     0.27
AU39             101      0.28      0.21     0.25     0.22     0.30

Note. Onset and offset tolerance windows are expressed in seconds.

Table 12. Kappa's Coefficients for Non-FACS Codes

Non-FACS                                  Onset             Offset
Codes                Frames  Occurrence  1/25th  1/6th    1/25th  1/6th
Blink                  9691     0.90      0.82    0.90     0.76    0.80
Eyelids Droop          4857     0.85      0.78    0.83     0.69    0.72
Look At               15222     0.96      0.80    0.94     0.79    0.92
Look Away             18828     0.95      0.70    0.89     0.83    0.90
Look Down              7602     0.80      0.76    0.79     0.74    0.79
Look Up                1070     0.87      0.69    0.82     0.70    0.78
Lower Head             1380     0.94      0.83    0.91     0.78    0.89
Head Turns             2492     0.83      0.77    0.81     0.78    0.80
Head Down              1223     0.89      0.71    0.79     0.74    0.79
Head Raise             2065     0.76      0.69    0.74     0.68    0.70
Head Raise and Turn    2980     0.92      0.73    0.92     0.80    0.89
Head Lower and Turn    2240     0.74      0.71    0.72     0.69    0.73
Head Raised            1178     0.75      0.65    0.71     0.58    0.68
Head On               15869     0.90      0.80    0.88     0.75    0.82
Head Turned Away      11237     0.71      0.60    0.69     0.65    0.70
Head Tilted Side       3147     0.83      0.74    0.80     0.70    0.80
Head Tilting Side      3205     0.70      0.49    0.58     0.60    0.69
Head Shake             2861     0.87      0.82    0.87     0.78    0.83
Head Nod               1653     0.93      0.88    0.92     0.76    0.90
Pause                 15917     0.97      0.80    0.95     0.74    0.79
Speak                 27748     0.96      0.90    0.95     0.91    0.94
Hesitation             1559     0.88      0.85    0.88     0.76    0.83
Verbal Filler          1411     0.77      0.70    0.75     0.68    0.72
Word Stress            3248     0.80      0.72    0.80     0.76    0.79
False Start            3247     0.96      0.88    0.94     0.85    0.92
Manipulator               0      --        --      --       --      --
Autocontact             455     0.82      0.78    0.82     0.80    0.82
Laughing               1352     0.94      0.87    0.93     0.85    0.90
Crying                  833     0.97      0.90    0.95     0.76    0.92

Note. Onset and offset tolerance windows are expressed in seconds.

Results 

Using a 1/6th-second tolerance window, all upper and lower face action units had good to excellent reliabilities for scoring onsets, with four exceptions: AU11 (Nasolabial furrow deepener), AU13 (Sharp lip puller), AU18 (Lip pucker), and AU22 (Lip funneler) (see Tables 9 and 10). The results are similar for offset scoring, except for AU23 (Lip tightener), whose coefficient regresses to a still acceptable value of 0.59. Generally, as the tolerance window decreased to the exact-frame criterion, the number of AUs with good to excellent reliability decreased. However, even at this smallest possible tolerance window, 16 of the 27 AUs continued to have good to excellent reliability for both onset and offset scoring. Moreover, AUs 6 (Cheek raise), 7 (Lids tight), 14A (Asymmetric dimpler), 15 (Lip corner depressor), 16 (Lower lip depressor), 23 (Lip tightener) and 24 (Lip presser) still achieved acceptable scores at 1/25th-second, ranging from 0.40 to 0.59.

We are unable to report Kappas for 50% of the miscellaneous codes. This is due in part to the low frequency in the database of seven codes in that category: 8(25) (Lips toward each other), 29 (Jaw thrust), 30 (Jaw sideways), 31 (Jaw clencher), 34 (Puff), 35 (Cheek suck), and 36 (Tongue bulge). Furthermore, agreement on occurrences for three more miscellaneous actions, 19 (Tongue show), 38 (Nostril dilate) and 39 (Nostril compress), yielded unsatisfactory coefficients. The large majority of additional non-FACS scores have good to excellent Kappas for occurrences as well as for event start and end times at exact-frame resolution. Two exceptions are borderline scores for the onsets of Head Tilting Side (at 1/25th and 1/6th-second) and the offset of Head Raised at 1/25th-second. Generally, the reliability analyses indicate good to excellent scores at exact-frame resolution for both lower and upper face action units that are elements of the emotion prototypes proposed by discrete emotion theorists. One exception is AU11 (« Nasolabial furrow deepener »), which is sometimes involved in « Sadness » expressions. Note, however, that scored AU11 represents less than one percent of the total upper and lower face action units in the database. The FACS miscellaneous codes proved unfit for further analysis due to their low frequency of occurrence and unreliable Kappas. On the other hand, our additional nonverbal categories are stable over time and can be included in further multimodal analyses.
