Chapter 3 Physiological signals recording and processing

3.3 Characterization of physiological activity

3.3.2 Advanced features

This section proposes specific features for each signal type based on the variables that are often analyzed in psycho-physiology.

a. EEG

The cognitive theory of emotions provides a strong motivation to go toward emotion assessment using signals from the CNS. As described in Section 2.2.1.c, several researchers have shown the involvement of brain structures in emotional processes. When using EEG to record emotional activity, most of the results were obtained by comparing the energy in different frequency bands.

For instance, Davidson [81] demonstrated the lateralization of the frontal cortex by studying the energy of the EEG alpha waves. Aftanas et al. [82] reported energy differences between more or less arousing visual stimuli (images from the IAPS). Those last differences were observed from the energy in theta bands for the parietal and occipital areas, alpha bands for the frontal areas and gamma bands for the complete scalp. However, energy is not the only feature that can be used for the purpose of emotion assessment since studies also demonstrated that there are specific patterns of synchronization between brain areas during emotional processes [61, 129]. For those reasons different types of features were defined to characterize EEG signals. They were regrouped in five feature sets according to the type of features extracted.

Energy

This set of features, named EEG_FFT, was defined to represent the energy of EEG signals in frequency bands known to be related to emotional processes [81, 82]. For each electrode i, the energy in the different frequency bands displayed in Table 3.2 was computed over all samples belonging to a specific trial, using the Fast Fourier Transform (FFT) algorithm. This was done under the assumption that the EEG signals are stationary over the whole duration of a trial.

Feature for

electrodes i Frequency band

i 4-8 Hz

i 8-12 Hz

i 12-30 Hz

Table 3.2. The energy features computed for each electrode and the associated frequency bands.

Moreover, the following EEG_W feature was added because it is known to be related to cognitive processes like workload, engagement, attention and fatigue [130-133]. To compute this feature the energy in each frequency was summed over the Ne electrodes:

1

1

_ log( )

e

e

N i i N

i i

i

EEG W (3.7)

The feature vector fEEG_FFT, which corresponds to one trial, is thus composed of 3.Ne +1 features and the EEG_FFT feature set contains all the fEEG_FFT vectors.

Lateralization of EEG alpha waves

The results obtained in [19] are at the origin of the creation of this feature set called EEG_Lateral. This study demonstrated the correlation between a EEG asymmetry score in the frontal region and reported measures of general tendency to approach or withdraw. Similar results have also been reported, with less reproducibility, for stimuli of negative and positive valence [58, 78]. The asymmetry score was computed according to the following formula:

log( ) log( )R L

AS P P (3.8)

where PR and PL are the power of the EEG signal in the alpha frequency band (defined as the 8-13 Hz band) respectively for a given right electrode and the left symmetrical electrode (for instance electrodes F4 and F3).

In this study the asymmetry score was computed for several pairs of electrodes, including non-frontal electrodes.

Asymmetry score Right electrode Left electrode

AS_1 Fp2 Fp1

AS_2 AF4 AF3

AS_3 F4 F3

AS_4 FC4 FC3

AS_5 C4 C3

AS_6 CP4 CP3

AS_7 P4 P3

AS_8 PO4 PO3

AS_9 O2 O1

Table 3.3. The pairs of electrodes used to compute 9 asymmetry scores.

For each trial, the alpha power of a given electrode was computed from the FFT applied on the complete duration of the trial; the asymmetry scores were then computed and a feature vector fEEG_Lateral was constructed by concatenation of the asymmetry scores:

_ [ _1, , _ 9]

EEG Lateral AS AS

f (3.9)

Finally, the EEG_Lateral feature set is composed of the ensemble of the fEEG_Lateral feature vectors obtained for each trial.

Average energy over brain areas

This feature set was computed to assess emotions elicited by the visualization of images with emotional content, as described in Chapter 5. The choice of those features was based on the study of Aftanas et al. [82] who showed a correlation between arousal elicited by images and responses in particular frequency bands and brain areas. As can be seen in Figure 3.6, areas were defined as groups of several electrodes (e.g. there are 6 electrodes in area PT, P, O). This figure also indicate the name of the rhythms of interest ( 1, 2, , etc.) together with the associated frequency bands.

Power values of the 6 frequency bands listed in Figure 3.6 were computed for each electrode and for the whole duration of a given trial using the FFT algorithm. As several electrodes are located in the same area, the power over all these electrodes were averaged yielding a total of 6 features for this EEG feature set (e.g. feature one is the average power in band 1 over all electrodes in areas PT, P, O). Most of the features concern the Occipital (O) lobe, which is not surprising since this lobe corresponds to the visual cortex and subjects were stimulated with pictures [82].

Figure 3.6. Top head view with EEG electrode locations and corresponding frequency bands (from [82]).

For each trial a feature vector fEEG_Area was constructed by concatenating the 6 EEG features:

_ [ _ _1, , _ _ 6]

EEG Area EEG Area EEG Area

f (3.10)

and the EEG_Area feature set was defined as the set of all vectors computed from the different trials of a given protocol.

Spectrogram

In the two feature sets, energy features were computed from EEG signals by applying the FFT algorithm on the whole duration of a trial. This could be done reliably under the assumption that the EEG signals are stationary for the duration of each trial. However this is rarely true since EEG signals can only be considered as stationary on short periods of time. This comment also applies for the computation of mean and variance features. To solve this issue the following feature set was defined under the assumption that EEG signals are stationary on short time windows of 0.5 seconds.

This set of EEG features was extracted by computing the Short-Time Fourier Transform (STFT) for each electrode with a sliding window of 512 samples and 50% overlap between consecutive windows. This window size was chosen in order to have a frequency resolution of f = 2Hz which allows to separate the different rhythms observed in EEG signals while maintaining a time resolution of 0.5 second. For each of the spectrograms (one per electrode), we selected 9 frequency bands of 2Hz ranging from 4Hz to 22Hz. This range was chosen because it includes most of the frequency bands used for the computation of the EEG_FFT and EEG_Area features set.

The fEEG_STFT feature vector for a given trial is then constructed by concatenating all the power values of the 9 frequency bands at the different time frames and for each electrode. The length of fEEG_STFT is thus: EEG_Area_5 [AT;F;C] 1 (12-18Hz) EEG_Area_6 [C;PT;P;O] 3 (22-30Hz)

( EEG STFT_ ) 9 e 4 1

lengthf N T

where 9 is the number of frequency bands, Ne the number of electrodes and 4T 1 gives the number of time frames for a trial of duration T seconds. The feature set composed of all the fEEG_STFT feature vectors (one per trial) is named EEG_STFT.

Mutual Information features

In this feature set, mutual information (MI) between pairs of electrodes is proposed as a measure of statistical dependencies between different areas of the brain. This set of features was motivated by studies that demonstrated synchronization of brains areas in emotional processes [61, 129].

With the assumption that the signal xi of electrode i for a given trial is a stochastic process with probability mass function P(Xi), the mutual information between electrodes i and j for this trial is expressed as: variables Xi and Xj. Mutual information was computed using Moddemeijer’s Matlab toolbox [134]8 that estimates the different distributions based on histograms and automatically determines an appropriate bin size.

The MI feature vector fEEG_MI of this trial is then constructed by concatenation of mutual information between each pairs of electrodes:

_

The total number of features of a trial for Ne electrodes is then:

1

set containing all feature vectors was named EEG_MI.

8 available at http://www.cs.rug.nl/~rudy/matlab/ (retrieved on 27 April 2009)

b. Galvanic Skin Response (GSR)

The mean skin resistance over a trial has been shown to be correlated with the arousal of a stimulus [7] and has been widely used for emotion assessment from peripheral physiological signals. An aroused emotion should also induce a decrease in the GSR signal. For those reasons mean skin resistance and the trend of the GSR signal were frequently added to the peripheral feature sets. However, as described in Section 2.2.2.a, the speed of the fall of a GSR signal is also of importance to characterize the EDA. To approximate this value the average decrease rate during decay time was defined as follows:

/ '( ) 0

with Nn being the number of samples where the GSR derivative GSR' is negative. The proportion of negative samples in the derivative was also computed to characterize the duration of GSR falls:

While these features can correctly characterize GSR signals when it is expected that only one EDR occurs, it is important to count the number of EDR’s if several responses occur in a single trial. For this purpose an automatic EDR detection algorithm called nbPeaks was designed. In a first step this algorithm identifies consecutive high and low peaks of a GSR signal by finding the sign changes of the signal derivative. High peaks were identified as potentially being the beginning of an EDR whereas the low peaks were identified as the potential apex of a response.

Finally a response was counted if the following criteria were met:

- the difference in amplitude between the beginning and the apex of a response is higher than 200 Ohms;

- the time elapsed between the beginning and the apex of a response is between 1 and 5 seconds.

Thus the last feature computed for the GSR signal is:

( )

NbPeaks

fGSR nbPeaks GSR (3.15)

Concerning the time aspects, the EDR is known to occur from 1 to 4 seconds after a stimulus [88]. It is thus important to record a GSR signal for more than 4 seconds after the presentation of an emotional stimulus to be sure that at least part of the reaction is recorded by the sensor.

c. Blood volume pulse (BVP)

The BVP signal given by the pletysmograph is a relative measure of the blood pressure (BP): it is correlated with the BP but does not measure its exact value. Since variations of BP and blood flow are known to be associated to several emotions (Sinha, Healey, Section 2.2.2b) the features computed from the BVP signal were the mean, the standard deviation and minimum / maximum values as described in Section 3.4.1.

d. Heart rate (HR)

The increase and decrease of HR is associated to many emotions (Section 2.2.2.c) thus the average HR computed over the whole duration of an emotional stimulus is a feature that can be used for emotion assessment. Another variable of interest for the study of emotions is the Heart Rate Variability (HRV). As detailed in Section 2.2.2.c the HRV can be determined from the HR power spectrum and by computing HR standard deviation.

Three components in the HR spectrum (frequency bands) of interest are generally considered to characterize HRV: a High Frequency band (HF, 0.15Hz-1Hz), a Low Frequency band (LF, 0.05Hz-0.15Hz) and a Very Low Frequency band (VLF, 0.0033Hz-0.05Hz). As a consequence, it is necessary to have signals of significant duration to correctly record HRV components. For instance the committee report of the Society for Psychophysiological Research [96] recommends using epochs of at least 1 minute to reliably compute the HF component and 2 minutes for the LF component. Those durations correspond to approximately 6-10 times the length of the wave with the lowest frequency (0.15Hz for HF and 0.05Hz for LF). Actually, it is possible to determine (less reliably) the energy in the HF band on epochs of 6.6 seconds and in the LF band on epochs of 20 seconds. If HRV is estimated using the standard deviation of heart rate, then it is computable on any period of time including at least 2 HR values (3 heart beats, which generally corresponds to less than 3 seconds); the result will however only provide information about the energy in frequency bands limited by the duration of the epoch.

The VLF component was not investigated in this study since it needs too long trial duration for its computation. Before computing energy in the HF and LF frequency bands the HR signal was interpolated with a cubic interpolation to obtain a new signal sampled at 1024Hz. This first step allows to:

- transform the original signal that has a low sampling rate and which is not sampled at regular intervals into a regularly sampled signal with a high sampling rate;

- obtain a signal with the same sampling rate than the others which facilitates the comparison of the signals over time (correlation measures, etc.).

The FFT algorithm was then applied to the interpolated HR signal to compute the power in the two frequency bands of interest. Finally, the ratio between the power in the HF and LF frequency bands was computed to reflect the cardiac balance between the sympathetic and parasympathetic activity (see Section 2.2.2.c). Table 3.2 summarizes the three features computed from the HR power spectrum.

Feature name Description / formula

LF

Table 3.4. The three features computed from the HR

e. Respiration

The respiration rate ranges from 0.1 to 0.35 breaths / second at rest, while it can reach 0.7 breaths / second during exercise. Moreover, if respiration is measured with a belt, irregular respiration lead to energy increases in higher frequency bands. Laughing also affects the respiration pattern by introducing high-frequency fluctuations on the recorded signal. To measure the main respiration frequency (i.e. respiration rate) at rest it is thus necessary to have a signal of at least 10 seconds (based on the lower bound of the respiration rate). Nevertheless, shorter periods of time could still contain interesting information since respiration rate at rest can increase up to 0.35 breaths / second which can be measured on an epoch of around 3 seconds. To capture those fluctuations, features from both the frequency and time domain are therefore used.

Features of the frequency domain are obtained by computing the FFT of the original signal. Then six features were computed to measure the power of the respiration signal in consecutive frequency bands. Table 3.5 summarizes the different frequency bands of interest. Those frequency bands were concatenated in a feature vector called fRespPow.

Power feature Frequency band Power feature Frequency band

1

Table 3.5. The power features computed from the respiration signals and being part of the fRespPow feature vector.

The main frequency of the respiration signal was considered to represent the respiration rate and is computed as:

0.1 0.7

arg max ( )

Rate

Resp f Resp

f P f (3.16)

where PResp(f) is the power of the respiration signal at frequency f. The precision of this feature depends on the frequency resolution obtained from the FFT which depends on the duration of the signal used for the FFT computation.

Finally, to identify the deepest inhalation or exhalation in a respiration signal Resp, the dynamic range feature was computed as:

max ( ) min ( )

DR

Resp n n

f Resp n Resp n (3.17)

f. Temperature

Generally, changes in skin temperature are quite slow and difficult to detect on short time periods. However, since skin temperature is influenced by vasoconstriction and sweat, it is possible to observe short thermal reactions of a few seconds. For instance, Ekman [5] showed that the average value of skin temperature measured on a hand over an epoch of 10 seconds could be used to distinguish anger from fear and sadness. For this reason mean temperature was used as a feature in this study. The average derivative of the temperature signal was also used to indicate the trend of the signal.

Dans le document Emotion assessment for affective computing based on brain and peripheral signals (Page 77-85)