
4 EMERGENT LIQUID STATE AFFECT (ELSA): A NEW MODEL FOR THE DYNAMIC SIMULATION

4.5 Data simulation

4.5.2 Synthetic data

Data and design. The purpose of the synthetic data was to demonstrate the most important properties of the ELSA model, using data that contained known effects and that imitated the type of measurements that might be recorded during a real emotion elicitation experiment. We constructed the data to challenge ELSA with the following problems:

- Distinguishing signal from noise
- Modelling delay effects from one time series to another
- Modelling nonlinearity
- Handling between-participant heterogeneity of measurements
- Detecting complete transient synchronization of measurements

We devised a fictitious study with 40 participants. Appraisals were induced by presenting a stimulus that we expected to be appraised in a certain way. For example, participants were shown a novel stimulus and we expected them to appraise this stimulus as novel. The responses of each participant were “recorded” once for 300 time steps, of which 150 steps represented the pre-stimulus baseline, 50 steps represented a manipulation window in which the stimulus was presented, and 100 steps represented the post-stimulus phase. The appraisal input for the synthetic simulation corresponded to a binary vector that coded for the absence or presence of the stimulus, i.e., {0,1,0} for {150,50,100} steps respectively (see Figure 4.11, bottom row).
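The appraisal input described above can be sketched in a few lines. This is an illustrative Python fragment; the names and defaults are ours, not taken from the original implementation:

```python
import numpy as np

def appraisal_input(baseline=150, window=50, post=100):
    """Binary design vector: 0 during the baseline, 1 while the
    stimulus is on, 0 again during the post-stimulus phase."""
    return np.concatenate([np.zeros(baseline),
                           np.ones(window),
                           np.zeros(post)])

u = appraisal_input()  # 300 time steps in total, as in the fictitious study
```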

Emotion measurements consisted of two variables each for the motivation, physiology, and expression components. For each pair, one variable consisted of random white noise (a so-called probe) whereas the other variable carried a relevant signal. The signal data are plotted in Figure 4.11. The left panel depicts the clean source signals and the right panel shows a noised version generated for participant 5. The clean source signals show that, shortly following the stimulus presentation (i.e., appraisal induction), there is a simultaneous response peak of motivation and expression, followed by a response peak of physiology. Response peaks were generated from the flat-top window function,30 of which the motivation signal was a standard version, the physiology signal was a time-shifted version (by 40 steps), and the expression signal was an exponentiated version (hence the larger peak). These were the main challenges that we presented to ELSA, requiring the model to handle both delay and nonlinear transformations.
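A rough sketch of how such source signals could be generated (illustrative Python; we substitute a cosine-tapered plateau for the exact flat-top window, read "exponentiated" as exp(x) − 1, and the taper width is our own assumption):

```python
import numpy as np

def tapered_plateau(width=50, taper=10):
    """Stand-in for a flat-top pulse: cosine ramps around a flat top.
    The authors' exact waveform may differ."""
    ramp = 0.5 * (1 - np.cos(np.linspace(0, np.pi, taper)))
    return np.concatenate([ramp, np.ones(width - 2 * taper), ramp[::-1]])

def source_signals(n=300, onset=150, shift=40):
    pulse = tapered_plateau()
    motivation = np.zeros(n)
    motivation[onset:onset + len(pulse)] = pulse
    physiology = np.roll(motivation, shift)   # delayed by 40 steps
    expression = np.exp(motivation) - 1       # nonlinear, hence larger peak
    return motivation, physiology, expression
```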

Figure 4.11. Left panel: Source data for the synthetic dataset. Right panel: subject-modified version of the source data for participant 5.

In addition, we altered the source signals for each participant in order to mimic inter-participant variability, using four types of random modification:

Random amplitude: The height of the flat-top wave peak, manipulated by scaling the basic flat-top wave by a factor drawn from a continuous uniform distribution on [0.5, 1.5].

Random shift: The number of steps by which the motivation and/or physiology series were shifted forward. The shift value was drawn at random and independently from discrete uniform distributions: [1, 20] for motivation and [41, 60] for physiology.

30 We chose this waveform because window functions make it easy to generate waveforms of a desired width.


Random noise: An amount of independent Gaussian noise added to the source signals. The standard deviation of this noise was drawn at random from a continuous uniform distribution on [0.005, 0.05].

Random artefacts: A time series of smaller, randomly distributed flat-top waves that bear no relation to the manipulation or the main signals. In each segment of 25 steps, the probability of a random flat-top wave occurring was 0.2.
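These four modifications can be sketched as follows (illustrative Python; the parameter ranges follow the text, but the artefact width and amplitude are our own placeholder choices):

```python
import numpy as np

def perturb(signal, rng, lo_shift=1, hi_shift=20):
    """Apply random amplitude, shift, and Gaussian noise to one series."""
    amp = rng.uniform(0.5, 1.5)                        # random amplitude
    shift = int(rng.integers(lo_shift, hi_shift + 1))  # random forward shift
    sd = rng.uniform(0.005, 0.05)                      # random noise level
    out = np.roll(amp * signal, shift)
    return out + rng.normal(0.0, sd, size=signal.size)

def artefacts(n, rng, segment=25, p=0.2, width=10, amp=0.3):
    """Irrelevant blips: at most one per 25-step segment, with prob 0.2."""
    out = np.zeros(n)
    for start in range(0, n, segment):
        if rng.random() < p:
            out[start:start + width] += amp
    return out

rng = np.random.default_rng(5)
base = np.zeros(300); base[150:200] = 1.0   # stand-in source pulse
noised = perturb(base, rng) + artefacts(300, rng)
```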

The difference between the source signals and the noised versions generated for participants is depicted for Participant 5 in the right panel of Figure 4.11. Note that the appraisal time series was never altered from the source signal, since it must always represent a fixed vector of design manipulations. The remaining three series were modified to exhibit an amplitude, shift, and noise that differed from the source signal.

A time series model can easily learn to ignore white noise such as the probe variables. A more difficult task is to ignore a non-random signal that is nevertheless not relevant for the target prediction.

This motivated the addition of the random artefacts for each participant, represented by the smaller peaks for Participant 5 in Figure 4.11. Artefacts often occur naturally when recording physiological or motor data. For example, small movements of irrelevant muscles may cross-contaminate a measure of interest (e.g., eye blinks). Overall, we considered our synthetic dataset a plausible approximation of what a real set of emotion recordings might look like.

Data analysis. The data of all participants were concatenated into a data frame with 300 × 40 rows and 7 columns. The motivation, physiology, and expression time series were then centered to have mean 0 and scaled to have standard deviation 1. After scaling, the data were split into a training set (Participants 1 to 20) and a validation set (Participants 21 to 40). At no time were the validation data submitted to ELSA during training.
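A sketch of this preprocessing step (illustrative Python on a random stand-in array; the six response channels are z-scored, with the binary appraisal column kept aside):

```python
import numpy as np

def standardize_and_split(data, steps=300, n_train=20):
    """Z-score each column over the whole concatenated frame, then
    split into training and validation sets by participant blocks."""
    z = (data - data.mean(axis=0)) / data.std(axis=0)
    cut = n_train * steps
    return z[:cut], z[cut:]

rng = np.random.default_rng(0)
data = rng.normal(size=(40 * 300, 6))   # stand-in for the 6 response series
train, valid = standardize_and_split(data)
```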

In order to find a model with the optimal set of control parameters, we performed a grid search on combinations of the following values:

- Interaction degree: none or two-way interactions
- Size of the LSM input weights: [1, 0.5, 0.1, 0.05, 0.01]
- Leaking rate of the LSM: [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
- Eigenliquid factorization method: PCA, SPCA, or ICA

In total, this made for 300 models to be fitted. The remaining control parameters of ELSA were held constant, with wavelet levels 1 through 8 retained, a liquid size of 600 units, an eigenliquid size of 200 units, and ridge regression for prediction. For each model, we logged the correlation between observed and fitted values for the response time series, choosing as optimal the control parameter values that maximized this correlation across all response time series in the training data.
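The grid itself is straightforward to enumerate; a sketch (model fitting omitted, list values copied from the text):

```python
from itertools import product

interaction_degree = ["none", "two-way"]
input_weight_size  = [1, 0.5, 0.1, 0.05, 0.01]
leaking_rate       = [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
factorization      = ["PCA", "SPCA", "ICA"]

grid = list(product(interaction_degree, input_weight_size,
                    leaking_rate, factorization))
# 2 x 5 x 10 x 3 = 300 candidate models
```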

Fit results. The grid search over the control parameters of interest found an optimal ELSA model with interaction degree 2, liquid input weight size 0.1, a leaking rate of 0.2, and independent component analysis (ICA) for the eigenliquid factorization. In a final step, we refitted the best model with an upscaled liquid, consisting of 1000 units, without altering the number of eigenliquid units. The fit of this model is provided in Table 4.2. Before discussing these results further, it is worth commenting on the findings of the initial grid search.

Table 4.2. Correlation between observed and fitted values for ELSA’s predictions on the synthetic data.

Original: fit over the full 6000 training/validation observations. Scrambled: fit over the full 6000 training observations with reordered participants. Averaged: fit for the participant-averaged time series (300 time steps).

                                   Motivation        Physiology        Expression
                                   Probe   Signal    Probe   Signal    Probe   Signal
Training set (1–20), original      NA      0.892     NA      0.791     NA      0.955
Training set (1–20), scrambled     NA      0.889     NA      0.789     NA      0.955
Training set (1–20), averaged      NA      0.962     NA      0.978     NA      0.992
Validation set (21–40), original   NA      0.877     NA      0.501     NA      0.931
Validation set (21–40), averaged   NA      0.958     NA      0.840     NA      0.986

Overall, it appeared that the physiology signal, which represented the delay challenge for ELSA, was the most sensitive to the correct setting of the control parameters, with observed-fitted correlations ranging from r = 0.40 at worst to r = 0.80 at best. For the motivation and expression signals, even the worst correlation was at least r = 0.77. Modelling the association between the motivation and expression signals proved an easy task for the model to learn, even with the large amount of noise that we added to the dataset (probes, residual noise, artefacts). As expected, the fit was better when predicting expression from motivation than vice versa, because exponentiation is not a fully reversible operation.

The size of the liquid input weights proved not very important for the synthetic data problem, with gains in correlation ranging between 0.01 and 0.03. The leaking rate of the LSM only substantially impacted the fit of the physiology signal, with the optimal leaking rate found at 0.2, suggesting the need for a relatively slowly decaying memory to handle the delay challenge in the data. Like the size of the liquid input weights, the interaction degree did not impact the fit of ELSA substantially: gains in correlation for the addition of 2nd-order interactions ranged between 0.01 and 0.04. This suggested that, at least for this particular problem, the LSM was capable of extracting the necessary interaction patterns by itself, without an explicit push from the user. The choice of eigenliquid factorization, finally, showed by far the most dramatic impact on ELSA's fit. Independent component analysis (ICA) was the overall winner, while spherical PCA performed the worst. Not only did the best model of the grid search use ICA, but this eigenliquid factorization also improved ELSA's fit with gains in correlation ranging from 0.03 to 0.06. The downside to ICA's good performance was its computational inefficiency compared to ordinary PCA: 7.3 versus 4.2 minutes to fit the same model, respectively.

Figure 4.12. Top row: Observed and fitted time series per emotion component signal, averaged across participants. Bottom row: observed-fitted moving correlation per 50 time steps per emotion component, averaged across participants.

Table 4.2 provides fit indices for training and validation data, in terms of the correlation between fitted and observed time series. A first remark should address the missing values for the so-called “probes”, which were white noise time series that were added to each response component set.

These correlation values were missing because the ridge regression models for the probes were all dominated by a single large intercept coefficient, with extremely small coefficients for the eigenliquid units (e.g., 1 × 10⁻⁴⁰). As a consequence, the fitted time series were virtually flat, with little to no variance. Correlation cannot be computed when either variable lacks variance, hence the missing values for the probes. Put differently, ELSA correctly identified the lack of association between the inputted signal data and pure white noise. Therefore we will ignore the probes for the remainder of this discussion and focus on the signal data.31
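The guarded correlation described here can be expressed compactly (illustrative Python; the None return mirrors the NA convention of the table):

```python
import numpy as np

def safe_corr(x, y, tol=1e-12):
    """Pearson correlation, or None (NA) when either series is
    essentially flat -- the situation observed for the probe fits."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    if x.std() < tol or y.std() < tol:
        return None
    return float(np.corrcoef(x, y)[0, 1])
```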

The first and fourth rows of Table 4.2 provide the most straightforward indices of training and validation fit, as applied to all 6000 observations of each dataset. For the motivation and expression signals, ELSA's performance was virtually as good on the validation data as on the training data. A larger discrepancy was observed for the physiology signal, with r_train = 0.791 and r_val = 0.501, which suggested that the delay task was quite challenging for the model. However, an inspection of ELSA's fit by participant revealed an interesting result. For some participants, ELSA predicted the signal peak extremely well, with correlations going up to r = 0.883. For those participants who had a low, even negative, correlation between observed and fitted values, however, it appeared that the model had successfully predicted the signal peak but simply too soon or too late. A simple measure of concurrent correlation (i.e., cor[Xt, Yt]) overlooked this relationship where a lagged correlation (e.g., cor[Xt−20, Yt]) would have detected it. This finding led us to reconsider our measure of fit. Rather than looking at ELSA's fit for the entire validation time series of 6000 observations, we averaged observed and fitted values across participants, and calculated the correlation between these participant-averaged time series.
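The distinction between concurrent and lagged correlation can be illustrated with two peaks offset by 20 steps (illustrative Python; Gaussian bumps stand in for the response peaks):

```python
import numpy as np

def lagged_corr(x, y, lag=0):
    """cor[x_{t-lag}, y_t]: align x shifted forward by `lag` with y."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return float(np.corrcoef(x, y)[0, 1])

t = np.arange(300)
observed = np.exp(-0.5 * ((t - 160) / 10.0) ** 2)  # true peak
fitted   = np.exp(-0.5 * ((t - 180) / 10.0) ** 2)  # predicted 20 steps late
```

A model that predicts the peak 20 steps too late looks poor under concurrent correlation but near-perfect once the lag is accounted for.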

Table 4.2 and Figure 4.12 present the results of this participant-averaged analysis.

A final validation test that we submitted to ELSA was to reorder the sequence of participants in the training data, and to calculate ELSA's fit for these “scrambled” data, without rerunning parameter estimation (Table 4.2, second row). This reduced the observed-fitted correlation by at most 0.03.

We concluded that ELSA’s performance did not simply depend on the original ordering of participants.
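Scrambling can be implemented by permuting whole participant blocks while leaving the within-participant time order untouched (illustrative Python):

```python
import numpy as np

def scramble_participants(data, steps=300, seed=0):
    """Reorder participant blocks of `steps` rows each; the rows
    inside each block keep their original time order."""
    blocks = data.reshape(-1, steps, data.shape[1])
    perm = np.random.default_rng(seed).permutation(len(blocks))
    return blocks[perm].reshape(-1, data.shape[1])

data = np.arange(6000.0 * 2).reshape(6000, 2)   # stand-in training frame
scrambled = scramble_participants(data)
```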

Figure 4.13. Cumulative observed-fitted correlation per emotion component signal. Emotion components were added to ELSA in a forward stepwise manner, that is, the component with the largest predictive power first, then the next, etc.

31 In fact, based on these results one might wish to refit ELSA and drop the probes altogether.


Extended diagnostics and synchronization. How did the four emotion components interact and contribute to ELSA's final fit? This question could be answered with a simulated deletion analysis: we deleted connections in ELSA's network component by component, in an experimental fashion, and assessed the relative decrease in fit. The full deletion analysis took 28 minutes to complete. Figure 4.13 summarizes the most important results, depicting the cumulative contribution of each emotion component in ELSA for predicting the motivation, physiology, and expression signals, when added in a forward stepwise fashion. For example, the motivation signal was quite well predicted when using only time series of the expression component as input, accounting for r = 0.830. Adding the physiology and, subsequently, the appraisal time series improved the fit by a mere 0.04 and 0.01, respectively. A similar finding was obtained for the expression signal, with the motivation time series accounting for the largest increase in fit, up to r = 0.807, and the subsequent addition of the physiology and appraisal time series improving the fit only minimally. For the physiology signal, the motivation and expression time series accounted about equally for the final fit, whereas appraisal, again, added no substantive improvement. Overall, the results of the deletion analysis reflected well the associations that we had constructed in the synthetic data.
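The forward stepwise logic can be sketched generically (illustrative Python; the toy score below merely echoes the additive gains reported for the motivation signal and is not the actual deletion analysis):

```python
def forward_stepwise(components, score):
    """Greedily add the component with the largest gain in fit.
    `score(subset)` returns the fit (e.g. observed-fitted r)."""
    chosen, trace = [], []
    remaining = list(components)
    while remaining:
        best = max(remaining, key=lambda c: score(chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
        trace.append((best, score(chosen)))
    return trace

# Hypothetical additive gains, loosely mirroring the text's numbers.
gains = {"expression": 0.83, "physiology": 0.04, "appraisal": 0.01}
trace = forward_stepwise(gains, lambda subset: sum(gains[c] for c in subset))
```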

Figure 4.14. Complete transient synchronization on ELSA’s eigenliquid data for the synthetic dataset, on a moving window of 50 time steps.

A synchronization analysis was conducted for the validation data on a moving window of 50 time steps. We chose this size because 50 steps was the width of the flat-top window signal in our source data. We calculated both partial transient synchronization (moving observed-fitted correlation) and complete transient synchronization (epsilon). As before, we averaged the obtained values across participants. Results for partial transient synchronization are depicted in the bottom row of Figure 4.12. Clearly, the observed and fitted values for all three response signals synchronized more heavily during the peak of the flat-top function. Outside of this range, the correlation oscillated around 0.
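The moving observed-fitted correlation can be computed with a simple sliding window (illustrative Python; window width 50 as in the text):

```python
import numpy as np

def moving_corr(x, y, window=50):
    """Observed-fitted correlation per sliding window of `window` steps."""
    out = []
    for start in range(len(x) - window + 1):
        xs, ys = x[start:start + window], y[start:start + window]
        if xs.std() < 1e-12 or ys.std() < 1e-12:
            out.append(np.nan)   # undefined in flat windows
        else:
            out.append(float(np.corrcoef(xs, ys)[0, 1]))
    return np.asarray(out)

t = np.linspace(0, 20, 300)
r = moving_corr(np.sin(t), np.sin(t))   # identical series: r near 1 everywhere
```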

Results for complete transient synchronization are plotted in Figure 4.14. Again, we first calculated the epsilon value across the full time series of 6000 points and then averaged the obtained values across participants. In addition, we compared the epsilon value between the training and the validation data. A zone of interest was demarcated from time step 150 to 250, as this represented the extent of the emotion component signals, starting with appraisal, at the earliest, and ending with physiology, at the latest. As Figure 4.14 shows, the epsilon measure increased substantially in this region, returning to its baseline activation rather slowly. The degree of synchronization was more or less sustained throughout the zone of interest, despite the variable delay between the three response signals. There appeared to be no meaningful difference between training and validation synchronization for these data. Note that the value of epsilon during the baseline phase (i.e., time steps 1 to 150) of the signal oscillated on average between 0.38 and 0.40. This is much higher than the value of 0 that might be expected for desynchronized signal phases, and that we did observe for the moving cross-correlation measures (Figure 4.12, bottom row). The presence of random artefacts may have contributed somewhat to a baseline inflation of synchronization, but it should also be noted that the theoretical lower bound for epsilon is not actually 0 (see supplementary material). As such, these values need not be abnormal.

We did not create a separate time series for the feeling component of our synthetic data, hence a feeling analysis could not be conducted. However, one might imagine feeling data to possess characteristics somewhat similar to the synchronization curve plotted in Figure 4.14, with a fast onset shortly after stimulus presentation (i.e., emotion elicitation), a period of persistence, and a slow offset.

Discussion. Overall, ELSA performed extremely well on the synthetic data. The model was capable of modelling the three signal time series, one of which was lagged, while handling varying shifts, varying noise, varying amplitude sizes, probe variables, and noise artefacts in the data. Indeed, ELSA's performance was considered outstanding given all of these hurdles. The model was not only capable of extracting the target patterns from the data but also proved fairly robust against overfitting. Recall that the ridge regression implemented in the glmnet package performs an internal cross-validation on the training data (see Section 4.4.4), which means that the MSE or R² of an ELSA model for training data can be regarded as a kind of MSE_CV or R²_CV. For the synthetic data, we verified that the training fit was indeed highly similar to the validation fit, indicating that the cross-validation procedure resulted in satisfactory generalization performance. The analysis of the synthetic data provided useful information on the optimal strategy for fitting an ELSA model. The supplementary material outlines a practical procedure that we recommend for this purpose, and which we largely applied to the simulation that we conducted on the ERP data in this paper. Synthetic data cannot mimic the properties and complexities of real data, however. In the next sections, we apply ELSA to recordings from published appraisal studies.
