

Proposition 1.7.2 shows that the new Robust GMM estimator does not introduce any asymptotic bias into the estimation, retains the consistency of moment selection when the reference distribution is the true one, and has a bounded influence function for both valid and invalid moments under contamination.

1.8 Simulation study

1.8.1 Simultaneous Equations Model (SEM)

The purpose of this section is to show how MSC and RMSC perform under contamination in a linear SEM setting. The model is given by:

$$y_{1i} = \beta_{01} + \beta_{11} x_{11i} + \beta_{21} x_{21i} + \beta_{31} x_{31i} + \beta_{41} y_{2i} + u_{1i},$$
$$y_{2i} = \beta_{02} + \beta_{12} x_{12i} + \beta_{22} y_{1i} + u_{2i}, \qquad (1.19)$$

where $\beta_{01} = \beta_{11} = \beta_{21} = \beta_{31} = \beta_{41} = \beta_{02} = \beta_{12} = 1$ and $\beta_{22} = 2$. Moreover, $u_{1i}$ and $u_{2i}$ follow a bivariate normal distribution with $u_{ji} \sim N(0,1)$, $j = 1,2$, $E(u_{1i}u_{2i}) = 0.5$, and $E(u_{ji}u_{lm}) = 0$, $\forall j, l = 1,2$ and $i \neq m$. On the other hand, $x_{11i}$, $x_{12i}$, $x_{13i}$ and $x_{21i}$ are generated from $N(0,1)$, independently across $i$, with the covariance between any pair of covariates set to 0.5. We assume that we are only interested in estimating the effect of $y_{1i}$ on $y_{2i}$, hence in the estimation of equation (1.19). For this purpose, we can use three instruments that lead to consistent estimates. We also generated two additional potential instruments, $l_1$ and $l_2$, which are correlated with both $u_1$ and $u_2$ and therefore lead to inconsistent estimates. Since the number of possible sets of instruments grows very quickly with the number of potential instruments, we make some assumptions about the available information. We assume that we have perfect knowledge of the functional form of model (1.19). The possible sets of instruments for $y_{1i}$ that we consider are:

$$\tilde{w}_{1i} = [x_{11i} \; x_{13i}], \quad \tilde{w}_{2i} = [x_{12i} \; x_{13i}], \quad \tilde{w}_{3i} = [l_{1i} \; x_{13i}], \quad \tilde{w}_{4i} = [l_{2i} \; x_{13i}],$$
$$\tilde{w}_{5i} = [x_{11i} \; x_{12i} \; x_{13i}], \quad \tilde{w}_{6i} = [x_{11i} \; l_{2i} \; x_{13i}], \quad \tilde{w}_{7i} = [l_{1i} \; x_{12i} \; x_{13i}], \quad \tilde{w}_{8i} = [l_{1i} \; l_{2i} \; x_{13i}].$$

Among the 8 sets of instruments, three, namely $\tilde{w}_{1i}$, $\tilde{w}_{2i}$, and $\tilde{w}_{5i}$, yield consistent estimates.

Moreover, $\tilde{w}_{5i}$ gives rise to the most efficient estimator among the consistent sets. In the benchmark case, we set $E[l_{ji}u_{ki}] = 0.5$, $\forall j, k$. We compute the classical GMM using the moments $E_{P_0}[\tilde{w}_s^T u_2]$ with the asymptotic weight matrix given by $W_s^{opt} = E_{P_0}[u_2^2 \tilde{w}_s \tilde{w}_s^T]^{-1}$, $s = 1, \ldots, 8$. In addition, we compute two RGMM estimators: one with $c_1 = 3$ and $c_2 = \infty$ (RGMM1), and one with $c_1 = 3$ and $c_2 = 7$ (RGMM2). We perform the simulations in R with the optimizer optim under its default settings. We use a 2-stage M-estimator with bounded IF as the initial value for both RGMM1 and RGMM2.
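
Before turning to the results, the data-generating process just described can be sketched in R as follows. The helper name simulate_sem, the construction of the invalid instruments $l_1, l_2$ (chosen here so that $E[l_j u_k] = 0.5$ with unit variance), and the reading of $x_{31i}$ in (1.19) as the covariate $x_{13i}$ appearing in the instrument sets are our own illustrative assumptions; the paper's simulation code is not reproduced here.

```r
## Minimal sketch of the DGP in (1.19); helper names, the construction of the
## invalid instruments l1, l2, and the identification of x31i with x13i are
## illustrative assumptions, not taken from the paper.
library(MASS)  # mvrnorm()

simulate_sem <- function(n) {
  b01 <- b11 <- b21 <- b31 <- b41 <- b02 <- b12 <- 1; b22 <- 2

  ## Exogenous covariates: N(0,1) marginals, pairwise covariance 0.5
  Sx <- matrix(0.5, 4, 4); diag(Sx) <- 1
  X <- mvrnorm(n, mu = rep(0, 4), Sigma = Sx)
  x11 <- X[, 1]; x12 <- X[, 2]; x13 <- X[, 3]; x21 <- X[, 4]

  ## Structural errors: bivariate normal, unit variances, E(u1 u2) = 0.5
  U <- mvrnorm(n, mu = c(0, 0), Sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2))
  u1 <- U[, 1]; u2 <- U[, 2]

  ## Invalid instruments: unit variance, E[l_j u_k] = 0.5 for j, k = 1, 2
  ## (one possible construction)
  l1 <- (u1 + u2) / 3 + rnorm(n, sd = sqrt(2 / 3))
  l2 <- (u1 + u2) / 3 + rnorm(n, sd = sqrt(2 / 3))

  ## Solve the simultaneous system for (y1, y2):
  ##   y1 - b41 * y2 = b01 + b11*x11 + b21*x21 + b31*x13 + u1
  ##   y2 - b22 * y1 = b02 + b12*x12 + u2
  B <- matrix(c(1, -b22, -b41, 1), 2, 2)   # rows: (1, -b41), (-b22, 1)
  rhs1 <- b01 + b11 * x11 + b21 * x21 + b31 * x13 + u1
  rhs2 <- b02 + b12 * x12 + u2
  Y <- cbind(rhs1, rhs2) %*% t(solve(B))

  data.frame(y1 = Y[, 1], y2 = Y[, 2], x11, x12, x13, x21, l1, l2)
}
```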

RGMM1 only ensures boundedness of the IF for valid orthogonality conditions; it is the analogue, in this setting, of the robust GMM developed by [Ronchetti and Trojani, 2001]. RGMM2 ensures boundedness of the IF for both consistent and inconsistent moments. Given $c_1$, $c_2$ was fixed so as to achieve a 90% relative efficiency, in terms of MSE, with respect to the classical GMM in the case of no contamination.
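
For reference, a classical two-step GMM estimator of the second equation in (1.19), based on the moment conditions $E_{P_0}[\tilde{w}_s^T u_2]$ and the weight matrix $E_{P_0}[u_2^2 \tilde{w}_s \tilde{w}_s^T]^{-1}$ given above, can be sketched as follows. This is a generic textbook implementation built on optim, under our own conventions; the RGMM1 and RGMM2 estimators, which additionally bound the IF through $c_1$ and $c_2$, are not reproduced here.

```r
## Two-step GMM for y2 = b02 + b12*x12 + b22*y1 + u2, with W an n x q matrix
## of instruments built from one of the candidate sets w~_s. A generic
## implementation via optim(); the paper's RGMM variants are not shown.
gmm_eq2 <- function(y2, y1, x12, W) {
  u <- function(b) y2 - b[1] - b[2] * x12 - b[3] * y1   # residuals u2(b)
  obj <- function(b, Wt) {
    g <- colMeans(W * u(b))        # sample analogue of E[w~_s * u2]
    drop(t(g) %*% Wt %*% g)        # GMM quadratic form
  }

  ## Step 1: identity weight matrix
  b1 <- optim(c(0, 0, 0), obj, Wt = diag(ncol(W)))$par

  ## Step 2: weight matrix estimating E[u2^2 * w~_s w~_s^T]^{-1}
  res <- u(b1)
  S <- crossprod(W * res) / length(res)
  optim(b1, obj, Wt = solve(S))$par
}
```

For instance, for the set $\tilde{w}_{5i}$ one could take W <- cbind(1, dat$x11, dat$x12, dat$x13), where the added constant accounts for the intercept (again an implementation choice on our part).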

Tables 1.3 to 1.6 report the frequencies with which each procedure chooses the set of moments that maximizes the number of orthogonality conditions that are asymptotically zero, for two different contaminating distributions, two sample sizes (100 and 250) and different amounts of contamination ($\varepsilon$ ranging from 0% to 5%). In the case of no contamination, $\varepsilon = 0$, we observe a similar behavior for MSC, RMSC1 and RMSC2. The consistent procedures choose d0 with probability around 0.93 and 0.96 for $n = 100$ and $n = 250$, respectively. However, as we start adding contamination, the performance of MSC deteriorates, whereas RMSC1 and RMSC2 remain stable. For example, in the worst-case scenario, $y_1 \sim 0.95 P_0 + 0.05 N(0,10)$ and $n = 100$, the probability of choosing d0 drops by 50% from $\varepsilon = 0$ to $\varepsilon = 0.05$ in the classical case, while it remains constant for the robust procedures. Strikingly, the probability of choosing a set of moments that includes invalid moments goes from 0% ($\varepsilon = 0$) to more than 50% ($\varepsilon = 0.05$) for the classical procedures. In contrast, the robust procedures choose a set with invalid moments with only around 5% probability in the 5% contamination scenario. What this simulation setting shows is that, even if the data follow the DGP most of the time, classical procedures are so affected by small departures from the reference distribution that they cannot keep the properties they have in the no-contamination case.

Tables 1.7 and 1.8 show the behavior of MSC and RMSC when the underlying distribution features heavy tails. We simulated data from the same model as in the contamination setup, but this time assumed that $u_{1i}$ and $u_{2i}$ follow a t-distribution with degrees of freedom ranging from 1 to 5, $u_{1i}, u_{2i} \sim t_\nu$ with $\nu = 1, \ldots, 5$, for two different sample sizes ($n = 100$ and $n = 250$). All the other assumptions were kept identical to the previous case. The classical IV estimator is not asymptotically defined when $u_{1i}, u_{2i} \sim t_1$. This was reflected in the simulations, as we experienced some numerical problems with the classical estimator. Interestingly, the RGMM estimator does not suffer from this problem, as it still exists even when $u_{1i}, u_{2i} \sim t_1$. This difference between the two estimators was also reflected in the behavior of MSC and RMSC. Indeed, MSC in general picked d0 with a probability lower than 40% across all procedures for $\nu = 1$. Moreover, the probability of choosing a set of invalid instruments is higher than that of picking d0 when $\nu = 1$. This finding does not hold as $\nu$ increases; in fact, when $\nu = 5$, MSC behaves similarly to the case of normally distributed errors with no contamination. RMSC displays a very steady behavior for all values of $\nu$ considered, thus showing the superiority of RMSC over MSC in the presence of heavy tails without contamination. The latter could be a major advantage in settings where we expect the data to feature heavy tails.
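
The two perturbation schemes used above can be sketched as follows. The reading of $N(0,10)$ as a normal distribution with variance 10 and the use of a multivariate $t$ with the same scale and correlation structure as the Gaussian benchmark are assumptions on our part.

```r
## Perturbation schemes (illustrative sketch).

## (i) epsilon-contamination of y1: each observation is replaced with
##     probability eps by a draw from N(0, 10), read here as variance 10.
contaminate_y1 <- function(y1, eps = 0.05) {
  bad <- runif(length(y1)) < eps
  y1[bad] <- rnorm(sum(bad), mean = 0, sd = sqrt(10))
  y1
}

## (ii) heavy-tailed errors: bivariate t_nu with the same scale/correlation
##      structure as the Gaussian benchmark (for nu <= 2 the covariance does
##      not exist; the scale matrix is kept for comparability).
library(mvtnorm)  # rmvt()
draw_t_errors <- function(n, nu) {
  rmvt(n, sigma = matrix(c(1, 0.5, 0.5, 1), 2, 2), df = nu)
}
```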

From an empirical perspective, the fact that classical Moment Selection Criteria can be drastically perturbed by small contaminations might not be crucial as long as the MSE associated with MSC does not vary much across different contaminations. Thus, we also compute the post-selection MSE of the estimates for MSC, RMSC1 and RMSC2. Tables 1.8 to 1.12 provide the MSE across different scenarios, contaminating schemes and sample sizes. We observe a slightly more efficient behavior of the classical estimators with respect to the robust ones. This is expected, as in theory we give up some efficiency at the model in order to prevent large losses in MSE in the presence of small contamination.
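
For completeness, the post-selection MSE reported in these tables can be computed along the following lines, where select_and_estimate is a hypothetical placeholder for applying a given criterion (MSC, RMSC1 or RMSC2) and re-estimating $\beta_{22}$ on the selected instrument set; it is not a function from the paper.

```r
## Post-selection MSE of beta22 over R Monte Carlo replications (sketch).
## select_and_estimate() is a hypothetical helper that applies one criterion
## (MSC, RMSC1 or RMSC2), picks an instrument set and returns beta22-hat.
post_selection_mse <- function(R = 1000, n = 100, criterion, beta22 = 2) {
  est <- replicate(R, {
    dat <- simulate_sem(n)                 # DGP sketch from above
    select_and_estimate(dat, criterion)
  })
  mean((est - beta22)^2)
}
```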

As we increase $\varepsilon$, the efficiency of the classical estimators drops dramatically, whereas the robust estimators are much less sensitive when the underlying distribution departs from the model. For illustration purposes, consider the MSE of $\beta_{22}$ for MSC-BIC, RMSC1-BIC and RMSC2-BIC. In the case with no contamination, the classical estimator is around 25% more efficient than RMSC1-BIC and RMSC2-BIC. However, once we add some mild contamination (2%), this efficiency advantage disappears: the classical estimator becomes about 4 and 6 times less efficient than RMSC1-BIC and RMSC2-BIC, respectively, for $n = 100$ and $y_1 \sim 0.98 P_0 + 0.02 N(0,10)$. These results clearly show the poor performance of the classical estimators even in the presence of small contaminations.

That said, we do not claim that robust estimators should systematically replace classical ones. Tables 1.13 and 1.14 show the MSE of the estimates associated with MSC and RMSC when $u_{1i}, u_{2i} \sim t_\nu$, $\nu = 1, \ldots, 5$, for two different sample sizes ($n = 100$ and $n = 250$). The behaviors of MSC and RMSC are very different, especially when $\nu = 1$; this is explained by the fact that the classical estimator is not asymptotically defined in that case. RMSC is superior to MSC in almost all cases across all procedures, and there are no significant differences across methods for RMSC. It is clear that robust methods guarantee stability of the statistical procedure and are useful for detecting structures or potential problems in the data. Their use becomes even more important in multidimensional settings, where it is nearly impossible to plot the data to look for possible structures.