
3. CAUSAL INFERENCE AND METHODOLOGY

3.3 CAUSAL INFERENCE WITH PANEL DATA

As already mentioned, in this thesis we use data from the Swiss Household Panel (SHP) between 1999 and 2011 to estimate the causal impact of union membership on the series of attitudes we take into account. In this section, we recast the previous discussion of causality and of the endogenous character of the union membership variable in a panel data setting. We highlight the differences from the cross-sectional case and show the advantages offered by a longitudinal perspective.

3.3.1 CAUSALITY AND ENDOGENEITY WITH PANEL DATA

For our purposes, in a panel data setting, the cross-sectional equation 3.7 can be rewritten as follows:

$$\mathit{attitude}_{it} = \beta\,\mathit{union}_{it} + \nu_i + \mu_{it}, \quad \text{for } i = 1, 2, \dots, N \text{ and } t = 1, 2, \dots, T \qquad (3.14)$$

Two main differences are apparent. First, the variables in 3.14 carry two subscripts (i for the individual and t for the time period). This is the distinctive feature of panel data: the same N individuals are followed for T time periods.

The second difference lies in the presence of two error terms instead of one. The last term, μit, represents all variables not included in the model that affect the dependent variable and vary across individuals and over time. It is called the idiosyncratic or time-varying error. The first error term, νi, corresponds to all variables that affect the dependent variable and vary across individuals, but are fixed over time (hence the absence of the time subscript). It is also called the fixed effect. It includes the constant, observable variables (if they are not included in the model) such as sex, education (if completed) or parents' education, and also unobservable variables such as innate predispositions towards some behaviors or attitudes. Its presence and, most importantly, the fact that we can estimate the parameters we are interested in after having got rid of it, is one of the main advantages panel data offer. The particular type of dependent variables we take into account makes it even more important. In fact, the potential variables that influence an attitude are probably to a large extent composed of unobserved individual characteristics that make the individual inherently more or less inclined towards one extreme of the attitude under examination. As our example in sub-section 3.1.2 shows, this unobserved heterogeneity may be correlated with the union membership variable, thus generating a problem of omitted variable bias. Even when using an IV estimator, considering the variety of variables that can be comprised in this category, it is difficult to rule out a correlation between these variables and the instrument. This makes the assumption of absence of correlation with the error term difficult to justify.

For these reasons, we would like to estimate the causal impact of union membership by excluding these unobserved fixed effects from the error term. By exploiting the time-invariant nature of the fixed effects, it is possible to think of a variety of transformations of equation 3.14 that get rid of them. Here, we adopt the first-differencing procedure, which consists of taking the difference between the equation for the time period t and the same equation at a previous time point (usually t-1). Hence, for a variable xit, first-differences are defined as:

$$\Delta x_{it} \equiv x_{i,t} - x_{i,t-1}, \quad \text{for } i = 1, 2, \dots, N \text{ and } t = 2, \dots, T \qquad (3.15)$$

The t subscript starts from 2 since for the first period there is no (t-1) period to subtract. This procedure is called first-differencing, but in section 3.4 we also describe the usefulness of differences of higher order. We use the differencing transformation because it makes some assumptions we will make on instrumental variables easier to interpret and because it offers some useful properties that we describe in sub-section 3.4.1. Differencing equation 3.14 gives the following first-differenced expression:

$$\Delta \mathit{attitude}_{it} = \beta\,\Delta \mathit{union}_{it} + \Delta\mu_{it}, \quad \text{for } i = 1, 2, \dots, N \text{ and } t = 2, \dots, T \qquad (3.16)$$
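Written out term by term, the step from equation 3.14 to equation 3.16 can be sketched as follows. The only ingredient is that νi carries no time subscript, so it is identical in the equations for t and t-1 and cancels in the subtraction:

```latex
\begin{align*}
\mathit{attitude}_{it} - \mathit{attitude}_{i,t-1}
  &= \beta\,(\mathit{union}_{it} - \mathit{union}_{i,t-1})
     + (\nu_i - \nu_i)
     + (\mu_{it} - \mu_{i,t-1}) \\
  &= \beta\,\Delta \mathit{union}_{it} + \Delta\mu_{it}
\end{align*}
```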

The fixed effects have disappeared from the equation. The β coefficient is the same as in the baseline equation 3.14, but, in equation 3.16, its estimation relies only on the variation of the union membership status experienced by each individual. In other words, the coefficient is estimated only by using the variation within each individual, while the variation across individuals has been excluded through the differencing procedure. We explain a change in the dependent variable through a change in the independent one(s). In such a setting, all variables that do not vary over time are excluded from the estimation. For example, it is not possible to estimate the impact of sex on the attitude under examination. Although there are ways to circumvent this limitation, we do not discuss them because the key independent variable we are interested in, union membership, varies over time. But even for time-varying variables, precise estimates require that a sufficient number of individuals experience a variation. This is the price to pay for eliminating the time-invariant heterogeneity.

We exclude a source of bias, but the estimation through sample data becomes less efficient. In our case, the coefficient of the union membership variable is highly dependent on the number of individuals that experience a transition from “Non-member” to “Member” or vice versa (our analyses are restricted to specific union membership transitions, cf. sub-section 3.4.2).
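The trade-off described above can be illustrated with a small simulation. This is only a sketch on synthetic data, not the SHP analysis: the sample size, the true coefficient and the way membership depends on the fixed effect are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 2000, 5, 0.5  # assumed sample size, periods and true effect

# Time-invariant individual effect nu_i (e.g. an innate predisposition).
nu = rng.normal(size=(N, 1))

# Membership is more likely for individuals with a high nu_i,
# so union_it is correlated with the fixed effect by construction.
union = (nu + rng.normal(size=(N, T)) > 0).astype(float)

# Idiosyncratic error mu_it, independent of membership in this sketch.
mu = rng.normal(size=(N, T))
attitude = beta * union + nu + mu


def ols_slope(x, y):
    """Slope of a bivariate OLS regression of y on x (with intercept)."""
    x = x.ravel() - x.mean()
    y = y.ravel() - y.mean()
    return float(x @ y / (x @ x))


# Pooled OLS on levels ignores nu_i and is biased upwards here.
pooled = ols_slope(union, attitude)

# First differences along the time axis remove nu_i.
d_union = np.diff(union, axis=1)
d_attitude = np.diff(attitude, axis=1)
fd = ols_slope(d_union, d_attitude)

# Only observations with a membership transition identify beta.
share_switching = float((d_union != 0).mean())

print(f"pooled OLS: {pooled:.2f}, first differences: {fd:.2f}")
print(f"share of person-periods with a transition: {share_switching:.2f}")
```

In this setup the first-differenced estimate should land close to the assumed value of 0.5, while the pooled estimate is pushed away from it by the fixed effect. The share of person-periods with a transition shows how much of the sample actually contributes to the differenced estimate.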

Having converted the cross-sectional model into a longitudinal one, the question to be answered is: under which conditions does estimating equation 3.16 through OLS give an unbiased estimate of the causal effect of union membership on the attitude considered? In other words, what is the panel data equivalent of the conditional independence assumption stated in equation 3.8?

On first-differenced data, the conditional independence assumption in a panel data setting is stated as follows (Wooldridge 2010:315–318):

$$cov(\Delta \mathit{union}_{it}, \Delta\mu_{it}) = 0, \quad \text{for all } i, t \qquad (3.17)$$

This means that the variation in the union membership variable has to be uncorrelated with the variation in the idiosyncratic error. Put differently, equation 3.17 holds if the union membership variable is uncorrelated with the idiosyncratic error term at the two points in time involved in the first-differencing transformation (μit and μi,t-1). Hence, this assumption rules out the correlation of union membership with the present or the one-period-ahead value of the idiosyncratic error. In the latter case, it rules out an impact of current changes in the idiosyncratic error on future values of the union membership variable, as would be the case if unionit were a lagged dependent variable.
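To see why this is sufficient, the covariance in equation 3.17 can be expanded using the definition of first differences in equation 3.15 and the bilinearity of the covariance. This is a sketch of the reasoning, not a restatement of the cited source:

```latex
\begin{align*}
cov(\Delta \mathit{union}_{it}, \Delta\mu_{it})
  &= cov(\mathit{union}_{it} - \mathit{union}_{i,t-1},\; \mu_{it} - \mu_{i,t-1}) \\
  &= cov(\mathit{union}_{it}, \mu_{it})
   - cov(\mathit{union}_{it}, \mu_{i,t-1}) \\
  &\quad - cov(\mathit{union}_{i,t-1}, \mu_{it})
   + cov(\mathit{union}_{i,t-1}, \mu_{i,t-1})
\end{align*}
```

The first and last terms are contemporaneous covariances. The two cross terms link membership in one period to the error in the adjacent period. If all four terms are zero, equation 3.17 holds.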

Before describing when this assumption may be violated, it is useful to represent equation 3.16 in a path model (Figure 3.5):

Figure 3.5: Path model representing the effect of union membership on a particular attitude in differenced form

Not surprisingly, assumption 3.17 is violated in two situations (represented by the two bold arrows in figure 3.5) similar to those described for the cross-sectional case in sub-section 3.2.1. First, assumption 3.17 is violated if there are time-varying omitted variables that are correlated with both the variation in the dependent variable and the variation in the union membership variable. For example, considering again job satisfaction as the attitude, an increased education level accompanied by a change of position at the workplace may represent such an omitted variable. Occupying a new position at work can obviously influence job satisfaction and at the same time increase the chances of becoming a union member (because, as we have seen in the previous chapter, in Switzerland, a higher level of education implies a higher probability of being a union member). Second, reverse causality is also an issue. Following the previous example, a change in job satisfaction, as a consequence of the deterioration of working conditions, may lead an individual to become a union member, in the hope of improving his situation through the union's bargaining activity. The solution to these two problems is again the use of an instrumental variable.
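The following sketch illustrates the first violation and the instrumental variable remedy on synthetic first-differenced data. Everything in it is an assumption made for illustration: the instrument z, the omitted time-varying factor c, the coefficients, and the simplification of treating the change in membership as a continuous variable (in real data it takes the values -1, 0 or 1).

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 20_000, 0.5  # assumed number of differenced observations and true effect

z = rng.normal(size=n)  # hypothetical instrument, unrelated to the error
c = rng.normal(size=n)  # omitted time-varying factor

# The omitted factor drives both the change in membership and the error,
# so assumption 3.17 fails by construction.
d_union = 0.8 * z + c + rng.normal(size=n)
d_mu = c + rng.normal(size=n)
d_attitude = beta * d_union + d_mu


def cov(a, b):
    """Sample covariance between two one-dimensional arrays."""
    return float(np.cov(a, b)[0, 1])


# OLS on the differenced equation: slope = cov(x, y) / var(x).
fd_ols = cov(d_union, d_attitude) / cov(d_union, d_union)

# Simple IV estimator with one instrument: cov(z, y) / cov(z, x).
fd_iv = cov(z, d_attitude) / cov(z, d_union)

print(f"OLS on differences: {fd_ols:.2f}, IV on differences: {fd_iv:.2f}")
```

With these assumed parameters the OLS estimate on differences overstates the effect, while the IV estimate should be close to the assumed value of 0.5, because z is correlated with the change in membership but not with the error.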

We have introduced panel data as a means to improve on the analysis in a cross-sectional setting. We see, however, that the two issues that interfere with causal inference in the cross-sectional case also appear in the longitudinal case. Is there, therefore, a real gain from using panel data? Indeed, there is. Even though the union membership variable is still endogenous, having eliminated the fixed effects from the estimation represents a huge improvement in this regard. The correlation between the error term and union membership in the cross-sectional setting could be generated by any kind of variable influencing both the dependent variable and union membership. In equation 3.16, we have excluded a huge proportion of the variables that could potentially bias the estimation. More precisely, we have got rid of all time-invariant factors. Among these factors are all the innate predispositions that can potentially be correlated with an attitude and with union membership. Moreover, it is reasonable to think that the majority of the individual predispositions and objective conditions that could possibly be seen as relevant omitted variables are time-invariant in nature. Most dimensions of an individual's life remain constant over time or do not change very frequently. Our concerns are now restricted solely to time-varying factors, and these are much less numerous than the time-invariant ones. Also, we can suppose that individuals are not pre-programmed to experience some changes, especially if they are already adult wage-earners, as in the population we examine here. If an individual shows a variation from one period to another, it is very likely that such variation is related to external, objective, observable events. Identifying such events is the key to evaluating the endogeneity of a variable in an equation like 3.16. The sources of bias to control for are much less varied. After having chosen an instrument, it is much easier to rule out its correlation with a restricted set of possibly relevant omitted time-varying factors.

3.3.2 FINDING A VALID INSTRUMENT

In a panel data setting, what kind of variables could represent good instruments for union membership as an explanatory variable of the attitudes we will analyze in the following chapters? For each instrument, we should think about its possible direct correlation with each attitude considered. An instrument that works well for a certain attitude is not necessarily valid for other types of attitudes. Although we give a more detailed account in the following chapters, we describe here the basic reasoning behind the choice of the instruments we use subsequently and explain theoretically why they should be valid.

Finding a good instrument is not easy. The best strategy in our case is to think about the processes, the mechanisms that lead an individual to become a union member. As we already noted, union membership can be regarded as an aspect at the intersection of two life domains: the professional sphere and the social sphere.

Regarding the professional sphere, what are the reasons that lead an individual to become a union member? There are many possible explanations for such a choice: a change of job that leads to a workplace where a good proportion of individuals are members, a deterioration of the objective working conditions that leads the individual to see union membership as a possibility to defend his rights, increased contacts with coworkers who are already members, the recruitment activity of unions, and so on. All these events seem more or less random and it is not easy to see how one of them could be used to find an instrument for union membership. The key idea here is that, although these events are random, they become more likely in certain settings. In particular, we can think that an individual has higher chances of becoming a union member if the geographic region or the working sector to which he belongs is characterized by a high union density.

The geographic region should somehow reflect the union traditions of individuals living close to each other and having the chance to exert reciprocal influences. The question is: on which geographic level should we focus? The best choice is a geographic level that is not too broad, so that local traditions can matter. It would also be good if the geographic level coincided with a legislative one, since union membership is usually linked to legislative practices that make union membership more or less favored (even though the freedom of union membership and non-membership is constitutionally guaranteed in Switzerland). In Switzerland, such a level can be represented by the cantons, which are usually not too large, have extensive legislative autonomy and also present strong unitary traditions.

Concerning the working sector, it makes sense to think that the economic sector (defined according to the NOGA classification in Switzerland, which is compatible with the European NACE classification (Swiss Federal Statistical Office 2014)) represents a very important determinant regarding the chances of becoming a union member. In fact, as described in chapter 2, the same unions are usually active in the same sectors across different cantons. The working sector to which one belongs should thus be correlated with the union recruitment activity for new members and also be related to specific common views regarding union membership.

Which kind of instruments should we use given the considerations of the last paragraphs? We could take the canton and the economic sector as instruments, but then we would still have doubts about a possible direct effect of the instrument on the different attitudes, because each canton or sector may show particular attitudinal tendencies that distinguish it from others. Also, as discussed in the next sub-section, the correlation of these instruments with the union membership variable may not be very high (indeed, our preliminary tests show that they would be quite weak instruments). Instead of taking these variables directly as they are, we decide to take the feature we are interested in, i.e. the union density by canton and by sector. The instruments are constructed as an aggregation of the data we have on union membership. This procedure is known from the literature that tries to estimate “peer effects” of an aggregated variable on the same variable at the individual level. As this literature shows (Angrist and Pischke 2009), estimating the causal impact of a variable that is the aggregation of another one is a complicated enterprise. Here, however, we are not interested in estimating the causal impact of aggregated union membership on the individual chances of becoming a union member. We only want to make sure that the aggregated variable is correlated with the individual variable and rule out any direct correlation with the individual attitudes we take as dependent variables. In our setting, this strategy has the big advantage of giving us very strong instruments (as we discuss in the next sub-section). In order to increase the strength of the instruments even more, we also cross the cantonal and sectoral union densities with the individual occupation type (full-time or part-time), since we know that full-time workers have higher chances of union membership than individuals in other working situations. For each year taken into account in the panel data, our first instrument corresponds to the union density computed according to the occupation type and the canton in which an individual lives (it would be better to use the canton in which the individual works, but we do not have information on that; the canton of residence can nevertheless be considered a good proxy for it). For example, a particular value of the instrument is given by the union density in 1999 for full-time workers living in Zurich. The second instrument is constructed in the same way by replacing the canton of residence with the economic sector (NOGA) in which an individual works. In the construction of both instruments, we use cross-sectional weights.
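The construction of the first instrument can be sketched as a weighted group mean. The data and column names below (`year`, `canton`, `fulltime`, `union`, `weight`) are hypothetical and do not correspond to SHP variable names. The sketch also assumes that each individual's own membership status enters the density of his cell, which the text does not specify.

```python
import pandas as pd

# Toy person-year records with made-up values; all column names are hypothetical.
df = pd.DataFrame({
    "year":     [1999, 1999, 1999, 1999, 1999, 1999],
    "canton":   ["ZH", "ZH", "ZH", "ZH", "BE", "BE"],
    "fulltime": [True, True, True, False, True, True],
    "union":    [1, 0, 0, 1, 1, 0],
    "weight":   [1.0, 2.0, 1.0, 1.0, 1.0, 3.0],  # cross-sectional weights
})


def weighted_density(frame, keys):
    """Weighted share of union members within each cell defined by `keys`."""
    tmp = frame.assign(w_union=frame["union"] * frame["weight"])
    sums = tmp.groupby(keys)[["w_union", "weight"]].sum()
    return (sums["w_union"] / sums["weight"]).rename("density")


keys = ["year", "canton", "fulltime"]
density = weighted_density(df, keys)

# Attach the cell-level density to every individual record as the instrument.
df = df.merge(density.reset_index(), on=keys, how="left")
print(df[keys + ["union", "density"]])
```

Replacing `canton` with a sector column in `keys` would give the analogue of the second instrument under the same assumptions.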

Regarding the social dimension of union membership, we can construct a valid instrument for union membership when considering job attitudes as dependent variables. Thinking of unions as one of the many organizations to which an individual can belong, we can suppose that the more associations an individual is a member of, the higher the chance that unions are also among them. Using the variables available in the SHP about membership in different types of associations or organizations (associations of parents, sports or leisure organizations, cultural associations, political parties, associations active in the protection of the environment, associations defending women's rights and associations promoting tenants' rights), we construct a variable that indicates the number of different associations, except unions, to which an individual belongs. This number should be correlated with the union membership variable and show no direct effect on job attitudes (as we will argue in chapter 4).
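The count of associations can be sketched as a row sum over membership indicators that leaves out the union indicator. The column names and values below are hypothetical, and only three of the association types listed above are included to keep the example short.

```python
import pandas as pd

# Toy membership indicators (1 = member); column names are hypothetical.
df = pd.DataFrame({
    "member_parents": [0, 1, 0],
    "member_sports":  [1, 1, 0],
    "member_culture": [0, 1, 0],
    "member_union":   [1, 0, 0],
})

# All association indicators except the union one.
other = [c for c in df.columns
         if c.startswith("member_") and c != "member_union"]

# Number of non-union associations each individual belongs to.
df["n_associations"] = df[other].sum(axis=1)
print(df)
```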

3.3.3 EVALUATING THE QUALITY OF AN INSTRUMENT

As is the case with every instrument, the exogeneity assumption of the instruments we described is never formally testable. Moreover, even with an instrument that actually has no correlation with the error term, a correlation can appear in finite sample data due to sampling error. We cannot exclude that our instruments also show some correlation with the error term.

We can argue theoretically that we are fairly confident about their “almost exogeneity”, and we will give some empirical evidence to support our reasoning in the next chapters, but our arguments are never going to be enough to completely close the door to potential critics. What we can do, on the other hand, is to study what would happen if our instruments did show some correlation with the error term and to understand to what extent the results of the IV estimation would be biased. Referring to the model in differenced form in equation 3.16, it can be shown that the bias of an IV estimator is equal to (Wooldridge 2013:414):

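$$\operatorname{plim}\,\hat{\beta}_{IV} - \beta = \frac{corr(IV, \Delta\mu)}{corr(IV, \Delta \mathit{union})} \cdot \frac{\sigma_{\Delta\mu}}{\sigma_{\Delta \mathit{union}}} \qquad (3.18)$$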

where σΔμ and σΔunion are the population standard deviations of Δμ and Δunion. We see that the bias is large when the correlation between the instrument and the instrumented variable is small.
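A short numerical illustration of equation 3.18 may help. The numbers are made up for illustration and are not estimates from the SHP. The function simply evaluates the right-hand side of the formula.

```python
def iv_asymptotic_bias(corr_iv_error, corr_iv_union, sd_error, sd_union):
    """Right-hand side of equation 3.18: asymptotic bias of the IV estimator."""
    return (corr_iv_error / corr_iv_union) * (sd_error / sd_union)


# Same small correlation with the error term (0.02) in both cases;
# only the correlation with the instrumented variable differs.
strong = iv_asymptotic_bias(0.02, 0.50, sd_error=1.0, sd_union=0.5)
weak = iv_asymptotic_bias(0.02, 0.05, sd_error=1.0, sd_union=0.5)

print(f"bias with corr(IV, d_union) = 0.50: {strong:.2f}")
print(f"bias with corr(IV, d_union) = 0.05: {weak:.2f}")
```

With these assumed values, dividing the correlation between the instrument and the instrumented variable by ten multiplies the bias by ten, even though the correlation with the error term is unchanged.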

This is the so-called problem of “weak” instruments. Even if the instrument is not correlated with the error term in the population, the small correlation that necessarily appears due to sampling error is capable of severely biasing the whole estimation when the instrument is weakly correlated with the instrumented variable. As a rule of thumb, an instrument is said to be strong when the first-stage F-statistic exceeds 10 ([...] chapter 12). Our instruments are indeed very strong, giving a first-stage F-statistic well above 10. With a strong instrument, we can even tolerate the presence of a small correlation with the error term, since the strength of the instrument reduces the bias induced by it. Our preliminary tests also show that, for the two types of union density cited before, using them as instruments in their level form gives a higher correlation with the variation in the union membership status than in differenced form. This is one reason
