
3. CAUSAL INFERENCE AND METHODOLOGY

3.1 CAUSALITY AND LIVING WITH OBSERVATIONAL DATA

3.1.1 THE CONCEPT OF “CAUSAL EFFECT”

The declared objective of this thesis is to examine whether a causal relationship exists between union membership and various dimensions of job, political and other-regarding attitudes. What exactly does it mean to say that “union membership has a causal effect on a particular attitude”? One way to explain why union membership may be seen as a “cause” of an attitude is to find a mechanism: a chain of successive events, each considered the cause of the next, leading from becoming a union member to a change in the attitude considered. This is usually how one begins to postulate a causal relationship. For example, union membership may have a causal impact on job satisfaction because, through the bargaining activity of trade unions, a member may enjoy better objective working conditions than a non-member and thus be more satisfied with his job situation. Although useful on an intuitive level, this strategy does not solve the problem. It only shifts the issue from the relationship between union membership and the attitudes to the relationship between pairs of successive events (union membership being the first one and the attitude the last one). One is then still asked to define the meaning of the causal relationship between each pair of events.

For simplicity, we suppose that the attitude considered is expressed on a numeric scale. Formally, for an individual i, we define the causal effect of union membership on an attitude as the difference between the attitudinal level declared by the individual as a union member and the attitudinal level declared by the individual as a non-member, holding all other conditions fixed:

\[
\Delta \text{attitude}_i = (\text{attitude}_i \mid \text{member}_i) - (\text{attitude}_i \mid \text{non-member}_i) \tag{3.1}
\]

Usually, we are not interested in estimating the causal effect on a single individual, but in determining the average causal effect on the whole population of interest. In our case, the population of interest includes all wage-earners, i.e. the individuals who could potentially become union members. For the average causal effect, equation 3.1 becomes:

\[
E[\Delta \text{attitude}_i] = E[\text{attitude}_i \mid \text{member}_i] - E[\text{attitude}_i \mid \text{non-member}_i] \tag{3.2}
\]

This “average treatment effect” (ATE) represents the causal effect of union membership on the attitudinal level of the “average individual”, the “typical individual” in the population of interest.
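As a minimal sketch of equations 3.1 and 3.2, the following Python snippet computes individual causal effects and their average from a small table of potential outcomes. The four individuals and their attitude scores are invented for illustration only; in real data, just one of the two values is ever observed for each person.

```python
from statistics import mean

# Hypothetical potential outcomes for four wage-earners (invented values):
# the attitude score each person would declare as a member and as a non-member.
potential_outcomes = [
    {"as_member": 7.0, "as_non_member": 6.0},
    {"as_member": 5.0, "as_non_member": 5.0},
    {"as_member": 8.0, "as_non_member": 6.0},
    {"as_member": 4.0, "as_non_member": 5.0},
]

# Equation 3.1: individual causal effect of membership.
individual_effects = [
    person["as_member"] - person["as_non_member"]
    for person in potential_outcomes
]

# Equation 3.2: average causal effect over the (tiny) population.
average_effect = mean(individual_effects)

print(individual_effects)
print(average_effect)
```

In this invented population the individual effects differ in sign and size, so the average effect need not describe any single individual.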

The previous characterization is the so-called “counterfactual” definition of causality. It is a very useful theoretical starting point to conceptualize causality, but it cannot be directly used to “measure” the causal impact of union membership on an attitude. In fact, for each individual, we either observe the attitudinal level in a situation in which he is a union member $(\text{attitude}_i \mid \text{member}_i)$ or the attitudinal level in a situation in which he is not a union member $(\text{attitude}_i \mid \text{non-member}_i)$, but never both at the same time.

The condition “both at the same time” is crucial because the only way to make sure that the two states are compared holding all factors other than union membership fixed (ceteris paribus assumption) would be to go back in time and observe the attitudinal outcome of every individual i in a counterfactual setting. This result is known as the “fundamental problem of causal inference” (Holland 1986). It can be seen as a missing data problem. For the individuals who are members, we would like to observe a counterfactual setting in which they are non-members. For those who are non-members, we would like to observe a counterfactual setting in which they are members. To make this clearer, we can rewrite equation 3.2 by decomposing the average treatment effect into two components, one representing the average treatment effect on the treatment group (those observed as union members) and the other representing the average treatment effect on the control group (those observed as non-members):

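\[
\begin{aligned}
E[\Delta \text{attitude}_i] &= \pi \, E[\Delta \text{attitude}_{i \in \text{member}}] + (1 - \pi) \, E[\Delta \text{attitude}_{i \in \text{non-member}}] \\
&= \pi \left( E[\text{attitude}_{i \in \text{member}} \mid \text{member}_i] - E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i] \right) \\
&\quad + (1 - \pi) \left( E[\text{attitude}_{i \in \text{non-member}} \mid \text{member}_i] - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \right)
\end{aligned}
\tag{3.3}
\]

where the state that follows the membership symbol “∈” is the observed one and π is the proportion of individuals belonging to the treatment group. In words, the average treatment effect is a weighted average of the average treatment effect on the treated and the average treatment effect on the non-treated, with weights given by the share of each group in the population. The two terms in which the observed state differs from the conditioning state, $E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i]$ and $E[\text{attitude}_{i \in \text{non-member}} \mid \text{member}_i]$, are the unobserved ones. The equation thus contains four terms, only two of which are observed. Can we hope to obtain a good estimate of the overall average treatment effect using only the two observed terms? Doing so would lead us to compute an observed average treatment effect as the difference between the average value for the treated in their observed treatment state and the average value for the non-treated in their observed non-treatment state. In equation form, this corresponds to: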

\[
E[\Delta \text{attitude}_{i \in \text{observed}}] = E[\text{attitude}_{i \in \text{member}} \mid \text{member}_i] - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \tag{3.4}
\]

This equation represents the observed average attitudinal difference between union members and non-members. Under which conditions does the observed difference in equation 3.4 equal the true average treatment effect in equation 3.3? To identify them, equation 3.4 can be rewritten, after some algebraic operations, as (Winship and Morgan 1999:667):

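\[
\begin{aligned}
E[\Delta \text{attitude}_{i \in \text{observed}}] &= E[\Delta \text{attitude}_i] \\
&\quad + \left( E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i] - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \right) \\
&\quad + (1 - \pi) \Big( \left( E[\text{attitude}_{i \in \text{member}} \mid \text{member}_i] - E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i] \right) \\
&\qquad\qquad\quad - \left( E[\text{attitude}_{i \in \text{non-member}} \mid \text{member}_i] - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \right) \Big)
\end{aligned}
\tag{3.5}
\]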

where π is the proportion of individuals belonging to the treatment group. In words, the observed average attitudinal difference between the treatment and the control group is equal to the sum of the true average treatment effect and two other terms. These two terms represent the two possible sources of bias that would lead the observed difference between the treatment and the control group to differ from the true, counterfactual average treatment effect. Each of the two biases can be ruled out when specific assumptions are satisfied.
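The algebraic step from equation 3.4 to equation 3.5 can be sketched as follows. This is one possible route, not necessarily the one followed by Winship and Morgan. It assumes that the population average effect is the share-weighted average of the effects in the two groups, and the shorthand symbols ATT, ATC and B are introduced here only for compactness.

```latex
\begin{align*}
\text{ATT} &= E[\text{attitude}_{i \in \text{member}} \mid \text{member}_i]
            - E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i] \\
\text{ATC} &= E[\text{attitude}_{i \in \text{non-member}} \mid \text{member}_i]
            - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \\
B          &= E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i]
            - E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \\[6pt]
% Step 1: add and subtract the unobserved baseline of the treated in (3.4).
E[\Delta \text{attitude}_{i \in \text{observed}}] &= \text{ATT} + B \\
% Step 2: solve the weighted average for ATT.
E[\Delta \text{attitude}_i] &= \pi\,\text{ATT} + (1 - \pi)\,\text{ATC}
  \;\Longrightarrow\;
  \text{ATT} = E[\Delta \text{attitude}_i] + (1 - \pi)(\text{ATT} - \text{ATC}) \\
% Step 3: substitute Step 2 into Step 1.
E[\Delta \text{attitude}_{i \in \text{observed}}] &= E[\Delta \text{attitude}_i] + B + (1 - \pi)(\text{ATT} - \text{ATC})
\end{align*}
```

The last line matches equation 3.5 term by term: B is the first additional term and (1 − π)(ATT − ATC) is the second.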

3.1.2 LIVING WITH OBSERVATIONAL DATA: THE SELECTION BIAS

The first source of bias is usually called “selection bias” (Angrist and Pischke 2009). It corresponds to the “baseline difference” we would find in the outcome variable if we could compare its level in the non-treated state for the treatment group (unobserved parameter) with its level in the same state for the control group (observed parameter). This means that the average treatment effect based on the observed parameters may be biased because the mean level of the outcome variable in the non-treated state for the treatment group may differ from the one we observe for the control group in the non-treated state.

Taking again union membership as the treatment and job satisfaction as the outcome variable, we can formulate a hypothesis (which we will confirm in the fourth chapter) about why this type of bias may affect the estimation of their causal relationship. We suppose that union members, even before joining a union, have a lower average job satisfaction than non-members. In fact, their lower job satisfaction is one of the reasons that may lead them to join a union, in the hope of improving their objective working conditions. If that is true, using the observed parameters to estimate the causal effect of union membership on job satisfaction may lead us to infer a negative impact of union membership even though the true causal effect is positive. To see this, suppose the average job satisfaction level of future union members is 6 before becoming members (unobserved parameter) and 7 after becoming members (observed parameter). Suppose also that the average job satisfaction level of individuals who do not become members is 8 as non-members (observed parameter) and would be 9 if they joined a union (unobserved parameter). In this case, the true causal effect is a net increase of one point in job satisfaction (as can be verified by plugging the four values into equation 3.3). However, if we computed this effect using only the two observed parameters, we would be led to conclude that union membership decreases job satisfaction by one point (as can be verified by plugging the two observed values into equation 3.4).
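The arithmetic of this example can be checked with a few lines of Python. The four averages are the ones given in the text; the share of members is not given, so the value below is an assumption, and the population average is computed as the share-weighted average of the two group effects. Since both groups have the same effect here, the result does not depend on that share.

```python
# Average job satisfaction levels from the example in the text.
members_as_non_members = 6.0      # unobserved
members_as_members = 7.0          # observed
non_members_as_non_members = 8.0  # observed
non_members_as_members = 9.0      # unobserved

# Assumption: share of union members in the population (not given in the text).
share_members = 0.25

# Causal effect within each group.
effect_on_members = members_as_members - members_as_non_members
effect_on_non_members = non_members_as_members - non_members_as_non_members

# Equation 3.3: true average effect, using all four parameters.
true_effect = (
    share_members * effect_on_members
    + (1 - share_members) * effect_on_non_members
)

# Equation 3.4: difference based on the two observed parameters only.
observed_difference = members_as_members - non_members_as_non_members

print(true_effect)          # 1.0
print(observed_difference)  # -1.0
```

The gap of two points between the two results is exactly the baseline difference between the groups in the non-member state (6 against 8).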

Where does the selection bias come from? It is a consequence of the non-random assignment of the treatment variable conditional on the values of the outcome variable. Formally:

\[
E[\text{attitude}_{i \in \text{member}} \mid \text{non-member}_i] \neq E[\text{attitude}_{i \in \text{non-member}} \mid \text{non-member}_i] \tag{3.6}
\]

In other words, the average “baseline difference” between the treatment and the control group arises because the chances of being selected into the treatment group are correlated with the outcome variable. This is a violation of the “conditional independence” assumption. As we will see, the best way to solve this problem is to assign the treatment variable randomly in an experimental setting. However, since an experimental setting is rarely available to the researcher, it is possible to satisfy the conditional independence assumption by controlling for the observable covariates (“selection on observables”) responsible for the correlation between the treatment and the outcome variables. We describe this procedure in sub-section 3.2.1.

3.1.3 LIVING WITH OBSERVATIONAL DATA: THE “HETEROGENEOUS EFFECTS” BIAS

The second source of bias is the product of the proportion of individuals in the control group (1 − π, since π is the proportion in the treatment group) and the difference between two expressions. The first expression represents the average treatment effect on the treatment group and the second one the average treatment effect on the control group.

Hence, this bias appears whenever the average causal effect of the treatment variable differs in the treatment and the control group.
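To isolate this second bias, the following sketch uses invented group means in which members and non-members share the same baseline (so there is no selection bias) but react differently to membership. All numbers, including the share of members, are assumptions chosen for illustration.

```python
# Hypothetical group means (invented for illustration).
members_as_members = 8.0          # observed
members_as_non_members = 6.0      # unobserved
non_members_as_members = 6.5      # unobserved
non_members_as_non_members = 6.0  # observed, same baseline as members

# Assumption: proportion of individuals in the treatment group (pi).
share_members = 0.25

effect_on_members = members_as_members - members_as_non_members
effect_on_non_members = non_members_as_members - non_members_as_non_members

# Population average effect as the share-weighted average of the group effects.
average_effect = (
    share_members * effect_on_members
    + (1 - share_members) * effect_on_non_members
)

# Difference based on the two observed means only.
observed_difference = members_as_members - non_members_as_non_members

# Second bias term: (1 - pi) * (effect on members - effect on non-members).
heterogeneity_bias = (1 - share_members) * (
    effect_on_members - effect_on_non_members
)

print(average_effect, observed_difference, heterogeneity_bias)
```

With equal baselines, the observed difference coincides with the effect on the members themselves, and it exceeds the population average effect by exactly the heterogeneity term.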

To what extent is it reasonable to assume that the average causal effect of union membership on a specific attitude is homogeneous among union members and non-members? If we could bring non-member wage-earners to become union members, would the observed causal impact, if any, be equal to the one observed for actual members? Empirically, this question is untestable, since we cannot observe the causal effect on non-members and compare it with the one we observe for members. However, we can give some theoretical arguments that lead us to believe that the answer is rather negative. We provide two arguments to support this claim, even though others are conceivable. One should also be aware that our reasoning applies differently depending on the particular attitude considered. In the next chapters, we will show that some attitudes are more affected by the “heterogeneous effects” bias than others. The starting point of our arguments is that union membership is shaped by two life domains: the professional and the social sphere.

On the professional side, one important difference between union members and non-members is the economic sector to which they belong. Members belong more often to highly unionized sectors, since becoming a union member is more likely there than in sectors with a low union density. Conversely, non-members are disproportionately likely to belong to weakly unionized sectors. It is also reasonable to suppose that the internal dynamics of unions active in highly unionized sectors differ from those of unions in weakly unionized sectors. These internal dynamics can be considered the key driver of the causal impact on members’ attitudes.

Hence, the differences in internal dynamics between unions active in different sectors may be responsible for a different causal effect of union membership on attitudes in different sectors. This may happen even if we were able to control for the selection bias related to baseline attitudinal differences between individuals belonging to different sectors. In other words, if, holding all other factors fixed as in a counterfactual setting, we were able to measure the causal impact of union membership on a particular attitude in two different sectors for the same individual, we would expect the causal effect measured in the two sectors to differ, because the internal union dynamics leading to the causal impact differ between sectors.

On the social side, we can consider union membership status as an indicator of the social involvement of an individual. On average, a union member is more socially involved than a non-member. Because of this difference in social involvement, even after controlling again for baseline attitudinal differences between individuals, we can expect union membership to have a different causal impact on highly socially involved individuals and on weakly socially involved individuals. In fact, we can expect the effect on weakly socially involved individuals to be greater than the one on highly socially involved individuals. If we could bring a weakly socially involved individual to become a union member, then, since the individual is not exposed to many other social influences, the social dynamics of union membership to which he would be exposed could reasonably be expected to have a high potential attitudinal impact. Conversely, a highly socially involved individual who becomes a union member would be less likely to experience a large attitudinal change, because the union social dynamics would represent just one social dimension added to others. For example, the individual may already be affiliated to other voluntary associations.

Which assumptions would guarantee the absence of a bias coming from the heterogeneous causal impact on the treatment and the control group? We would need to suppose that the average level of the outcome variable is equal for the control and the treatment group for any particular value of the treatment variable. In our case, this means that the average attitudinal level of non-members and members would be the same if we could observe all of them in the member state or in the non-member state (if that is the case, one can easily see that the term related to this bias in equation 3.5 becomes zero). This is called the “unit homogeneity assumption” (King, Keohane, and Verba 1994:91). Unfortunately, as we will see, in most research settings it is not possible to eliminate this source of bias. In general, under certain assumptions, we will be able to control for the selection bias and estimate the average treatment effect on the treatment group, but not the global average treatment effect on the whole population, since the average treatment effect on the control group cannot be estimated. In other words, the estimated causal effect does not concern the average individual in the population under examination, but the average individual in the treatment sub-population. On reflection, this is not such a tragedy. If some individuals would never become union members in the real world, what is the use of estimating the causal effect of union membership on them? Estimating the causal effect on those who actually become members can be seen as more than enough (Winship and Morgan 1999).

However, there is a more subtle point to underline for union membership. It is difficult to predict which kind of individuals would never become union members. Working with data for a particular population over a certain time range, the researcher has access only to the individuals who are union members during that time range in that particular population. For example, a researcher may have access to cross-sectional data for a given country in a particular year or to panel data for a given country over a certain time range. These individuals compose the treatment group. Hence, the “treatment effect on the treated” is estimated on them. But if the unit homogeneity assumption does not hold, one has to be aware that the measured treatment effect on the treated depends on the particular composition of the individuals who are members in the population considered during the time range considered. This is a problem of external validity. For example, if it turns out that union members in Switzerland come mostly from one sector in 2000 and from another sector in 2008, then, assuming a heterogeneous causal effect of union membership on attitudes between these two sectors, the results would show that union membership has a different impact in 2000 and 2008 because the treatment effect on the treated is measured on two different treatment groups. In the next chapters, we will see that this source of bias is almost never taken into account by researchers, who implicitly work with the assumption that the causal impact of union membership is homogeneous across the observed union members and the observed non-members.

We will come back to the composition issue in sub-sections 3.4.4 and 3.4.5, where we specify on which treatment group our results are applicable. We will also show how it is possible to account at least partially for the heterogeneous effects bias by making causal inferences on distinct treatment groups.
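The composition issue can be illustrated with a small numerical sketch. The sector labels, the sector-specific effects and the membership shares below are all invented; the text only posits that effects differ between sectors and that the sector mix of members changes between 2000 and 2008.

```python
# Assumed sector-specific causal effects of membership on an attitude
# (invented values; the text only posits that they differ).
effect_by_sector = {"sector_a": 2.0, "sector_b": 0.5}

# Assumed sector composition of union members in the two survey years.
member_shares = {
    2000: {"sector_a": 0.8, "sector_b": 0.2},
    2008: {"sector_a": 0.2, "sector_b": 0.8},
}


def effect_on_treated(shares):
    """Treatment effect on the treated as a share-weighted mean of sector effects."""
    return sum(
        share * effect_by_sector[sector] for sector, share in shares.items()
    )


effect_by_year = {
    year: effect_on_treated(shares) for year, shares in member_shares.items()
}

for year, effect in effect_by_year.items():
    print(year, round(effect, 2))
```

Although the effect within each sector is held constant, the estimated effect on the treated roughly halves between the two years, purely because the treatment group is composed differently.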

3.1.4 RANDOMIZATION WOULD SOLVE EVERYTHING...

Thinking of causality counterfactually is very useful on a theoretical level, but we need a strategy that allows us to estimate, that is, to measure empirically, the causal effect from actual data on union members. Since going back in time is not yet possible for any researcher, the methodological cornerstone for assessing causality is the randomized experiment. Randomization removes both sources of bias that are generated by using observed parameters to estimate the average treatment effect. Because the treatment is assigned randomly in a randomized experiment, the correlation between the treatment and the outcome variable is excluded and the conditional independence assumption holds. Because the treatment is assigned randomly, the treatment and the control group are equivalent and the unit homogeneity assumption holds too. Randomization solves everything. The only gigantic problem is that a social scientist very rarely has the possibility to actually conduct a randomized experiment.
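The contrast between self-selection and random assignment can be sketched with a small simulation. Everything below is hypothetical: the distribution of job satisfaction, the rule by which dissatisfied workers join a union, and the size of the effects are assumptions made only to show the mechanism, using Python's standard library.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so that the sketch is reproducible

# Hypothetical population: job satisfaction as non-member (y0) and as member (y1).
population = []
for _ in range(50_000):
    y0 = random.gauss(7.0, 1.0)
    # Assumed heterogeneous effect: larger for initially less satisfied workers.
    effect = 1.5 if y0 < 7.0 else 0.5
    population.append((y0, y0 + effect))

true_ate = mean(y1 - y0 for y0, y1 in population)


def observed_difference(assignment):
    """Observed mean of the treated minus observed mean of the controls."""
    treated = [y1 for (y0, y1), t in zip(population, assignment) if t]
    controls = [y0 for (y0, y1), t in zip(population, assignment) if not t]
    return mean(treated) - mean(controls)


# Self-selection: dissatisfied workers (low y0) are the ones who join.
self_selected = [y0 < 7.0 for y0, _ in population]
# Randomization: a fair coin decides, independently of y0 and y1.
randomized = [random.random() < 0.5 for _ in population]

print(f"true average effect:       {true_ate:.2f}")
print(f"observed, self-selection:  {observed_difference(self_selected):.2f}")
print(f"observed, randomization:   {observed_difference(randomized):.2f}")
```

Under these assumptions, the self-selected comparison is far from the true average effect, while the randomized comparison recovers it up to sampling noise, even though the individual effects are heterogeneous.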

What would an ideal randomized experiment look like in our case? We would draw a representative sample of individuals who are not yet members from the Swiss population, measure the attitudinal level of all the individuals in the sample at a given point in time, and randomly assign half of them to the treatment group (making them become union members) and the other half to the control group (leaving them non-members). This experiment is obviously impossible to conduct. Closed shops or compulsory union membership, which existed in the past in some sectors in some countries, may offer something that could be used to imitate a randomized experiment. However, since closed shops have been forbidden in Switzerland since 1925 and since the freedom of union membership and of non-membership is constitutionally guaranteed in Switzerland (Ebbinghaus 2000), these options are not available in our case.

Even if it is not possible to conduct a randomized experiment, it is still very useful to keep in mind the features of such an ideal situation. Why? Because a randomized experiment is always the benchmark used to infer causality in a non-experimental setting (Angrist and Pischke 2009). In other words, even though we cannot run an experiment, we will design a “quasi-experiment” that allows us to make statements about the causal relationship between union membership and attitudes using observational data. Observational data, as opposed to experimental data, are
