Non-random panel attrition : comparison of two alternative estimations


Academic year: 2021

Master's thesis

Patrice Vachon

Master's in Economics

Master of Arts (M.A.)


Summary

Immigrants make up a growing share of the population in Western countries. It is therefore increasingly relevant to study their living conditions and their integration. To this end, the Canadian federal government has set up two surveys: the Longitudinal Survey of Immigrants to Canada (LSIC) and the Longitudinal Immigration Database (IMDB). Because longitudinal databases on immigrants may suffer from greater non-random attrition than those on natives, it is all the more relevant to correct the potential selection bias this can cause. This thesis tests two correction techniques: the application of inverse statistical weights and an unweighted three-equation correction model. In both cases, we correct for labor market participation and labor earnings, but only the three-equation model corrects for non-participation in the panel. We find that the unweighted correction better corrects the bias in labor market participation, but neither technique proved better at correcting the earnings equation.


Abstract

Immigrants are a growing part of the population in Western countries. It is therefore increasingly relevant to study their living conditions and their integration. To do so, the Canadian federal government has developed two datasets: the Longitudinal Survey of Immigrants to Canada (LSIC) and the Longitudinal Immigration Database (IMDB). Non-random attrition is likely higher in immigrant panels than in native ones, so it is relevant to correct for the resulting bias. In this paper, we test two correction techniques: statistical weighting and an unweighted three-equation correction. Both correct for participation in the labor market, but only the unweighted procedure corrects for participation in the panel. We find that the unweighted procedure better corrects the bias in labor market participation, but neither estimator properly corrects the wage rates.


Contents

Summary
Abstract
Contents
List of Tables
Acknowledgements
1 Introduction
1.1 Immigrants’ situation
1.2 Reasons explaining the gap
1.3 Leavers’ profile
1.4 Impacts on the natives’ economy
1.5 Methodological problems
2 Background and data
2.1 Background
2.2 LSIC database
2.3 IMDB database
2.4 LSIC’s weighting
3 Models
3.1 Bellemare’s model
3.2 Weighted model
3.3 Comparison of two alternative estimations
4 Results
4.1 Variables
4.2 Variance-covariance structure
4.3 Earnings/participation equations
4.4 Outmigration equation
4.5 Differences between evaluations
5 Conclusion
A Comparison of the two immigrants point systems in Canada and in Quebec
B Comparison of variable means in the LSIC and the IMDB
Bibliography


List of Tables

4.1 Estimation results obtained with different regression techniques
4.2 Differences between the three techniques applied to the LSIC and the IMDB unweighted estimation
A.1 Point system in Canada except for the province of Quebec in 2013
A.2 Point system in the province of Quebec in 2013


Acknowledgements

I would like to acknowledge the contribution of several people without whom this project would not have come to fruition. I first wish to thank my supervisor, Charles Bellemare, and my co-supervisor, Guy Lacroix, for their trust and unfailing support, which proved essential. Their availability to me was exemplary.

I also wish to thank Julien Gaudreau, Maria Adelaida Lopera, Nicolas-James Clavet, Steeve Marchand, Luc Bissonette, David Gendron, Martin Gravel and Sébastien Gagnon for their invaluable help. I reserve my final thanks for my family, my friends and my partner, who remained a great support throughout my journey.


Chapter 1

Introduction

Panel data pose fundamental methodological problems. In most cases, a significant proportion of individuals leave the panel over time (Wooldridge, 2010). The data used to analyze the dynamics of immigrant integration are especially likely to be contaminated by non-random attrition bias (also called selection bias), because immigrants leave the host country, and the panel with it, in a much larger proportion than natives. The attrition rate seems to vary from one country to another. At least one third of the newcomers moving to the United States will leave within the first decades (Warren and Peck, 1980), more than half of the immigrants to the United Kingdom will leave after six years (Dustmann and Weiss, 2007), and one quarter of the immigrants to Sweden will leave after five years (Edin, Lalonde and Aslund, 2000). Attrition is therefore higher among immigrants than among natives in panel data. Does higher attrition lead to a bias? If so, is the bias on the wages and employment probability of the remaining immigrants positive or negative? Do we overestimate their wages and employment because many of the unemployed or low-wage immigrants have left the country? Many studies have examined this selection bias. In most cases, the immigrants remaining in the panel are positively selected. This has been tested in several ways and for different Western countries: Edin et al. (2000) found a positive bias for Sweden, Bellemare (2003) for West Germany, and Picot and Piraino (2012) for Canada. In most studies, it is the least skilled and least economically integrated immigrants who leave the panel.

1.1 Immigrants’ situation

Before the 1980s, the economic integration of immigrants was not as much of an issue as it is today. Indeed, Borjas (2013), using decennial census data from 1970 to 2010, shows that immigrants from the pre-1980 cohorts earned even more than natives once their economic integration was complete. Immigrants from the 1965-1969 cohort entered the labor market with wages 24% lower than those of natives. After 20 years, they had caught up with natives’ wages, and after 40 years they were earning 20% more than the locals. Today, immigrants no longer reach the locals’ wages. They do not even reach the average wage of natives of the same ethnic group (Borjas, 1994). Borjas (2013) found that the 1985-1989 cohort entered the labor market with wages 33% lower than those of natives. In the late 1990s, the situation seemed to improve. Although still far from the earlier results, the 1995-1999 cohort entered the labor market with a 27% wage gap, but after 10 years the gap was still 27%. Finally, the small improvement of the late 1990s has been reversed: newcomers from the 2005-2009 cohort entered the labor market with, again, a 33% gap.

1.2 Reasons explaining the gap

Coulombe (2012) explains the gap between natives and immigrants by the “quality” of human capital. He measures quality by the capital actually invested in the education of newcomers and by the level of physical capital used while they acquired their work experience. These two measures explain almost the entire increase in the gap between natives and immigrants. In other words, the wage gap between natives and immigrants is mainly explained by the quality of education and experience. Chiswick (2008) showed that the return on education is lower for immigrants, largely because of poor matching in the labor market: he found that two thirds of the smaller education payoff is due to poor matching and that the remaining third is largely due to the limited transferability of human capital.

1.3 Leavers’ profile

Edin et al. (2000) and Dustmann et al. (2007) focus on the profile of “outmigrants”. Newcomers from OECD countries have a larger propensity to leave, and young immigrants are also more likely to leave. Edin et al. found that more than half of Sweden’s Scandinavian newcomers leave, compared with about 30% of OECD immigrants and 9% of non-OECD immigrants. Dustmann et al. found similar results: return migration is particularly high among immigrants from the European Union, America and Australia/New Zealand compared to immigrants from the Indian subcontinent and Africa. Dustmann et al. argued that there are two types of migration: temporary and permanent. These findings lead us to reject the hypothesis that immigration is permanent. In fact, most immigration to the UK is temporary and can be subclassified as circulatory migration, transient migration and contract migration. Circulatory migration is due to seasonal labor demand. Transient migration is composed of immigrants who move to several countries before reaching their final destination. Finally, contract migration is temporary migration determined by a working contract or a limited residence permit. With a simple theoretical model, they showed that there are three main reasons for return migration: a preference for consuming local products, a higher purchasing power in the country of origin, and the accumulation of human capital in the host country, which leads to higher productivity at home.

1.4 Impacts on the natives’ economy

This leads to another problem: what is the effect of immigration on the locals? Do immigrants affect natives’ wages or job opportunities? Do they use more public services? Card (1990) used data from the city of Miami, where numerous Cubans arrived in a short period after Castro allowed them to leave for the US. This massive exodus is called the "Mariel boatlift". At that point, the immigrants accounted for 7% of Miami’s population. He did not find any effect on natives’ wages or unemployment rate. This is also true for natives of the same ethnic group. The only reason why average wages decreased and the unemployment rate increased is that the newcomers themselves had lower wages and were more likely to be unemployed. Hunt (1992) found similar results using data on the large wave of immigration, mainly to the south of France, that followed the independence of Algeria. Indeed, 900,000 immigrants moved to France in 1962, which represented 1.6% of the total French labor force. What is interesting about Hunt’s research is that there was no selection bias among the immigrants who came to France: a large majority of the French living in Algeria decided to leave following its independence. Borjas, Grogger and Hanson (2011) also found that immigration has only a small impact on natives’ wages. However, when the population is subdivided into 32 subgroups, there is a significant impact on both the unemployment rate and wages. This means that a specific group of immigrants (e.g., high-school graduates working full time with five years of experience) can have an impact on the same group of natives. Borjas et al. (2011) showed that US immigrants include a larger proportion of high-school dropouts and of college graduates. Therefore, those two groups of natives are more affected by immigration.

Lacroix, Santarossa and Gagné (2003) showed that the most recent cohort of immigrants had a larger propensity to depend on social welfare in Quebec, regardless of economic conditions, as opposed to natives, whose propensity is largely correlated with economic conditions. The new cohort of immigrants may have a higher probability of depending on social welfare, but there is considerable variation across ethnic groups (Borjas, 1994), family statuses and gender (Lacroix et al., 2003).

1.5 Methodological problems

Picot et al. (2012) used the Canadian Longitudinal Immigration Database (IMDB) and found that the immigrants remaining in a panel are positively selected as well. However, they showed that this bias disappears when we consider only the gap between immigrants and natives. They used three different types of data to compare the attrition bias: a cross-section census, a cross-section panel database and a longitudinal database. Picot et al. (2012) found that the difference in the propensity to leave the panel between the richest and the poorest was similar for immigrants and natives. In other words, although immigrants have a higher propensity to leave the panel regardless of their income, the poorest natives are also more likely to leave the panel, so the selection bias is mostly the same for foreigners as for locals. According to Picot et al., census data can therefore be used to measure the wage gap between natives and newcomers. Yet there is always attrition bias when following the evolution of a cohort over time. Green and Worswick (2012) showed that, since the 1980s, immigrants’ foreign experience no longer provides a payoff in the labor market. This means that a young, inexperienced immigrant receives mostly the same wage as an experienced immigrant. The decline of the experience payoff occurred between 1980 and 1990. Green et al. argued that immigrants’ wages should be compared with those of new entrants to the labor market. The gap between immigrants and natives no longer exists when we compare immigrant new entrants to native new entrants.


Chapter 2

Background and data

2.1 Background

Immigration to Canada is based on a point system composed of six criteria: official language abilities, age, education, work experience, employment already arranged in Canada, and adaptability. Immigration is mostly controlled by the federal government, except for economic immigration to Quebec. Quebec’s provincial Skilled Worker Program is also a point program, but with different criteria and weighting (see Appendix A for details). In Canada, during the 1960s, more than 90% of immigrants were of European origin. Among immigrants who landed between 2001 and 2006, 58% were born in an Asian country. In 2006, 19.8% of the Canadian population was born in another country, which is one of the highest rates in the world and the highest rate in Canada since 1931, when 22.2% of the population was foreign-born. Immigrants live mostly in cities: 93.7% of them were established in urban regions, as opposed to 67.5% of the whole Canadian population. With such high rates, the federal government created two databases in order to study this phenomenon: the Longitudinal Survey of Immigrants to Canada (LSIC) and the Longitudinal Immigration Database (IMDB). Both are used in this paper.

2.2 LSIC database

LSIC is essentially a panel focusing on immigrants’ integration. It covers the first four years after the immigrant’s arrival, a period considered crucial for a successful integration. LSIC considers only one cohort: immigrants who were at least 15 years old at the time of their arrival and who arrived between October 1st, 2000 and September 30th, 2001. There are three waves in this panel: 2001, 2003 and 2005. In 2001, the cross-section had 12,040 observations. This paper only considers male immigrants of the skilled workers category aged between 25 and 59. We restrict the sample to males because female participation may vary with culture. We keep only immigrants of the skilled workers category because it can be argued that most refugees and immigrants from other categories did not voluntarily choose to immigrate. Furthermore, we exclude immigrants who live in the Atlantic provinces and in the territories because there are too few of them and they might have special characteristics. Finally, we keep only immigrants aged between 25 and 59 because our focus is on work. After these restrictions, 3,150 observations are left.

As many as 24.6% of the immigrants leave the panel between the first and the second waves, and 17.6% of those remaining leave between the second and the third waves. This gives an attrition rate of 37.9% between 2001 and 2005.
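The cumulative figure follows from compounding the two wave-to-wave rates, assuming the second rate applies to the immigrants still present in the second wave. A quick check using only the rates quoted above (the variable names are illustrative):

```python
# Wave-to-wave attrition rates quoted in the text.
attrition_w1_w2 = 0.246  # share leaving between the first and second waves
attrition_w2_w3 = 0.176  # share of the remaining sample leaving before the third wave

# Share of the 2001 sample still present in 2005.
retention = (1 - attrition_w1_w2) * (1 - attrition_w2_w3)

# Cumulative attrition between 2001 and 2005.
cumulative_attrition = 1 - retention
print(f"{cumulative_attrition:.1%}")  # prints 37.9%
```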

2.3 IMDB database

IMDB is a reliable source of information, as it includes personal data from tax returns and from the landing file of each immigrant. The landing file contains the information needed for the point system. The data are useful for the study of outmigration because completing the tax return and the landing file is mandatory. The database contains information on every immigrant since 1980. Just as with the LSIC, we restrict our analysis to males of the skilled workers category aged 25 to 59 who arrived between October 1st, 2000 and September 30th, 2001. This allows us to verify the quality of the LSIC data. However, there are several problems with the IMDB.

First, it only contains information on annual income, not on wages or working hours. Furthermore, we do not know if the person returned to school or retired. There are 45,680 immigrants who fit the above-mentioned category. The attrition rate is much lower in this database: only 7% between 2001 and 2006.

Such a low attrition rate is surprising and out of line with the literature. A few reasons may explain the gap. First, the point system in Canada can be an obstacle for the less motivated immigrants, so only the most motivated immigrate to Canada. Second, the international mobility of Canadians is reduced (e.g., compared to European countries) by the fact that no labor mobility agreement exists with the US.

2.4 LSIC’s weighting

Statistics Canada recommends using weights with the LSIC database for every observation in each panel wave. The weighting technique is complex. Statcan (2013) summarizes it as follows:

The LSIC weighting strategy is based on a series of cascading adjustments. The final longitudinal weight is obtained by applying various adjustments to the initial weight. There are four weights involved in the weighting process which will compose the final weight; the initial weight, the non-response adjustment weight, the unresolved adjustment weight and finally the post-stratification weight.

This paper evaluates the recommended weighting. Wooldridge (2002) argues in favor of unweighted samples. He recommends using Heckman’s (1978) correction, which leads to a more efficient estimator. We consider both strategies and compare the results.
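To make the idea of cascading adjustments concrete, here is a minimal sketch. It assumes that each adjustment enters as a multiplicative factor on the initial weight, which the quoted summary does not state explicitly, and every number below is hypothetical rather than a Statistics Canada figure.

```python
from math import prod

def final_longitudinal_weight(initial_weight, adjustments):
    """Apply successive multiplicative adjustments to an initial design weight.

    Assumption: adjustments are multiplicative factors; the source only says
    they are applied in cascade to the initial weight.
    """
    return initial_weight * prod(adjustments)

# Hypothetical values, for illustration only.
initial_weight = 12.0
adjustments = [
    1.25,  # non-response adjustment
    1.10,  # unresolved adjustment
    0.95,  # post-stratification adjustment
]
weight = final_longitudinal_weight(initial_weight, adjustments)
print(weight)
```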


Chapter 3

Models

3.1 Bellemare’s model

In this paper, we use Bellemare’s (2003) outmigration model. Bellemare specified a three-equation model for the wage, the probability of employment and the probability of outmigration. In this paper, we replace the wage by earnings. The immigrant’s earnings, $w_{it}$, can be expressed using the following equation:

\[ \ln w_{it} = x_{it}'\beta + \eta_i^1 + \epsilon_{1it} \tag{3.1} \]

The labor market participation propensity, $p^*_{it}$, can be expressed as a latent unobserved equation:

\[ p^*_{it} = z_{it}'\theta + \eta_i^2 + \epsilon_{2it} \tag{3.2} \]

Finally, we assume that the unobserved propensity of outmigration, $r^*_{it}$, is generated by this equation:

\[ r^*_{it} = s_{it}'\gamma + \eta_i^3 + \epsilon_{3it}, \tag{3.3} \]

where $x_{it}$, $z_{it}$ and $s_{it}$ are vectors of individual characteristics and $\beta$, $\theta$ and $\gamma$ are the vectors of unknown parameters to estimate. We do not observe the propensities of employment or return migration. We only know whether the immigrant works ($p_{it} = 1$) or not ($p_{it} = 0$) and whether he has left ($r_{it} = 1$) or not ($r_{it} = 0$) the database. We assume that $\epsilon_{1it}$, $\epsilon_{2it}$ and $\epsilon_{3it}$ are normally distributed error terms (only the upper triangle of the symmetric matrix is shown):

\[
\begin{pmatrix} \epsilon_{1it} \\ \epsilon_{2it} \\ \epsilon_{3it} \end{pmatrix}
\sim N\left(
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
\sigma_w^2 & \rho_{1,2}\sigma_w & \rho_{1,3}\sigma_w \\
 & 1 & \rho_{2,3} \\
 & & 1
\end{pmatrix}
\right) \tag{3.4}
\]

$\sigma_w^2$ is the variance of the log earnings. The other parameters, $\rho_{1,2}$, $\rho_{1,3}$ and $\rho_{2,3}$, are correlation coefficients between the error terms.

$\eta_i^1$, $\eta_i^2$ and $\eta_i^3$ are the unobservable characteristics of the individuals (also called the random effects). $\eta_i^1$ and $\eta_i^2$ are the individuals’ unobservable abilities on the labor market. $\eta_i^3$ is the unobservable immigrants’ attachment to their countries. We further assume that $\eta_i^1$, $\eta_i^2$ and $\eta_i^3$ are known by the immigrants, but not by the analyst. We assume they are normally distributed:

\[
\begin{pmatrix} \eta_i^1 \\ \eta_i^2 \\ \eta_i^3 \end{pmatrix}
\sim N\left(
\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}
\sigma_{\eta_1}^2 & \rho^{\eta}_{1,2}\sigma_{\eta_1}\sigma_{\eta_2} & \rho^{\eta}_{1,3}\sigma_{\eta_1}\sigma_{\eta_3} \\
 & \sigma_{\eta_2}^2 & \rho^{\eta}_{2,3}\sigma_{\eta_2}\sigma_{\eta_3} \\
 & & \sigma_{\eta_3}^2
\end{pmatrix}
\right) \tag{3.5}
\]

$\sigma_{\eta_j}^2$ is the variance of the unobservable characteristics in equation $j$. Ceteris paribus, a positive and significant $\rho^{\eta}_{1,2}$ means that an immigrant with unobservable characteristics that lead to a higher wage will be more likely to take part in the labor market. $\rho^{\eta}_{1,3}$ and $\rho^{\eta}_{2,3}$ capture the return migration bias. A negative correlation implies that immigrants with unobservable characteristics that lead to lower earnings (or a lower participation propensity) are more likely to leave the panel. The sign of those parameters determines the direction of the selection bias due to panel attrition.
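The structure of equations (3.1) to (3.5) can be illustrated by simulating one individual over three waves. This is only a sketch under stated assumptions: the index values, variances and correlations are hypothetical, the latent propensities are assumed to switch the observed outcomes at a threshold of zero, and the helper names are not from the thesis.

```python
import math
import random

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def draw_normal(lower, rng):
    """One draw from N(0, L L') given the Cholesky factor L."""
    z = [rng.gauss(0.0, 1.0) for _ in lower]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(lower))]

def shock_covariance(sigma_w, rho12, rho13, rho23):
    """Covariance matrix of (eps1, eps2, eps3) as in equation (3.4)."""
    return [
        [sigma_w ** 2, rho12 * sigma_w, rho13 * sigma_w],
        [rho12 * sigma_w, 1.0, rho23],
        [rho13 * sigma_w, rho23, 1.0],
    ]

def simulate_observation(index_w, index_p, index_r, eta, eps):
    """Return (log earnings, participation, outmigration) for one person-period.

    index_w, index_p, index_r stand for x'beta, z'theta and s'gamma.
    Assumption: p = 1 when p* > 0 and r = 1 when r* > 0.
    """
    log_w = index_w + eta[0] + eps[0]
    p = 1 if index_p + eta[1] + eps[1] > 0 else 0
    r = 1 if index_r + eta[2] + eps[2] > 0 else 0
    return log_w, p, r

rng = random.Random(0)
eps_chol = cholesky(shock_covariance(sigma_w=0.9, rho12=-0.2, rho13=-0.1, rho23=0.1))
# Hypothetical covariance matrix of the random effects, as in equation (3.5).
eta_chol = cholesky([[0.25, 0.10, -0.05], [0.10, 0.25, -0.15], [-0.05, -0.15, 0.25]])
eta = draw_normal(eta_chol, rng)       # drawn once per individual
for wave in range(3):
    eps = draw_normal(eps_chol, rng)   # drawn again in every period
    print(wave, simulate_observation(10.0, 0.5, -1.0, eta, eps))
```

In actual data, log earnings would only be observed for person-periods with p = 1, which is what makes the joint treatment of the three equations necessary.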

We do not know with certainty whether an immigrant has really left the country. This is particularly true when the data have a high rate of attrition. When we have precise data, such as administrative data, we know who has left and who has not. In this case, the log-likelihood of this model is:

\[
\begin{aligned}
L = {} & \sum_{p_{it}=1,\; r_{it}=1} \ln\left[ f(w_{it}) \Pr(p_{it}=1, r_{it}=1 \mid w_{it}) \right]
 + \sum_{p_{it}=1,\; r_{it}=0} \ln\left[ f(w_{it}) \Pr(p_{it}=1, r_{it}=0 \mid w_{it}) \right] \\
 & + \sum_{p_{it}=0,\; r_{it}=1} \ln \Pr(p_{it}=0, r_{it}=1)
 + \sum_{p_{it}=0,\; r_{it}=0} \ln \Pr(p_{it}=0, r_{it}=0) \\
 \equiv {} & \sum \ln f(w_{it}, p_{it}, r_{it}, \eta^1, \eta^2, \eta^3)
\end{aligned} \tag{3.6}
\]

To solve this problem numerically, Bellemare suggests approximating the integral over the unobservable random effects by a simulated mean. We draw $R$ different triplets $[\eta^1, \eta^2, \eta^3]$ from a trivariate normal distribution with the variance-covariance matrix in equation (3.5). $N$ is the number of observations. The simulated log-likelihood is given by:

\[ \frac{1}{N} \sum_{i=1}^{N} \ln\left[ \frac{1}{R} \sum_{r=1}^{R} f(\eta^{1,r}, \eta^{2,r}, \eta^{3,r}) \right] \tag{3.7} \]

If $R$ is small, the estimator is inconsistent, but as $R$ goes to infinity the parameter estimates are consistent. Train (2003) suggests choosing $R$ so that $\sqrt{N}/R \to 0$. He also suggests using Halton sequences to reduce the simulation noise.
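The simulation step in equation (3.7) can be sketched as follows. This is not the estimation code used in the thesis: `individual_likelihood` is a hypothetical callback standing for the likelihood of one individual given a draw of the random effects, the Halton bases (2, 3, 5) and the number of skipped initial points are conventional choices assumed here, and the same draws are reused for every individual for simplicity.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Radical-inverse value of a positive integer index in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result

def halton_normal_draws(n_draws, bases=(2, 3, 5), skip=10):
    """Quasi-random standard-normal triplets built from Halton sequences."""
    inv_cdf = NormalDist().inv_cdf
    return [
        [inv_cdf(halton(i, b)) for b in bases]
        for i in range(skip + 1, skip + n_draws + 1)
    ]

def simulated_log_likelihood(individual_likelihood, data, eta_chol, draws):
    """Average over individuals of ln[(1/R) * sum_r f_i(eta_r)], as in (3.7).

    eta_chol is the lower Cholesky factor of the matrix in equation (3.5);
    it turns independent standard-normal draws into correlated random effects.
    """
    total = 0.0
    for obs in data:
        simulated = 0.0
        for z in draws:
            eta = [sum(eta_chol[i][k] * z[k] for k in range(i + 1)) for i in range(3)]
            simulated += individual_likelihood(obs, eta)
        total += math.log(simulated / len(draws))
    return total / len(data)
```

A call would look like `simulated_log_likelihood(f, data, eta_chol, halton_normal_draws(R))`, with `R` chosen to grow faster than the square root of the number of observations, in line with the rule quoted above.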


3.2 Weighted model

As mentioned previously, the weighted model is the standard Heckman correction applied to the earnings equation and the labor market participation equation. We use the same equations for $\ln w_{it}$ and $p^*_{it}$, except that the variance-covariance matrices of the unobservable heterogeneity and of the stochastic shocks are 2x2. Furthermore, we introduce Statistics Canada’s weight, $\kappa_{it}$, in the log-likelihood:

\[ L = \sum_{p_{it}=1} \kappa_{it} \ln\left[ f(w_{it}) \Pr(p_{it}=1 \mid w_{it}) \right] + \sum_{p_{it}=0} \kappa_{it} \ln \Pr(p_{it}=0) \tag{3.8} \]

We compare the two models applied to the LSIC database. The estimates of each model are compared to the estimates obtained with the unweighted model applied to the IMDB. Intuitively, if the remaining observations within a specific group have mostly the same characteristics as the others within the same group, the weight corrects the selection bias. Moreover, statistical weights are useful when a database contains oversampled groups.

3.3 Comparison of two alternative estimations

It is relevant to compare both models to test whether it is appropriate to use statistical weights. Is Statistics Canada right to require the use of statistical weights? It is true that statistical weights can address some confidentiality concerns, especially for cross-tabulations. This argument is irrelevant when a regression is used. Both models are likely to transmit errors between equations. This problem can be worse in the three-equation model because the correction of the bias is linked with the other equations.

Finally, we use a standard probit/OLS model as a benchmark. This model without correction helps us gauge the magnitude of the bias. Additionally, it suffers neither from the exclusion restriction problem nor from the transmission of errors between equations.


Chapter 4

Results

4.1 Variables

We use standard variables to model the earnings and participation equations. We use the birth region, arranged employment, a university diploma dummy and age as control variables. Provincial dummy variables are also included. The Quebec dummy captures the effect of having a different point system. We capture global macroeconomic fluctuations by adding standard time dummies to the model. Furthermore, because the immigrants in our panel nearly all arrived at the same time, the time dummies also control for the immigrants’ economic integration. The number of months before the first interview (or the first tax return) is used as an exclusion restriction in the participation equation. This variable is only used for the first wave because the time dummies capture the integration effect in the following waves. It affects the participation equation because the probability of working increases with the time since arrival. Furthermore, this variable is exogenous to the model, which is a prerequisite for a good exclusion restriction. It is especially important in the first few months after arrival because this is when the immigrant develops his social network. We do not include experience because the variable is missing from the IMDB database. This is relatively inconsequential because Green et al. (2011) found that experience does not contribute much to wages. For the outmigration equation, we use essentially the same variables. Dustmann et al. (2007) found that the birth region matters a lot for the propensity to leave. We use the province of residence as well. As an exclusion restriction, we use the fact that the immigrant had already migrated before. We derive this variable from two other variables: it equals one when the immigrant’s country of last permanent residence differs from his country of birth. This variable is a good exclusion restriction because someone who has already migrated might be more mobile than others. Also, this variable is exogenous to the model. Dustmann et al. (2007) note that an immigrant might migrate a few times before reaching his final destination. Finally, since the IMDB database has a large number of observations, nearly all the estimates are significant.
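The previous-migration dummy used as an exclusion restriction in the outmigration equation can be built as in the short sketch below. The field names and the sample records are hypothetical and do not correspond to actual LSIC or IMDB variable names.

```python
def previous_migration(birth_country, last_residence_country):
    """Return 1 if the country of last permanent residence differs from the birth country."""
    return int(birth_country != last_residence_country)

# Hypothetical records, for illustration only.
records = [
    {"birth_country": "India", "last_residence_country": "India"},
    {"birth_country": "India", "last_residence_country": "United Arab Emirates"},
]
for record in records:
    record["migration"] = previous_migration(
        record["birth_country"], record["last_residence_country"]
    )
print([record["migration"] for record in records])  # prints [0, 1]
```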

4.2 Variance-covariance structure

As mentioned previously, the LSIC suffers from large attrition. As its attrition rate is much larger than that of the IMDB, we can reasonably suppose that a significant proportion of the LSIC’s leavers are still in the country. Thus, we analyze the IMDB’s results first, and then compare the two correction strategies using the LSIC’s results. All the results from sections 4.2, 4.3 and 4.4 are reported in Table 4.1. In our model, there are two variance-covariance structures: one for the stochastic shocks and the other for the unobserved time-invariant heterogeneity, as shown in the table. We obtain a significant and negative correlation of -23.9% between the wage and participation error terms. This result is similar to Bellemare’s. The correlations of the earnings and participation shocks with the outmigration shock are -14.7% and -8.0% respectively.

The other variance-covariance structure is made of the correlation coefficients of the unobservable time-invariant heterogeneity. It quantifies the effect of the immigrant’s unobservable characteristics on earnings, on labor market participation and on the propensity to leave the host country. We obtain a correlation of 62.9% between earnings and participation, -33.8% between earnings and outmigration, and -93.8% between the participation and outmigration unobservable characteristics. These results mean that there is a strong correlation between the unobservable abilities that affect earnings and those that affect the participation propensity. There is also a near-perfect negative correlation between the labor market participation and outmigration unobservables.

4.3 Earnings/participation equations

We use a dummy for a university diploma. Immigrants with a university degree represent over 80% of the sample in both databases. The parameter is only 0.03 in the earnings equation. Since earnings are expressed in logarithmic form, a coefficient of 0.03 means a difference of only about 3% in earnings between university graduates and the others. This might be surprising, but less so when we consider the point system through which immigrants were selected, because we focus only on skilled workers. If an immigrant had zero points on the education criterion (less than a high school diploma) or five points (only a high school diploma), he needed a good score on most of the remaining criteria (see Appendix A). In addition, university graduates are less likely to participate in the labor market.
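Reading a dummy coefficient in a log-earnings equation as a percentage is an approximation that works well for small values. The exact conversion is exp(coefficient) - 1, sketched below; the function name is illustrative.

```python
import math

def log_points_to_percent(coefficient):
    """Exact percentage change implied by a dummy coefficient in a log-linear model."""
    return (math.exp(coefficient) - 1.0) * 100.0

# University diploma coefficient in the earnings equation.
print(round(log_points_to_percent(0.03), 2))  # prints 3.05
```

For a coefficient of 0.03 the exact figure, about 3.05%, is almost identical to the approximation. For larger coefficients the two diverge: a coefficient of -0.60, for instance, corresponds to a difference of roughly -45% rather than -60%.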

The birth region dummies divide the world into four regions: Europe & North America (the benchmark), Asia (except the Middle East) & Oceania, Middle East & Africa, and South America. In the wage equation, all the dummies are negative and significant. We obtain nearly the same result in the participation equation: all the dummies are negative and significant except the one for South America. The results are particularly worrying for the wage-equation dummies for Asia-Oceania (-0.58) and Africa-Middle East (-0.50). One of the main reasons explaining this result is that we do not control for the origin of the diploma (the information is not available in the IMDB). Those dummies might capture part of this phenomenon. For Coulombe (2012), the diploma’s origin matters a lot: it explains nearly all the difference in economic performance between OECD and non-OECD immigrants to Canada.

Table 4.1: Estimation results obtained with different regression techniques

| Variables | IMDB OLS/Probit | IMDB Unweighted | LSIC OLS/Probit | LSIC Unweighted | LSIC Weighted |
|---|---|---|---|---|---|
| Earnings | | | | | |
| University diploma | 0.04*** | 0.03** | 0.03 | 0.04 | -0.01 |
| Quebec | -0.64*** | -0.60*** | -0.37*** | -0.34*** | -0.26*** |
| Prairies | 0.12*** | 0.14*** | 0.07 | 0.07 | 0.17*** |
| British Columbia | -0.13*** | -0.15*** | -0.18*** | -0.16*** | -0.19*** |
| Asia & Oceania | -0.56*** | -0.58*** | -0.26*** | -0.27*** | -0.32*** |
| Africa & Middle East | -0.42*** | -0.50*** | -0.08 | -0.08 | -0.24*** |
| South America | -0.22*** | -0.21*** | 0.04 | 0.02 | -0.18*** |
| Job at the arrival | 0.64*** | 0.67*** | 0.84*** | 0.82*** | 0.79*** |
| Age (÷70) | -0.53*** | -0.92*** | -0.25 | -0.25 | -0.32*** |
| Variance | 0.84*** | 0.84*** | 0.66*** | 0.67*** | 0.46*** |
| Participation | | | | | |
| University diploma | -0.53*** | -0.18*** | -0.21*** | -0.24** | -0.06* |
| Quebec | -0.58*** | -0.50*** | -0.68*** | -0.68*** | -0.87*** |
| Prairies | 0.60*** | 0.54*** | 0.21** | 0.27** | -0.26*** |
| British Columbia | -0.39*** | -0.36*** | -0.26*** | -0.29*** | -0.34*** |
| Asia & Oceania | -0.34*** | -0.35*** | -0.33*** | -0.37*** | -0.48*** |
| Africa & Middle East | -0.62*** | -0.63*** | -0.36*** | -0.42*** | -0.40*** |
| South America | 0.09** | 0.09** | 0.12 | 0.09 | 0.46*** |
| Job at the arrival | 0.30*** | 0.32*** | 0.47*** | 0.60*** | 0.72*** |
| Age (÷70) | -2.85*** | -2.56*** | -1.49 | -1.53 | -3.56*** |
| Months/first interview (÷12) | 1.51*** | 1.60*** | 0.18 | 1.05** | 1.66*** |
| Outmigration | | | | | |
| University diploma | 0.08*** | 0.09*** | 1.13** | 0.44** | - |
| Quebec | 0.05*** | 0.05*** | -0.11* | -0.41* | - |
| Prairies | -0.14*** | -0.12*** | -0.32*** | -1.06*** | - |
| British Columbia | >-0.01 | >-0.01 | -0.08 | -0.23 | - |
| Asia & Oceania | -0.11*** | -0.14*** | 0.10** | 0.32* | - |
| Africa & Middle East | -0.04** | -0.05** | -0.04 | -0.09 | - |
| South America | -0.08** | -0.08** | 0.06 | 0.31 | - |
| Job at the arrival | 0.11** | 0.10* | 0.06 | 0.06 | - |
| Age (÷70) | -1.18*** | -1.26*** | -0.73*** | -2.48*** | - |
| Migration | 0.02 | -0.03 | 0.23*** | 0.68*** | - |
| Correlation | | | | | |
| $\rho_{1,2}$ | - | -0.24*** | - | -0.32** | -0.24*** |
| $\rho_{1,3}$ | - | -0.15*** | - | 0.03 | - |
| $\rho_{2,3}$ | - | 0.08*** | - | -0.55*** | - |
| $\rho^{\eta}_{1,2}$ | - | 0.63*** | - | 0.34*** | 0.24*** |
| $\rho^{\eta}_{1,3}$ | - | -0.34*** | - | -0.18*** | - |
| $\rho^{\eta}_{2,3}$ | - | -0.94*** | - | -0.82*** | - |

P-value: ***: < 0.01, **: ]0.01, 0.05], *: ]0.05, 0.10]

Even though few immigrants have an arranged job at arrival, having one seems to have a large effect on labor market performance. We obtain a positive and significant estimate in both equations. Labor market participation is, unsurprisingly, higher (0.32), and immigrants with an arranged job also earn a higher wage on average (0.67).

We divide Canada into four regional subdivisions: Quebec, Ontario, the Prairies (Manitoba, Saskatchewan and Alberta) and British Columbia. Recall that we do not include the Atlantic provinces (Newfoundland, New Brunswick, Nova Scotia and Prince Edward Island) or the territories (Yukon, Northwest Territories and Nunavut). The benchmark is the most populous province, Ontario, which is also the province that receives the most immigrants (about 60%). All the regional dummies have significant estimates. Quebec and British Columbia have negative estimates and the Prairies have a positive one, both for the wage and for labor market participation. This is not surprising, since Ontario is the most industrialized region and the Prairies had a petroleum boom (and a very low unemployment rate) during those years. One of the surprising results is the size of the estimate for Quebec (-0.60), which indicates a strong regional difference. The price level is the classic argument used to downplay the fact that natives have lower wages in Quebec, but it cannot explain a gap this large. As we mentioned before, Quebec has a different point system. Such a large gap is likely explained by the fact that Quebec's point system selects a different type of immigrant, who performs less well on the labor market than immigrants in the rest of Canada.

In addition, age has a significant and negative estimate in both equations. This means that the older an immigrant is at arrival in Canada, the less he will earn and the more likely he is to be unemployed. We can reasonably suppose that older immigrants have more experience. This finding therefore supports our previous claim: foreign experience does not seem to contribute much in the Canadian labor market (Green et al. (2011)). As expected, the number of months before the first interview has a positive effect on the probability of working. This shows how important the first months after arrival are.
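Because age enters the regressions divided by 70, the age estimates are easier to read once rescaled. The worked arithmetic below is only a sketch: it assumes that the benchmark estimates are those of the second column of Table 4.1 (-0.92 in the wage equation and -2.56 in the participation equation), and it describes the change in each equation's linear index, not a change in dollars or in probability.

```latex
\begin{align*}
\Delta\,\text{index} &= \hat{\beta}_{\text{age}} \times \frac{\Delta\,\text{age}}{70} \\
\text{wage equation, arriving 10 years older:} \quad & -0.92 \times \frac{10}{70} \approx -0.13 \\
\text{participation equation, arriving 10 years older:} \quad & -2.56 \times \frac{10}{70} \approx -0.37
\end{align*}
```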

4.4 Outmigration equation

This equation is at the heart of the paper, since the unweighted correction strategy is based on it. In the IMDB, we can assume that in most cases someone who leaves the panel also leaves the country. For the outmigration equation, we use essentially the same variables as in the earnings equation, except that we add a "mobility" dummy. Unfortunately, the parameter of this identifying variable is not significant. This issue is discussed in the conclusion.

The university diploma dummy has a positive parameter. Intuitively, university graduates are generally more mobile, so they are more likely to outmigrate. In addition, all the birth region dummies are negative. This means that immigrants from the benchmark region (Europe/USA) are more likely to outmigrate than those from the other regions. This is exactly what Dustmann et al. (2007) found: immigrants from the OECD leave in larger proportions. The provincial dummies are positive for Quebec, negative for the Prairies and insignificant for British Columbia. The "job at arrival" dummy has a positive estimate. This result might seem counter-intuitive, but two points should be noted. First, the parameter is only significant at the 10% level. Second, immigrants with an arranged job might be more likely to hold a fixed-term working contract. Dustmann et al. (2007) classify contract immigrants as temporary, and most of them leave once the contract is over. This is probably the most plausible explanation for the estimate. Finally, our continuous variable, age, has a negative estimate: older immigrants are less likely to migrate again.
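To make the signs discussed above more concrete, the following sketch turns a few outmigration estimates into a probability of leaving through a probit-style index. It is an illustration only, under several assumptions: the coefficients are read from the second column of Table 4.1 (taken here to be the benchmark IMDB estimates), the intercept is a hypothetical value because the constant is not reported in this section, and the random effects, time dummies and cross-equation correlations of the three-equation model are ignored. The profile is an immigrant born in the benchmark region (Europe/USA) and living in the benchmark province (Ontario).

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Outmigration estimates read from Table 4.1 (second column, assumed benchmark).
BETA = {
    "university": 0.09,
    "job_at_arrival": 0.10,
    "age_over_70": -1.26,  # age enters the model divided by 70
}

# HYPOTHETICAL intercept: the constant is not reported in this section.
INTERCEPT = -1.5


def outmigration_probability(university: int, job_at_arrival: int, age: float) -> float:
    """Probability of leaving implied by the simplified index (illustration only)."""
    index = (
        INTERCEPT
        + BETA["university"] * university
        + BETA["job_at_arrival"] * job_at_arrival
        + BETA["age_over_70"] * age / 70.0
    )
    return normal_cdf(index)


# Same profile at two ages of arrival: the older immigrant is less likely to leave.
p_young = outmigration_probability(university=1, job_at_arrival=0, age=25)
p_old = outmigration_probability(university=1, job_at_arrival=0, age=45)
print(f"age 25: {p_young:.4f}, age 45: {p_old:.4f}")
```

The probability levels depend entirely on the hypothetical intercept; only the direction of the comparisons (older versus younger, graduate versus non-graduate) reflects the reported estimates.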

4.5 Differences between evaluations

As we have reported before, this paper compares the two strategies of section 3 for correcting the partial observability bias. We can argue that the IMDB results are less biased. We therefore set those results as the benchmark and compare the estimates of the other techniques with them: the closer the estimates of a technique are to the benchmark, the less biased we suppose they are. We also compare them with the standard OLS/probit regressions, which provide no selection bias correction. We do not compare the constant and the time dummies because the interviews do not take place at the same time. In Table 4.2, we compare the three regression techniques applied to the LSIC with the IMDB unweighted results. We analyze the differences in the parameter values of the earnings equation and the participation equation between the three techniques. In addition, we only compare the outmigration results with the probit results, because the weighted model does not have an outmigration equation.
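The comparison described above rests on testing whether the difference between two estimates is significant. The text does not state which test was used, so the following is only a sketch of one common choice, a z-test that assumes the two estimates come from independent samples. The standard errors are hypothetical (none are reported here), the sign convention (IMDB minus LSIC) is an assumption, and the star thresholds follow the legend of the tables.

```python
from math import erf, sqrt


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value of a standard normal test statistic."""
    cdf = 0.5 * (1.0 + erf(abs(z) / sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


def stars(p: float) -> str:
    """Significance stars as in the table legend.

    The legend leaves p == 0.01 exactly unassigned; it is placed in '**' here.
    """
    if p < 0.01:
        return "***"
    if p <= 0.05:
        return "**"
    if p <= 0.10:
        return "*"
    return ""


def difference_test(b_imdb: float, se_imdb: float, b_lsic: float, se_lsic: float):
    """Difference between two estimates, assuming independent samples."""
    diff = b_imdb - b_lsic  # assumed sign convention
    se_diff = sqrt(se_imdb ** 2 + se_lsic ** 2)
    p = two_sided_p_value(diff / se_diff)
    return diff, p, stars(p)


# Quebec, participation equation: point estimates from Table 4.1
# (IMDB unweighted -0.50, LSIC unweighted -0.68).
# The standard errors 0.02 and 0.12 are HYPOTHETICAL.
diff, p, flag = difference_test(-0.50, 0.02, -0.68, 0.12)
print(f"difference = {diff:.2f}, p = {p:.3f}, stars = '{flag}'")
```

With a much larger standard error on the LSIC side, as one might expect from its smaller sample, a difference of this size would not be flagged as significant; different standard errors would of course give a different answer.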

There is not much difference between the three regression techniques in the earnings equation. For each strategy, only three of the nine differences are insignificant. We can conclude that there is no large difference between the strategies for the earnings equation. There are many more differences in the participation equation. Eight of the ten differences are insignificant for the unweighted model, two of ten for the weighted model and five of ten for the standard unweighted probit model. The unweighted three-equation model therefore seems to correct most of the bias in the participation equation. The weighted model does no better than the probit model in terms of insignificant differences. With those results, we can conclude, like Wooldridge (2002), that the weighted model is not a "panacea".

We can compare the probit model with the three-equation model on the outmigration equation. There is not a large difference between the two: there are six insignificant differences in Bellemare's model and five in the standard probit. We cannot really draw any conclusion about the differences between correlation coefficients. Only two correlation parameters are involved in more than one technique: ρη1,2 and ρ1,2. The differences are insignificant for ρ1,2 and significant for ρη1,2 in both cases.

Table 4.2: Differences between the three techniques applied to the LSIC and the IMDB unweighted estimation

Variable | Unweighted | Weighted | OLS/Probit
Earning equation
University diploma | 0.01 | -0.04 | <0.01
Quebec | -0.26*** | 0.34*** | -0.22***
Prairies | 0.07 | 0.03 | 0.07
British Columbia | 0.02 | -0.04* | 0.04
Asia & Oceania | -0.31*** | 0.26*** | -0.32***
Africa & Middle East | -0.43*** | 0.27*** | -0.42***
South America | -0.22*** | 0.03 | -0.24**
Job at the arrival | -0.15* | 0.12* | -0.17**
Age | -0.67*** | -0.60*** | -0.68**
Working equation
University diploma | 0.06 | 0.12** | 0.03
Quebec | 0.18 | -0.37*** | 0.19*
Prairies | 0.27** | -0.80*** | 0.33***
British Columbia | -0.07 | -0.02* | -0.10
Asia & Oceania | 0.03 | -0.14** | -0.01
Africa & Middle East | -0.21 | 0.23*** | -0.27**
South America | >-0.01 | 0.37*** | -0.03
Job at the arrival | -0.29 | 0.41*** | -0.16
Age | -1.03*** | -1.00*** | -1.07***
Months before the first interview | 0.55 | 0.07 | 1.41***
Outmigration equation
University diploma | -0.34 | - | -0.04
Quebec | 0.46** | - | -0.04
Prairies | 0.94*** | - | 0.20**
British Columbia | 0.23 | - | 0.08
Asia & Oceania | -0.46*** | - | -0.24***
Africa & Middle East | 0.05 | - | >-0.01
South America | -0.39 | - | -0.14
Job at the arrival | 0.04 | - | 0.04
Age | 1.22 | - | -0.53*
Mobility | -0.71*** | - | -0.26***
Correlation
ρ1,2 | 0.08 | <0.01 | -
ρ1,3 | 0.11 | - | -
ρ2,3 | 0.63*** | - | -
ρη1,2 | 0.29*** | 0.39*** | -
ρη1,3 | -0.16*** | - | -
ρη2,3 | -0.12*** | - | -
P-value: ***: < 0.01, **: ]0.01, 0.05], *: ]0.05, 0.10]

Chapter 5

Conclusion

Immigration is an important issue in most Western countries. This is why it is more relevant than ever to have appropriate data to study this group. It is essential to know the real situation of immigrants: whether they work and whether they are integrating economically into the host society. That is why information from immigration panels will gain increasing importance. In every panel, however, we might face non-random attrition, which creates a bias.

In this paper, we compare a weighted correction model with Bellemare's model, which is an (unweighted) three-equation model. Our results show that the unweighted model leads to a better overall correction of the non-random attrition bias. No technique clearly outperforms the others in correcting the earnings equation, but this is not the case for the labor market participation equation, where the unweighted model is the most successful at correcting the bias in the participation propensity.

The unweighted model is all the more relevant because, most of the time, no weights are available. With this model, we do not need any information about the proportion of each subgroup in the population. This is the biggest advantage of the three-equation model. In future research, it might be relevant to test it with a non-immigrant panel. In that case, we would be able to relax the hypothesis that subjects who leave the panel are leaving the country, and the results might differ. Furthermore, it could be relevant to relax the normality assumption on the error terms and the random effects.

Our results suggest that our model performs better than the weighted regression imposed by Statistics Canada. Moreover, the standard probit seems to perform better than the weighted model. Weighting may be useful for tables and averages, but it does not seem to perform very well in multivariate regressions with non-random attrition. This is particularly problematic because Statistics Canada imposes weighted regressions on the users of its databases. For our purpose, we needed an exemption to use the unweighted results.
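The remark that weighting may be useful for averages can be illustrated with a small simulation. This is a sketch under strong assumptions that are not those of the thesis: attrition depends only on an observed characteristic (a hypothetical diploma dummy) and the retention probabilities are known exactly, which is the favorable case for inverse probability weights. When attrition is correlated with unobservables, as studied in this paper, such weights cannot be expected to remove the bias. All parameter values below are made up for the illustration.

```python
import random


def simulate(n: int = 20000, seed: int = 0):
    """Compare the naive and the inverse-probability-weighted mean among stayers."""
    rng = random.Random(seed)
    full_sample = []
    stayers = []  # (outcome, weight) pairs for respondents who remain in the panel
    for _ in range(n):
        diploma = 1 if rng.random() < 0.5 else 0
        outcome = 1.0 + 2.0 * diploma + rng.gauss(0.0, 1.0)
        # Graduates leave the panel more often (hypothetical retention rates).
        p_stay = 0.3 if diploma else 0.9
        full_sample.append(outcome)
        if rng.random() < p_stay:
            stayers.append((outcome, 1.0 / p_stay))
    true_mean = sum(full_sample) / n
    naive_mean = sum(y for y, _ in stayers) / len(stayers)
    ipw_mean = sum(y * w for y, w in stayers) / sum(w for _, w in stayers)
    return true_mean, naive_mean, ipw_mean


true_mean, naive_mean, ipw_mean = simulate()
print(f"full sample: {true_mean:.2f}, stayers (naive): {naive_mean:.2f}, "
      f"stayers (weighted): {ipw_mean:.2f}")
```

In this setup the unweighted mean among stayers is pulled toward the group that stays more often, while the weighted mean stays close to the full-sample mean.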

The exclusion restriction in the outmigration equation is significant in the comparison regressions, but not in the "benchmark" regression. A weak exclusion restriction might be problematic: it could lead to identification and collinearity problems (Puhani (2000)) in our main regression. The large sample used in the main regression might reduce the severity of the problem. This issue needs to be investigated, especially in the case of a three-equation model with a large sample.

One of the surprising results that might need more explanation and research is the Quebec earnings gap. It could be interesting to introduce a linguistic factor. The different point systems could also be tested to see which criteria are the most relevant for selecting immigrants. The point system changes every few years, and the question is more important than ever. It would be relevant to run the same regression after every change in the point system, to see how each cohort evolves in the Canadian labor market and how each change affects the type of immigrants selected and their ability to perform in the Canadian labor market.

Appendix A

Comparison of the two immigrant point systems in Canada and in Quebec in 2013

The two point systems are similar, except that the Quebec one is slightly more complex. In the Canadian system, points are given for language skills, education, experience, age, arranged employment and adaptability. The Quebec system has the same basis, but it also examines the composition of the immigrant's household: there are points for the spouse's characteristics and for children. Furthermore, there is an evaluation of financial capacity, which leads to automatic disqualification if failed. Finally, in the Quebec system, adaptability is evaluated through an exam.

Table A.1: Point system in Canada except for the province of Quebec in 2013
Characteristics and points of the immigration points system in Canada

Characteristics | Description | Number of points | Total max.
English/French skills | Ability to listen/speak/read/write | Up to 6 pts per category, plus 4 for another official language | 28
Education | Level of education achieved | 25 pts for a Ph.D., 23 for a master, 21 for a bachelor, 5 for high school | 25
Experience | How many years of experience | 9 pts for 1 year, 11 for 2-3 years, 13 for 4-5 years and 15 for 6+ | 15
Age | Points based on your age | 0 pts under 18, 12 for 18-35, then [12 − (age − 35)] pts, min: 0 | 12
Arranged employment | If you have an arranged job at your arrival | 10 pts if you have a job, 0 otherwise | 10
Adaptability | Other characteristics that ease integration | 10 pts for working experience in Canada, 5 for spouse language level, etc. | 10

Table A.2: Point system in the province of Quebec in 2013
Characteristics and points of the immigration points system in Quebec

Characteristics | Description | Number of points | Total max.
French/English skills | Ability to listen/speak/read/write | Up to 16 pts for French, plus 6 pts for English | 22
Education | Level of education achieved | Up to 12 pts for the diploma's level and up to 16 for the domain | 28
Experience | How many years of experience | 4 pts for 6-23 months, 6 for 24-47 months and 8 for 48+ months | 8
Age | Points based on your age | 16 for 18-35, then [16 − 2 ∗ (age − 35)] pts, min: 0 | 16
Arranged employment | If you have a job at your arrival | 10 pts if you have a job, 0 otherwise | 10
Past experience in Quebec | If you have ever worked in Quebec before | 5 pts if studied or worked in Quebec before, 3 pts for families | 8
Spouse's characteristics (if married) | | 3 pts for education, 4 for the domain, 3 for the age, 6 for the French/English skills | 16
Employability pass mark | (single) 42/92, (married) 50/108 | |
Children | | 0-12 years old: 4 pts per child; 13-21 years old: 2 pts per child | 8
Financial capacities | (automatically disqualified if not met) | | 1
Pre-exam pass mark | (single) 49/101, (married) 57/117 | |
Adaptability exam | Evaluation of the immigrant's ability to integrate easily | | 6
Final pass mark | (single) 55/107, (married) 63/123 | |
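The age rules of the two systems are the only ones stated as formulas in Tables A.1 and A.2, so they can be written out directly. The sketch below follows those formulas as printed; the function names are illustrative, and the treatment of applicants under 18 in the Quebec system is an assumption, since Table A.2 does not mention them.

```python
def age_points_canada(age: int) -> int:
    """Age points in the Canadian system (Table A.1)."""
    if age < 18:
        return 0
    if age <= 35:
        return 12
    # One point is lost per year above 35, with a floor of zero.
    return max(0, 12 - (age - 35))


def age_points_quebec(age: int) -> int:
    """Age points in the Quebec system (Table A.2)."""
    if age < 18:
        return 0  # ASSUMPTION: Table A.2 does not list a value under 18
    if age <= 35:
        return 16
    # Two points are lost per year above 35, with a floor of zero.
    return max(0, 16 - 2 * (age - 35))


for age in (30, 40, 45):
    print(age, age_points_canada(age), age_points_quebec(age))
```

Under these formulas, the Quebec system starts from a higher maximum but penalizes each year above 35 twice as fast, so its age points reach zero at 43, against 47 in the Canadian system.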

Appendix B

Comparison of variable means in the LSIC and the IMDB

This table is essential to verify whether our results seem plausible. Of course, the three-equation model is far more complex than a calculation of means, but it is important to verify the impact of the selection bias on the LSIC database. As we explained in Chapter 2, we can reasonably suppose that the IMDB means are closer to reality. Statistics Canada only gives access to the weighted LSIC means. First, Quebec immigrants seem to be oversampled in the LSIC. This is somewhat surprising, since one of the main goals of the weighting is to correct for regional oversampling. The Prairies' economic boom seems to attract many immigrants, since in both databases the proportion of immigrants who live in those provinces is growing.

Immigrants from Western countries and from Africa-Middle East seem to be slightly under-sampled in the LSIC, and those from Asia are oversampled.

Unsurprisingly, the proportion of immigrants who work grows over time, and attrition is larger in the LSIC database.

Wages seem to be slightly overestimated in the LSIC for immigrants who live in Quebec and in Ontario, and underestimated for those who live in British Columbia and in the Prairies.

Table B.1: Variable in each panel in every wave

Variables | IMDB T=1 | IMDB T=2 | IMDB T=3 | IMDB T=4 | IMDB T=5 | LSIC T=1 | LSIC T=2 | LSIC T=3
Quebec | 16.74% | 16.96% | 17.03% | 16.83% | 16.25% | 18.30% | 19.56% | 18.67%
Ontario | 62.10% | 61.58% | 61.15% | 60.99% | 60.94% | 60.89% | 58.08% | 58.43%
Prairies | 7.26% | 7.79% | 8.15% | 8.38% | 8.89% | 7.51% | 8.79% | 9.66%
British Columbia | 13.91% | 13.67% | 13.67% | 13.80% | 13.92% | 13.31% | 13.56% | 13.24%
University diploma | 80.91% | 80.88% | 80.84% | 80.82% | 80.77% | 86.46% | 85.74% | 84.97%
Job at arrival | 1.41% | 1.42% | 1.42% | 1.41% | 1.41% | 10.18% | 10.22% | 10.07%
Mobility | 13.44% | 13.47% | 13.49% | 13.47% | 13.37% | 18.24% | 16.47% | 15.90%
World region
USA-Europe | 19.20% | 19.12% | 19.00% | 18.92% | 18.84% | 17.34% | 17.49% | 17.40%
Asia-Oceania | 56.61% | 56.61% | 56.63% | 56.68% | 56.91% | 65.16% | 64.53% | 64.71%
Africa-Middle East | 19.58% | 19.66% | 19.73% | 19.75% | 19.61% | 13.43% | 13.84% | 13.77%
South America | 4.61% | 4.62% | 4.61% | 4.62% | 4.63% | 4.08% | 4.15% | 4.12%
Worked last year | 60.95% | 71.03% | 71.54% | 72.74% | 73.35% | 48.40% | 74.22% | 78.93%
Work – Quebec | 49.84% | 60.17% | 63.62% | 68.57% | 70.92% | 35.68% | 59.60% | 69.87%
Work – Ontario | 63.78% | 73.91% | 73.87% | 74.06% | 74.02% | 51.16% | 78.43% | 82.54%
Work – Prairies | 75.72% | 84.47% | 83.56% | 85.11% | 85.88% | 62.46% | 82.21% | 83.32%
Work – BC | 53.97% | 63.96% | 63.92% | 64.51% | 65.21% | 45.33% | 72.09% | 72.56%
Left the panel | 1.34% | 1.49% | 1.62% | 2.37% | 3.22% | 24.58% | 17.58% | N/A
Quit – Quebec | 1.50% | 1.77% | 1.92% | 3.47% | 3.90% | 18.17% | 19.00% | N/A
Quit – Ontario | 1.34% | 1.48% | 1.58% | 2.18% | 3.27% | 27.48% | 18.35% | N/A
Quit – Prairies | 1.06% | 1.00% | 1.24% | 1.78% | 2.24% | 17.98% | 10.59% | N/A
Quit – BC | 1.34% | 1.46% | 1.65% | 2.24% | 2.86% | 23.87% | 16.77% | N/A
Average wage | 17181 | 27231 | 31642 | 36154 | 40971 | 12220 | 34882 | 42212
Wage – Quebec | 15864 | 22048 | 25086 | 28024 | 31641 | 10998 | 31286 | 36048
Wage – Ontario | 16797 | 27719 | 32554 | 37638 | 42212 | 12249 | 35812 | 43499
Wage – Prairies | 21978 | 33623 | 37417 | 43378 | 51118 | 15177 | 40895 | 49905
Wage – BC | 17160 | 25928 | 30552 | 33373 | 38108 | 11097 | 30393 | 37670
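The oversampling remarks of this appendix can be checked with simple arithmetic on the regional shares of Table B.1. The sketch below compares the first wave of each panel; this pairing is an assumption made for illustration, since the waves of the two panels are not observed at the same dates.

```python
# Regional shares (in %) at T=1, copied from Table B.1.
IMDB_T1 = {"Quebec": 16.74, "Ontario": 62.10, "Prairies": 7.26, "British Columbia": 13.91}
LSIC_T1 = {"Quebec": 18.30, "Ontario": 60.89, "Prairies": 7.51, "British Columbia": 13.31}

# Gap in percentage points: positive means the region weighs more in the LSIC.
gaps = {region: round(LSIC_T1[region] - IMDB_T1[region], 2) for region in IMDB_T1}

for region, gap in gaps.items():
    label = "larger share in LSIC" if gap > 0 else "smaller share in LSIC"
    print(f"{region}: {gap:+.2f} points ({label})")
```

The largest positive gap is for Quebec, which is the pattern the text describes as oversampling.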

Bibliography

[1] Bellemare, C. 2004. Identification and estimation of economic models of outmigration using panel attrition. IZA Discussion Paper No. 1065.

[2] Borjas, G. J. 1994. The economics of immigration. Journal of Economic Literature, American Economic Association, vol. 32(4), 1667-1717.

[3] Borjas, G. J. 2000. The economic progress of immigrants. Economics of Immigration, National Bureau of Economic Research, 15-50.

[4] Borjas, G. J., Grogger, J., & Hanson, G. H. 2011. Substitution between immigrants, natives, and skills groups. NBER Working Papers 17461, National Bureau of Economic Research.

[5] Borjas, G. J. 2013. The slowdown in the economic assimilation of immigrants: aging and cohort effects revisited again. NBER Working Papers 19116, National Bureau of Economic Research.

[6] Card, D. 1990. The impact of the Mariel boatlift on the Miami labor market. Industrial and Labor Relations Review, vol. 43(2), 245-257.

[7] Chiswick, B. R., & Miller, P. W. 2008. Why is the payoff to schooling smaller for immigrants? Labour Economics, vol. 15(6), 1317-1340.

[8] Coulombe, S., Grenier, G., & Nadeau, S. 2012. Human Capital Quality and the Immigrant Wage Gap. Working Papers 1212E, University of Ottawa, Department of Economics.

[9] Dustmann, C., & Weiss, Y. 2007. Return Migration: Theory and Empirical Evidence. CReAM Discussion Paper Series 0702.

[10] Edin, P.-A., LaLonde, R. J., & Åslund, O. 2000. Emigration of Immigrants and Measures of Immigrant Assimilation: Evidence from Sweden. Working Paper Series 2000:13, Uppsala University, Department of Economics.

[11] Green, D. A., & Worswick, C. 2012. Immigrant earnings profiles in the presence of human capital investment: Measuring cohort and macro effects. Labour Economics, Elsevier, vol. 19(2), 241-259.

[12] Hunt, J. 1992. The impact of the 1962 repatriates from Algeria on the French labor market. Industrial and Labor Relations Review, vol. 45(3), 556-572.

[13] Lacroix, G., Santarossa, G., & Gagné, P. 2003. Une analyse de la dynamique de la dépendance à l’assistance-emploi des populations natives et immigrantes québécoises. CIRANO Project Report 2003rp-14.

[14] Picot, G., & Piraino, P. 2012. Immigrant Earnings Growth: Selection Bias or Real Progress? Statistics Canada, Catalogue 11F0019M-340.

[15] Puhani, P. A. 2000. The Heckman Correction for Sample Selection and Its Critique. Journal of Economic Surveys, vol. 14(1).

[16] Statistics Canada. 2013. Microdata user guide, Longitudinal Survey of Immigrants to Canada, wave 3. Available on the Statistics Canada website.

[17] Warren, R., & Peck, J. M. 1980. Foreign-Born Emigration from the United States: 1960–1970. Demography, vol. 17(1), 71-84.

[18] Wooldridge, J. M. 2002. Inverse probability weighted M-estimators for sample selection. Portuguese Economic Journal, vol. 1(2), 117-139.

[19] Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data. MIT Press Books, The MIT Press, edition 2, vol. 1.
