
Application: Acid Neutralizing Capacity of the lakes in northeastern USA

Lake data have been the subject of studies in small area estimation.

In this section we use the synthetic population of lakes in the northeastern USA that Salvati et al. [2012] used for a design-based simulation. We refer to that paper for details on how the synthetic population was constructed from the data collected between 1991 and 1995 by the US Environmental Protection Agency's Environmental Monitoring and Assessment Program (EMAP). The quantity of interest in this application is the area average of the Acid Neutralizing Capacity (ANC) of the lakes, which is a linear parameter. The advantage of our proposed methods, which calibrate the bias of non-linear parameters of interest, will therefore be partially masked. Nevertheless, we believe it is useful to compare the performance of our asymmetric bias calibration method (REBLUP-ABC) with conventional symmetric bias calibration (REBLUP-SBC), REBLUP and EBLUP for the estimation of the mean ANC.

In this setting the population consists of 21026 lakes grouped into 113 HUCs (domains). A sample of 652 lakes from 86 HUCs is drawn by simple random sampling without replacement (SRSWOR); no lakes are sampled from the remaining 27 domains. This sampling process is repeated 200 times (footnote 10). Because the true ANC mean of each HUC is known, we can calculate the error in the estimation of the ANC means at each round of sampling.
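The two performance measures reported below can be sketched as follows. This is a minimal illustration that assumes the common definitions of relative bias and RRMSE (average error, and root of the average squared error, over the repeated samples, both divided by the true area mean and expressed in percent); the function names and the toy numbers are hypothetical and are not taken from the EMAP data.

```python
import math


def relative_bias(estimates, true_value):
    """Relative bias (%) of repeated estimates of one area mean."""
    mean_error = sum(e - true_value for e in estimates) / len(estimates)
    return 100.0 * mean_error / true_value


def rrmse(estimates, true_value):
    """Relative root mean squared error (%) of repeated estimates."""
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return 100.0 * math.sqrt(mse) / abs(true_value)


# Hypothetical toy example: four replicate estimates for one area.
true_mean = 200.0
replicate_estimates = [190.0, 210.0, 205.0, 195.0]

rb = relative_bias(replicate_estimates, true_mean)
rr = rrmse(replicate_estimates, true_mean)
print(f"RB = {rb:.2f}%, RRMSE = {rr:.2f}%")
```

In the simulation described above, each area would contribute one such pair of values, computed over the 200 replicate samples, and the table would then summarise their distribution across areas.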

Table 3.2 presents the distribution of the Relative Bias (RB) and the RRMSE of the different methods across the sampled areas. Partial calibration uses the area-specific residuals to calibrate the estimate for each area, whereas full calibration uses the entire vector of model residuals to correct for the bias. Under partial calibration there is a marginal gain in the performance of REBLUP-ABC compared with the other two methods, because the optimum tuning constants of those methods form a subset of the possible tuning constants of this method. For the areas that are not represented in the sample, we use the fitted model to predict the fixed part from the design matrix of the auxiliary variables, in this case the elevation of each lake (ELEV), and we use the median of the predicted random effects as the area-specific effect. For the non-sampled areas the calibration is done by full calibration only. Table 3.2 also reports the full calibration for the sampled areas. The results are notable: with the asymmetric calibration, full calibration can substantially improve the bias and the RRMSE for both sampled and non-sampled areas. A possible reason is that full calibration offers a more stable or uniform calibration, while the skewed Huber function with its two tuning constants still provides enough flexibility in the shape of the predictive distribution in each area. The results of the partial calibration and the full calibration for each area are illustrated in Figures 3.3 and 3.4, respectively.

Figure 3.3: Partial Calibration

Footnote 10: We use the exact samples that are used by Salvati et al. [2012]. We thank the authors for sharing the IDs of the sampled lakes in their analysis.
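The prediction for a non-sampled area described above can be sketched as follows. This is only an illustration under stated assumptions: a linear fixed part with an intercept and a single slope on ELEV, and hypothetical coefficient values, elevations and predicted random effects. It shows the uncalibrated synthetic prediction only and does not include the full calibration step.

```python
from statistics import median


def predict_nonsampled_mean(elev, beta0, beta1, predicted_random_effects):
    """Predicted area mean for an area with no sampled units.

    elev: elevations (ELEV) of all lakes in the non-sampled area.
    beta0, beta1: fitted fixed-effect coefficients (hypothetical values here).
    predicted_random_effects: predicted effects of the sampled areas.
    """
    # Fixed part of the model, predicted for every lake in the area.
    fixed_part = [beta0 + beta1 * x for x in elev]
    # The area-specific effect is replaced by the median predicted effect.
    area_effect = median(predicted_random_effects)
    return sum(fixed_part) / len(fixed_part) + area_effect


# Hypothetical numbers, for illustration only.
area_mean = predict_nonsampled_mean(
    elev=[100.0, 200.0, 300.0],
    beta0=500.0,
    beta1=-0.5,
    predicted_random_effects=[-10.0, 5.0, 20.0],
)
print(area_mean)
```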

3.7 Conclusion of the chapter

In this chapter we have introduced new bias calibration approaches for the robust estimation of non-linear small area parameters. We adopt two different methods to deal with the area-specific non-linear parameter. The first method estimates the area-specific empirical CDF and derives from it the non-linear statistical functional of interest. The second method linearizes the non-linear parameter by using the von Mises approximation.

The robust estimates obtained from these methods are biased, and we therefore propose a new bias calibration approach to account for this. We introduce an asymmetric version of the Huber function with two tuning parameters, which reduces to the original Huber function when the second (skewness) parameter is set to 1. In all cases this method matches or outperforms the conventional symmetric calibration approach, because the symmetric choice is included in the solution set that is searched for the optimum tuning parameters. In addition, when the outcome has a heavy-tailed and highly skewed distribution, the asymmetric calibration results in a larger efficiency gain. We illustrated these advantages in a series of realistic simulations in this chapter. Using the asymmetric bias calibration on the linearized version of the parameter (IF-SBC) results in an estimator that outperforms

Table 3.2: Design-based simulation results using the EMAP data. Summary of Relative Bias (RB) and Relative Root Mean Squared Error (RRMSE) over areas.

Summary across areas

Predictor    Indicator   Min      Q1       Median   Mean     Q3       Max

86 sampled HUCs - Partial calibration
EBLUP        RB(%)       -11.67   -5.05    4.61     5.68     12.30    40.62
             RRMSE(%)    7.75     23.34    32.62    34.35    42.14    71.48
REBLUP       RB(%)       -27.34   -10.95   -5.33    -2.85    3.37     30.09
             RRMSE(%)    6.835    26.220   30.180   30.830   36.360   60.690
REBLUP-SBC   RB(%)       -27.34   -6.23    -1.53    -2.79    0.72     14.38
             RRMSE(%)    6.72     22.92    27.98    28.44    33.84    52.83
REBLUP-ABC   RB(%)       -27.37   -9.03    -4.23    -4.54    -0.22    10.73
             RRMSE(%)    6.69     22.90    27.16    27.73    32.87    49.42

86 sampled HUCs - Full calibration
REBLUP-SBC   RB(%)       -18.96   -5.81    -1.56    -2.40    0.030    14.30
             RRMSE(%)    6.83     23.39    29.34    29.54    34.54    55.82
REBLUP-ABC   RB(%)       -3.16    -0.68    -0.33    -0.30    0.10     1.37
             RRMSE(%)    6.83     22.80    28.82    28.89    33.86    53.98

27 non-sampled HUCs
EBLUP        RB(%)       -75.23   -55.10   -43.16   -10.98   15.18    253.40
             RRMSE(%)    5.48     35.91    52.64    55.50    63.20    253.80
REBLUP       RB(%)       -82.57   -71.95   -62.42   -40.12   -25.30   144.40
             RRMSE(%)    7.92     36.32    64.96    56.73    73.29    145.40
REBLUP-SBC   RB(%)       -81.33   -68.70   -58.90   -37.99   -17.75   132.30
             RRMSE(%)    7.55     30.34    61.72    52.66    70.40    133.50
REBLUP-ABC   RB(%)       -70.78   -43.23   -28.87   -25.02   -0.43    5.61
             RRMSE(%)    6.46     8.27     29.19    28.58    43.38    70.80

Figure 3.4: Full Calibration

other proposals in terms of the relative bias, while using this calibration function on the estimated empirical CDF and then deriving the statistical functional results in an estimator that outperforms the presented competitors in terms of the relative MSE. The main reason is that the linearization step through the von Mises approximation can itself be seen as a bias correction measure.
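To make the idea of a two-parameter influence function concrete, the sketch below shows one possible asymmetric Huber function. The parameterization is an assumption made for illustration (lower bound -c, upper bound q*c) and is not necessarily the form used in this chapter; the only property taken from the text is that setting the skewness parameter to 1 recovers the original Huber function. The names `huber_psi` and `asymmetric_huber_psi` are hypothetical.

```python
def huber_psi(u, c):
    """Original (symmetric) Huber function: clip u to [-c, c]."""
    return max(-c, min(c, u))


def asymmetric_huber_psi(u, c, q):
    """Asymmetric Huber function, hypothetical parameterization.

    Residuals are clipped to [-c, q * c], so q > 1 lets large positive
    residuals contribute more than large negative ones, and q = 1
    gives back the symmetric Huber function.
    """
    if c <= 0 or q <= 0:
        raise ValueError("c and q must be positive")
    return max(-c, min(q * c, u))


# With q = 1 the two functions coincide.
for u in (-10.0, -1.0, 0.0, 1.0, 10.0):
    assert asymmetric_huber_psi(u, 1.345, 1.0) == huber_psi(u, 1.345)

# With q = 3 the upper bound is three times the lower bound in magnitude.
print(asymmetric_huber_psi(10.0, 2.0, 3.0), asymmetric_huber_psi(-10.0, 2.0, 3.0))
```

Because the symmetric case corresponds to q = 1, a search over both constants (c, q) that includes q = 1 cannot do worse on the criterion being minimised than a search over c alone, which is the argument made above for the asymmetric calibration.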

List of Tables

2.1 Two sample Kolmogorov-Smirnov test for equality of the real wage distribution between the mid and late 2000s.

2.2 Changes in inequality indices between the mid and late 2000s.

2.3 Location shift of log hourly real Minimum Wage distribution between the mid and late 2000s.

2.4 Contribution of minimum wages to the change in wage inequality (ratio indices).

3.1 Median of the areas’ relative Bias and RRMSE, for REBLUP and different calibration methods (i.e. Partial Calibration).

3.2 Design-based simulation results using the EMAP data. Summary of Relative Bias (RB) and Relative Root Mean Squared Error (RRMSE) over areas.

A.1 Descriptive Statistics. Total number of observations n = 130219.

A.2 Correlation coefficient between continuous covariates

A.3 PLM with Identity link: Fitted result for parametric components.

B.1 Construction of the wage variable for each country

B.2 Wage deflator used for each country

B.3 Partial effects of log hourly minimum wages on wages of salaried workers

List of Figures

1.1 Transformation or so-called link function S related to four typical populations.

1.2 Bias-transfer from the parametric to the non-parametric part when estimating the components of a PLM. Solid lines indicate the true marginal impacts, dashed lines the expected PLM estimates. Left panel: bias when illness is forced to have a linear impact. Right panel: bias transfer coming from the left panel when age is positively correlated with illness.

1.3 Simulated link function

1.4 Results of the fitted PLM model, in cases where each of the four links is the true one.

1.5 Observed responses to “overall life satisfaction”

1.6 Life-Satisfaction over the life span (age) in Germany using PLM model.

1.7 Life-Satisfaction over the life span (age) in Germany using Additive Model.

1.8 Marginal effect of age in the additive model with interaction: $m(Age)$ on the left, and $m(Age) + \int \sum_j m_{4+j}(Age \times X_j)\, dF_X(x)$ on the right.

1.9 Marginal effect of age in the PLM for samples separated by gender: $m(Age)$ for women on the left, and for men on the right.

1.10 Marginal effect of age using the Poisson link with mirrored responses: $m(Age)$ on the left, and $E_{X,D}\left[G\left\{m(Age) + m_x(X) + D^T\right\}\right]$ on the right.

1.11 Estimates of ordered logit model: impact of age on LS on the left, and a study of the link on the right (along the estimated cut points).

2.1 Kernel estimate of the log of hourly real wages for mid and late 2000s

2.2 The marginal effect of the log hourly real minimum wages on the conditional and unconditional quantiles of log hourly real wages, appending the data from mid and late 2000s (country: Brazil). The gray shadows are the 95% confidence intervals.

2.3 The marginal effect of the log hourly real minimum wages on the conditional and unconditional quantiles of log hourly real wages, appending the data from mid and late 2000s (country: Mexico). The gray shadows are the 95% confidence intervals.

2.4 The marginal effect of the log hourly real minimum wages on the conditional and unconditional quantiles of log hourly real wages, appending the data from mid and late 2000s (country: Indonesia). The gray shadows are the 95% confidence intervals.

2.5 The marginal effect of the log hourly real minimum wages on the conditional and unconditional quantiles of log hourly real wages, appending the data from mid and late 2000s (country: India). The gray shadows are the 95% confidence intervals.

3.1 The relation between ($q$) in $\psi_{c,q}(\cdot)$ and ($\gamma$) in $\psi_{c,\gamma}(\cdot)$

3.2 3D plot of the “true” RRMSE and the Bias with respect to the 2 constants of the asymmetric influence function that is used for calibration. Here only the result of the REBLUP-ABC method for Area(24) is illustrated as an example. The optimum values of these constants are those minimizing the RRMSE.

3.3 Partial Calibration

3.4 Full Calibration

A.1 Diagnostic graphs of residuals for PLM (left) and GPLM (right) with Poisson link.

A.2 PLM with interaction: Marginal effect of age, where all interactions with the continuous variables are included in the model.

Appendix A

Diagnostics on GSOEP data analysis

Figure A.1: Diagnostic graphs of residuals for PLM (left) and GPLM (right) with Poisson link.

Variable   Description                  Mean    sd.
LS         Life Satisfaction            7.04    1.83
Age        Age                          46.33   17.4
Female     Gender                       0.51    0.5
Disabled   Disability status            0.12    0.33
NinH       Night in Hospital            2.16    9.98
YofEdu     Years of Education           10.94   2.48
LNHI       Log of net household income  10.12   0.6
LHS        Log of household size        0.96    0.51
German     German                       0.77    0.42
EM1        Full time employed           0.42    0.49
EM2        Part time employed           0.2     0.4
EM3        Unemployed                   0.39    0.49
Married    Married                      0.64    0.49
Single     Single                       0.23    0.42
Divorced   Divorced                     0.05    0.21
Widowed    Widowed                      0.07    0.25

Table A.1: Descriptive Statistics. Total number of observations n = 130219.

Table A.2: Correlation coefficient between continuous covariates

Variable   Age     NinH    YofEdu  LNHI    LHS
Age        1.00    0.11    -0.05   -0.17   -0.40
NinH       0.11    1.00    -0.03   -0.08   -0.06
YofEdu     -0.05   -0.03   1.00    0.23    -0.10
LNHI       -0.17   -0.08   0.23    1.00    0.50
LHS        -0.40   -0.06   -0.10   0.50    1.00

Figure A.2: PLM with interaction: Marginal effect of age, where all interactions with the continuous variables are included in the model.

Estimate PLM (S.E.)

(Intercept) 2.768*** (0.102)

Female 0.034** (0.011)

Disabled -0.753*** (0.016)

NinH -0.017*** (0.000)

YofEdu 0.03*** (0.002)

LNHI 0.466*** (0.012)

LHS -0.284*** (0.015)

German 0.049*** (0.013)

EM1 0.152*** (0.015)

EM2 0.035* (0.015)

Single -0.32*** (0.019)

Divorced -0.58*** (0.022)

Widowed -0.293*** (0.023)

Y4 -0.18*** (0.025)

Y5 -0.239*** (0.025)

Y6 -0.233*** (0.026)

Y8 -0.021 (0.026)

Y9 -0.142*** (0.027)

Y11 -0.367*** (0.027)

Y12 -0.375*** (0.028)

Y13 -0.373*** (0.028)

Y14 -0.514*** (0.029)

Y15 -0.428*** (0.029)

Y16 -0.431*** (0.030)

Y17 -0.504*** (0.030)

Y18 -0.426*** (0.031)

Y19 -0.609*** (0.031)

Y20 -0.687*** (0.032)

Y21 -0.812*** (0.032)

Y22 -0.711*** (0.033)

Y23 -0.778*** (0.034)

Y24 -0.763*** (0.035)

N 130219

* p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001

Table A.3: PLM with Identity link: Fitted result for parametric components.

Appendix B

Data sources, variable names and definitions - heterogeneity of marginal effects

For the analysis we used the household or labour force surveys for the different countries.

For Brazil, we use the Pesquisa Nacional por Amostra de Domicilios (PNAD), IBGE, for the two years 2005 and 2009; for India we use the Employment-Unemployment survey of the National Sample Survey Organisation for the years 2004-5 and 2009-10; for Indonesia we use the National Labour Force Survey (Survei Angkatan Kerja Nasional, SAKERNAS), BPS-Statistics, for the years 2005 and 2009; and for Mexico we use the Encuesta Nacional de Ocupación y Empleo (ENOE), INEGI, for the years 2005 and 2010.

The detailed characteristics of all workers including sex, age, caste/religion, marital status, relation to the household head, education level, employment status (formal/informal), industry and the region are provided in the household or labour force survey. As mentioned earlier, the sample is restricted to the age group 15-64 years and the variables are defined below:

Age: Age of the individual

Gender: Dummy variable, indicating female=1, 0 otherwise.

Marital status: Dummy variable, indicating married=1 and single=0; the category "others" includes living together, widowed and divorced.

Ethnic Group: Ethnic groups for the various countries are the following: Brazil (E1: White, E2: Black, E3: Amarela, E4: Parda, E5: Indigenous); India (E1: Forward castes, E2: Scheduled tribes, E3: Scheduled castes, E4: Other Backward castes); and South Africa (E1: White, E2: African/Black, E3: Coloured, E4: Indian/Asian).

Region: Dummy variable, indicating urban=1, 0 otherwise. In the case of Mexico, as the rural/urban categorisation is not available in the survey, we use the three geographical zones defined in the Mexico survey in place of urban areas.

Education: We classify education into five categories: Illiterate, Literate, Primary, Secondary and Above Secondary. We generate four dummy variables for Illiterate, Literate, Primary, and Secondary; the reference category is "Above Secondary".

Industry: We aggregate the industries classified under NIC (National Industrial Classification), depending upon the country classification, into six industry groups with similar qualitative characteristics: agriculture (comprises agriculture, forestry and fishing); manufacturing (comprises mining and manufacturing); electricity, gas and water; construction; low-skilled services sector (comprises trade, hotels and restaurants, transport and personal services) and high-skilled services sector (comprises banking and insurance, communication, real estate, business services and public administration). The categorization of the service sector into two groups is justified on the basis of skill and capital requirements. "Agriculture" is used as the reference category and we constructed five dummy variables, one for each of the other industry groups.

Informal sector: The variable informal sector is defined as follows: in Brazil, formal workers are those who have a permanent contract and social security benefits, and informal workers are those who do not have them; in India, formal workers are those who have at least one of the social security benefits, otherwise they are classified as informal workers; in Indonesia, formal workers are those who have permanent work, and workers in casual work and in agriculture are classified as informal workers; in Mexico, formal workers are defined as those with a permanent contract and informal workers are those without such contracts; and in South Africa, formal workers are those with regular contracts and social security benefits, and informal workers are those without them.

Minimum wages: The legal information used for minimum wages relates to the analysis of the most recent labour legislation, such as labour codes, wage decrees, etc. We do not examine or analyse the judicial decision-making (jurisprudence), which may affect the interpretation of the legislation and, by extension, expand or diminish legal coverage. To assign a legal minimum wage to each worker in the data, we took the official minimum wages from the wage orders or sectoral wage determinations or the official decrees of the respective country. Every worker in the data set was assigned a legal minimum wage if the occupation or sector or industry in which the worker was employed had been legally covered by the official decree of the respective country. In the case of multiple minimum wages in a country, the minimum wage legislation provided information at the different levels. The assignment of a legal minimum wage to a worker was then based on a comparison of the occupational classification used in the household or labour force survey, and the categories in the minimum wage regulations. In most cases, it was not difficult to match the categories with the surveys and minimum wage legislations, as detailed information was available.

Year: A dummy for the two periods, mid and late 2000s.
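The dummy coding with a reference category described for the Education variable can be sketched as follows. This is a minimal illustration; the function name and the dictionary output format are hypothetical and are not taken from the original analysis code.

```python
# Categories and reference level as listed in the Education definition.
EDUCATION_LEVELS = ["Illiterate", "Literate", "Primary", "Secondary", "Above Secondary"]
REFERENCE_LEVEL = "Above Secondary"


def education_dummies(level):
    """Return the four 0/1 dummies for one worker's education level.

    The reference category gets no dummy of its own, so a worker in
    the reference category has all four dummies equal to 0.
    """
    if level not in EDUCATION_LEVELS:
        raise ValueError(f"unknown education level: {level!r}")
    return {
        name: int(level == name)
        for name in EDUCATION_LEVELS
        if name != REFERENCE_LEVEL
    }


print(education_dummies("Primary"))
print(education_dummies("Above Secondary"))
```

The same pattern would apply to the industry groups, with "Agriculture" as the omitted reference category and five dummies for the remaining groups.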

Table B.1: Construction of the wage variable for each country

Country     Wage variable   Unit of time   Log of hourly real wage
Brazil      wage            Daily base     lhrwage = ln(wage / (8 × WDI))
India       wgperday        Daily base     lhrwage = ln(wgperday / (8 × WDI))
Indonesia   wgperday        Daily base     lhrwage = ln(wgperday / (8 × WDI))
Mexico      wperday         Daily base     lhrwage = ln(wperday / (8 × WDI))

Table B.2: Wage deflator used for each country

Country     WDI year1         WDI year2
Brazil      WDI_2005 = 1      WDI_2009 = 1.20
Mexico      WDI_2005 = 1      WDI_2010 = 1.24
India       WDI_2004 = 0.96   WDI_2009 = 1.36
Indonesia   WDI_2005 = 1      WDI_2009 = 1.38

Note: The Wage Deflator Indices (WDI) are calculated by using the "inflation, consumer prices" series from the World Bank website. The base year considered is 2005.

http://databank.worldbank.org/data/views/reports/tableview.aspx?isshared=true
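The wage construction in Tables B.1 and B.2 can be sketched as follows. The deflator values are those reported in Table B.2; the function name, the dictionary layout and the example wage are hypothetical, and the sketch assumes that the daily wage is divided by 8 hours and by the deflator of the survey year before taking logs, as the formula in Table B.1 indicates.

```python
import math

# Wage Deflator Indices from Table B.2, keyed by (country, year).
WDI = {
    ("Brazil", 2005): 1.00, ("Brazil", 2009): 1.20,
    ("Mexico", 2005): 1.00, ("Mexico", 2010): 1.24,
    ("India", 2004): 0.96, ("India", 2009): 1.36,
    ("Indonesia", 2005): 1.00, ("Indonesia", 2009): 1.38,
}

HOURS_PER_DAY = 8  # daily wages are converted to hourly wages


def log_hourly_real_wage(daily_wage, country, year):
    """lhrwage = ln(daily wage / (8 x WDI)) for one worker."""
    if daily_wage <= 0:
        raise ValueError("daily wage must be positive to take logs")
    return math.log(daily_wage / (HOURS_PER_DAY * WDI[(country, year)]))


# Hypothetical worker earning 96 units per day in Brazil in 2009:
# 96 / (8 * 1.20) = 10 real units per hour.
print(log_hourly_real_wage(96.0, "Brazil", 2009))
```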

B.1 Comparison of the partial effects

Table B.3: Partial effects of log hourly minimum wages on wages of salaried workers

Country         OLS              q0.1 CQR         q0.1 UQR         q0.5 CQR         q0.5 UQR         q0.9 CQR          q0.9 UQR
Brazil     MW   0.053            0.310            0.462            0.148            0.342            0.014             -0.935
           CI   [0.014,0.092]    [0.268,0.352]    [0.391,0.533]    [0.112,0.186]    [0.294,0.391]    [-0.202,-0.087]   [-1.048,-0.823]
Mexico     MW   0.053            0.196            0.201            0.069            0.161            -0.003            -0.500
           CI   [0.035,0.071]    [0.161,0.230]    [0.163,0.239]    [0.053,0.085]    [0.140,0.183]    [-0.030,0.024]    [-0.540,-0.459]
India      MW   0.216            0.310            0.210            0.313            0.634            0.211             -0.144
           CI   [0.195,0.238]    [0.273,0.346]    [0.189,0.231]    [0.292,0.334]    [0.602,0.666]    [0.181,0.242]     [-0.167,-0.120]
Indonesia  MW   0.663            1.046            0.762            0.816            0.757            0.386
           CI   [0.650,0.677]    [1.069,1.023]    [0.723,0.801]    [0.801,0.831]    [0.726,0.789]    [0.353,0.419]     [0.346,0.396]

References

A. Alesina, R. Di Tella, and R. MacCulloch. Inequality and happiness: are Europeans and Americans different? Journal of Public Economics, 88:2009–2042, 2004.

D. Autor. The polarization of job opportunities in the US labor market: Implications for employment and earnings. Center for American Progress and The Hamilton Project, 23:11–16, 2010.

D. Autor, L. F. Katz, and A. B. Krueger. Computing inequality: Have computers changed the labor market? The Quarterly Journal of Economics, 113(4):1169–1213, 1998.

D. Autor, A. Manning, and Ch. L. Smith. The contribution of the minimum wage to US wage inequality over three decades: A reassessment. American Economic Journal: Applied Economics, 8(1):58–99, 2016.

G. E. Battese, R. M. Harter, and W. A. Fuller. An error-components model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83(401):28–36, 1988.

L. A. Bell. The impact of minimum wages in Mexico and Colombia. Journal of Labor Economics, 15(S3):S102–S135, 1997.

P. Belser and U. Rani. Labour markets, Institutions and Inequality, chapter Minimum wages and inequality. Edward Elgar, UK and ILO, Geneva, 2015.

P. Belser and K. Sobeck. At what level should countries set their minimum wages? International Journal of Labour Research, 4(1):105–27, 2012.

D. G. Blanchflower and A. J. Oswald. Well-being over time in Britain and the USA. Journal of Public Economics, 88:1359–1386, 2004.

D. G. Blanchflower and A. J. Oswald. Is well-being U-shaped over the life cycle? Social Science and Medicine, 66(8):1733–1749, 2008.

T. Boeri. Setting the Minimum Wage. Journal of Labor Economics, 19(3):281–90, 2012.

M. Bosch and M. Manacorda. Minimum wages and earnings inequality in urban Mexico. American Economic Journal: Applied Economics, 2(4):128–49, 2010.

J. Bound and G. Johnson. Changes in the Structure of Wages in the 1980’s: An Evaluation of Alternative Explanations. American Economic Review, 82(3):371–92, 1992.

R. V. Burkhauser, K. A. Couch, and D. C. Wittenburg. “Who gets what” from minimum wage hikes: A re-estimation of Card and Krueger’s distributional analysis in Myth and Measurement: The New Economics of the Minimum Wage. ILR Review, 49(3):547–552, 1996.

D. Card. Using regional variation in wages to measure the effects of the federal minimum wage. ILR Review, 46(1):22–37, 1992.

D. Card and A. B. Krueger. Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania. American Economic Review, 84(4):772–793, 1994.

D. E. Card and A. B. Krueger. Myth and Measurement: The New Economics of the Minimum Wage. Princeton Paperbacks. Princeton University Press, 1995.

A. Chadi. The role of interviewer encounters in panel responses on life satisfaction. Economics Letters, 121(3):550–554, 2013.

R. L. Chambers. Outlier robust finite population estimation. Journal of the American Statistical Association, 81(396):1063–1069, 1986.

R. L. Chambers and R. Dunstan. Estimating distribution functions from survey data. Biometrika, 73(3):597–604, 1986.

R. L. Chambers and N. Tzavidis. M-quantile models for small area estimation. Biometrika, 93(2):255, 2006.

R. L. Chambers, H. Chandra, N. Salvati, and N. Tzavidis. Outlier robust small area estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1):47–69, 2014.

R.L. Chambers and R. Clark. An introduction to Model-Based Survey Sampling with Applications. Oxford Statistical Science Series, 2012.

H. Chandra, N. Tzavidis, and R. Chambers. On bias-robust mean squared error estimation for pseudo-linear small area estimators. Survey Methodology, 37(2), 2011.

N. Chun and N. Khor. Minimum Wages and Changing Wage Inequality in Indonesia. ADB Economics Working Paper Series No. 196. Asian Development Bank, 2010.

M. Comola and L. De Mello. How does decentralized minimum wage setting affect employment and informality? The case of Indonesia, volume 57. Wiley Online Library, 2011.

S. Copt and S. Heritier. Robust alternatives to the F-test in mixed linear models based on MM-estimates. Biometrics, 63(4):1045–1052, 2007.

S. Copt and M. Victoria-Feser. High-breakdown inference for mixed linear models. Journal of the American Statistical Association, 101(473):292–300, 2006.

G. S. Datta. Model-based approach to small area estimation. Handbook of Statistics, 29:251–288, 2009.

A. de Regil. Brazil: In perfect harmony with TLWNSI’s concept. Sustainable Development. 2010.

R. Di Tella, R. MacCulloch, and A. J. Oswald. The Macroeconomics of Happiness. Review of Economics and Statistics, 85(4), 2003.

P. Dick. Modelling net undercoverage in the 1991 Canadian census. Survey Methodology, 21(1):45–54, 1995.

R. Dickens and A. Manning. Has the national minimum wage reduced UK wage inequality? Journal of the Royal Statistical Society: Series A (Statistics in Society), 167(4):613–626, 2004.

R. Dickens, A. Manning, and T. Butcher. Minimum Wages and Wage Inequality: Some Theory and an Application to the UK. Working Paper Series 4512, Department of Economics, University of Sussex, 2012.

J. DiNardo, N. M. Fortin, and T. Lemieux. Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semiparametric Approach. Econometrica, 64(5):1001–44, 1996.

P. Dolton, C. R. Bondibene, and J. Wadsworth. Employment, inequality and the uk national minimum wage over the medium-term. Oxford Bulletin of Economics and