
Endogeneity in high dimension reduction

From the document Heterogeneity and international economics (pages 112-152)

Exporting is a complex process requiring the prior development of key firm capabilities, including knowledge of foreign markets (Naudé, Gries, and Bilkic 2013). In this paper we explore the possibility that factors beyond productivity level are relevant for the internationalization process of firms. Starting with a large set of potential factors, a challenge is to properly capture the relationships between the extensive and intensive margins of trade and the many covariates. High-dimensional models, in which the number of parameters to be estimated is large relative to the sample size, come with a number of analytical challenges. In such cases, standard linear regression and non-parametric regression fail to map the relationship between the independent and the dependent variables. The dimension of the covariates is therefore reduced with (conditional) minimum average variance estimation (MAVE; Xia et al. 2002). Results indicate that productivity significantly contributes to the extensive margin of trade but not to the intensive margin. Nevertheless, R&D and innovation collaboration, as well as ICT access and usage, also influence the extensive margin of trade, as well as the intensive one.

JEL CODES: C14, C55, F14, C52

Key Words: High-dimension; margins of trade; (conditional) minimum average variance estimation; productivity

3.1 Introduction

Trade economists have been investigating the mechanism underlying a firm’s decision to export, bringing to light a strong positive association between productivity and exporting (Caves, Porter, and Spence 1980). In his seminal work, Melitz (2003) combines the existence of firm heterogeneity with fixed exporting costs to model export dynamics. Firms’ productivity level is assumed to vary over a certain range. Only firms that are efficient enough to bear the market entry costs and intense competition will start exporting.

Empirical research in international trade has provided evidence of substantial differences between exporting firms and non-exporting firms. Across a wide range of countries and industries, exporters have been shown to be larger, more productive, more skill- and capital-intensive, and to pay higher wages than non-exporters (Aw, Roberts, and Xu 2011; Baldwin and Gu 2003; Bernard, Jensen, and Lawrence 1995; Eaton, Kortum, and Kramarz 2011; Wagner 2007; Yeaple 2005).¹ These findings have led to two hypotheses: self-selection of the most productive firms into the export markets and learning-by-exporting.

Both hypotheses seek to explain the direction of the correlation between productivity and international activities (see Wagner 2007, for a review). The majority of empirical evidence indicates that only the most productive firms are able to bear the market entry cost of exporting, while the least productive firms remain in the local markets (Bernard and Jensen 1999; Clerides, Lach, and Tybout 1998; Roberts and Tybout 1997).

¹ On a related note, exporters share a variety of characteristics with importers, which are also bigger, more productive, pay higher wages and are more skill- and capital-intensive than non-exporters and non-importers (Bernard et al. 2007). These results further highlight the international fragmentation of production.

This paper assesses whether determinants beyond firm productivity can explain why some firms are engaged in international trade. Economic intuition will typically suggest a set of variables that might be important to control for but will not exactly identify which variables are relevant or the functional form with which variables should enter the model (Belloni et al. 2017). A large set of variables has been found to correlate with the internationalization process of firms, leaving readers uncertain as to the confidence they should place in the findings of any one study. While many candidate regressions have theoretical foundations, the estimated coefficients in these regressions may depend importantly on the conditioning set of other covariates. The lack of clear guidance about what variables to use presents the problem of selecting controls from a potentially large set including raw variables available in the data as well as interactions and other transformations of these variables. A researcher interested in obtaining precisely estimated policy effects will also consider including additional controls to help absorb residual variation. In those cases, performing variable selection may prove useful.

High-dimensional regression models have a large number of regressors, possibly much larger than the sample size, but only a relatively small number of these regressors are important for accurately capturing the main features of the regression function. Such models arise in modern datasets that commonly include rich information and many measured characteristics available per individual observation. High-dimensional models provide a tool for analyzing complex phenomena and for incorporating rich sources of confounding information into economic models. While some computationally expensive novel methods can construct predictive models with high accuracy from high-dimensional data, it is still of interest in many applications to reduce the dimension of the original data prior to any modeling of the data.² When the dimension is larger than the number of observations, standard linear regression is not well-behaved and its solution is no longer unique, as the design matrix becomes singular. Another challenge is that when the relationship between the variables is not linear, one needs to apply non-parametric techniques. Those techniques do not support models containing large sets of regressors. A major task is therefore to select a subset of predictors that best describes the dependence between the smaller set and the response variable of interest. One can think of a trade-off between goodness of fit and parsimony.
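The non-uniqueness of least squares when regressors outnumber observations can be illustrated with a minimal sketch. The data below are simulated and the sizes `n` and `p` are arbitrary assumptions made for illustration; nothing here comes from the paper's dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50                        # assumed sizes: fewer observations than regressors
X = rng.normal(size=(n, p))          # simulated design matrix
y = rng.normal(size=n)               # simulated response

# The rank of X (and of X'X) is at most n, so the p x p matrix X'X is singular.
rank = np.linalg.matrix_rank(X)

# Two different coefficient vectors that produce identical fitted values.
beta_min_norm = np.linalg.pinv(X) @ y
null_direction = np.linalg.svd(X)[2][-1]     # right singular vector with X @ v ~ 0
beta_other = beta_min_norm + 5.0 * null_direction

same_fit = np.allclose(X @ beta_min_norm, X @ beta_other)
print(rank, p, same_fit)
```

Because any multiple of a null-space direction can be added to the coefficients without changing the fit, the data alone cannot single out one coefficient vector, which is why some form of selection or reduction is needed.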

² Dimension reduction is helpful to visualize the relationships between variables and also to interpret those relationships (Cook 1998).

To select the subset of covariates, essentially two approaches have been developed. First, variable selection relies on the idea that among all available covariates, only a few are truly related to the response variable; all the others are redundant and have no real explanatory power. Variable selection aims at determining which covariates have the strongest effects on the response of interest and, by effectively identifying a subset of important covariates, it can enhance model interpretability and improve prediction accuracy (Marra and Radice 2011). Examples of such techniques are forward and backward stepwise selection, where one starts either with no predictors or with all predictors and iteratively adds or drops them, respectively, based on some criterion. Other examples are penalized regressions for simultaneous variable selection and coefficient estimation such as the Least Absolute Shrinkage and Selection Operator (LASSO; Tibshirani 1996), elastic net or ridge regression. The second approach is the so-called dimension reduction. In contrast to the variable selection approach, the dimension reduction approach assumes that the response variable relates to only a few linear combinations of the many covariates. Thus, it could happen that all the covariates have explanatory power, but the effect is only represented in a few linear combinations. It is the goal of dimension reduction to identify these few linear combinations (Ma and Zhu 2013).
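To make the idea of a response that depends on "a few linear combinations" concrete, the following sketch simulates a single-index model in which every covariate matters, yet one direction carries all the information. This is not MAVE: as a deliberately simple stand-in it uses the least-squares slope, which under Gaussian covariates is known to be proportional to the index direction even when the link is nonlinear. The cubic link, the vector `b` and all sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2000, 10
X = rng.normal(size=(n, p))                  # simulated Gaussian covariates
b = np.ones(p) / np.sqrt(p)                  # every covariate enters the index
index = X @ b                                # the single linear combination
y = index**3 + 0.1 * rng.normal(size=n)      # nonlinear link of that index

# Least-squares slope with an intercept; only its direction is of interest.
design = np.column_stack([np.ones(n), X])
coef = np.linalg.lstsq(design, y, rcond=None)[0][1:]
b_hat = coef / np.linalg.norm(coef)

cosine = float(b_hat @ b)                    # near 1 when the direction is recovered
print(round(cosine, 3))
```

A variable selection method would have nothing to discard here, since all ten coefficients in `b` are non-zero, whereas a one-dimensional projection summarizes the dependence of `y` on `X`.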

In high-dimensional models, imposing that the error term and all covariates be uncorrelated can be restrictive (Fan and Liao 2014). The issue of endogeneity is particularly worrisome because endogeneity of one regressor may prevent consistent estimation of all the other parameters of the model. Failing to address potential endogeneity from omitted variables or reverse causation will also distort the estimation of the linear combinations. Recent growth in both the size and dimension of the data has led to a resurgence in analysing instrumental variables regression in high-dimensional settings (Belloni, Chernozhukov, and Hansen 2014; Fan and Liao 2014; Gautier and Tsybakov 2011) where the number of regression parameters, especially those associated with exogenous covariates, is growing with, and may exceed, the sample size. An essential issue with high-dimensional data is that when the number of instruments is too large, or when the only instruments available for an endogenous regressor are too weak, the set of restrictions to test can become impossibly large (Gautier and Tsybakov 2011).

Using a cross-country firm dataset on firms’ exporting behaviour, we consider a variety of factors exerting positive or negative impacts on the extensive and intensive margins of trade, such as the use and availability of information and communication technologies (ICT) or investments in research and development (R&D). These factors may intervene at the firm level, at the sector level or at the country level. In this setting, firm-level determinants are expected to drive a firm’s propensity to export, while country-level characteristics represent the institutional set-up that allows or impedes firms to conduct business internationally. Sector-level factors are expected to enhance the capacities of firms and help them leverage their international activities by building on the country’s institutions and economic development. The reason to discriminate between different levels of analysis is that each one of them entails distinct trade policy implications. It is hence important to determine whether policy interventions at the sector or country level - e.g. in promoting a particular industry - effectively play a role in a firm’s attempt to enter global markets or if internationalization occurs independently of the sector and country within which a firm operates. Dealing with variables at the three different levels of aggregation raises a number of methodological challenges that prevent the direct estimation of any empirical model. To overcome this issue, an effective dimension reduction (EDR) is performed. The specificity of EDR is that it is targeted at regression problems where one typically looks for a lower-dimensional representation of the conditional distribution of the response variable Y given the covariates X. Without loss of information, the conditional distribution of Y given X depends on the covariates only through the coordinates of their projection onto a lower-dimensional subspace.³ To allow for effective dimension reduction in the case of endogenous covariates, we propose to plug a control function approach into the EDR context. A general form of endogeneity can be controlled for by the control function approach, which is equivalent to a two-step instrumental variable approach. We therefore first correct the set of covariates to remove the contamination caused by our productivity measure. The control variables are defined in such a way that, when conditioned on, they make the covariates and disturbances independent (Newey, Powell, and Vella 1999). Then we resume with the dimension reduction steps to determine the linear combinations and the central subspace of interest.
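The two-step logic of a control function can be sketched on simulated data. The example below is a linear simplification, not the nonparametric construction of Newey, Powell, and Vella (1999) nor the procedure applied later in the paper. The instrument `z`, the coefficients and the helper `ols` are hypothetical names and values introduced only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000
z = rng.normal(size=n)                   # hypothetical instrument
v = rng.normal(size=n)                   # unobserved source of endogeneity
x = 0.8 * z + v                          # endogenous regressor
y = 1.0 * x + 0.9 * v + 0.5 * rng.normal(size=n)   # true slope on x is 1.0


def ols(columns, target):
    """Least-squares coefficients, with the intercept in first position."""
    design = np.column_stack([np.ones(len(target)), *columns])
    return np.linalg.lstsq(design, target, rcond=None)[0]


beta_ols = ols([x], y)[1]                # naive estimate, biased upwards here

# Step 1: first-stage regression; its residual serves as the control variable.
a0, a1 = ols([z], x)
v_hat = x - (a0 + a1 * z)

# Step 2: include the control variable in the outcome regression.
beta_cf = ols([x, v_hat], y)[1]

print(round(beta_ols, 2), round(beta_cf, 2))
```

Conditioning on `v_hat` absorbs the part of `x` that is correlated with the disturbance, so the coefficient on `x` in the second step moves back towards its true value. In this linear setting the estimate coincides with two-stage least squares.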

At the outset, it is important to stress the paper’s boundaries. We do not establish causal links, identify productivity determinants or make policy recommendations. The final goal of this paper is to better understand the heterogeneity among firms while exploring the possibility that determinants other than productivity are relevant in explaining the internationalization process of firms.

³ One can think of three- or higher-dimensional information summarized in a two-dimensional representation.

Results from this paper are original in two ways. First, this paper aims at understanding firm-level decisions to start exporting, while taking into account determinants specific to the sector and the country the firm operates in. From the dimension reduction step performed with Minimum Average Variance Estimation (MAVE) corrected for endogenous productivity on the extensive margin of trade, four directions are selected, in which all levels of aggregation contribute to the entry decision of firms. This indicates that external factors play a great role with respect to the extensive margin of trade. It further appears that variables linked to research and development and ICT are particularly essential for a firm’s entry into the export sector. Concerning the intensive margin of trade, six directions are identified. A firm’s performance in export markets, defined by the amount of its exports, depends mostly on ICT usage at the firm level in the form of direct communication with suppliers and customers via e-mail, on trade openness, and on innovation through university-industry collaborations and patent applications that safeguard the position of international firms. The second original result of this paper is that productivity significantly contributes to the other latent variables for the extensive margin of trade once endogeneity is appropriately controlled for. We conclude that, among the set of retained regressors in the model, factors such as ICT, trade openness and R&D deliver the essential prerequisites for both margins of trade.

The major contribution of this paper comes from its powerful and original methodology. MAVE not only allows for dimension reduction, it does so in a regression setting where the relationship between the response and the predictors is taken into account. As such, it deals with multicollinearity introduced by the different levels of aggregation in a sensible way and allows for the estimation of non-linear relationships that could not have been studied, at least at this scale, with standard non-parametric regressions. The control function approach we propose corrects for endogeneity while preserving the desirable properties of MAVE. Furthermore, such a procedure imposes few assumptions on the covariates, so that predictors that might have been excluded in the existing literature are now fully integrated.

Explaining why some firms are more productive than others has led to extensive research, initiated by Melitz (2003) modelling firm heterogeneity. A source of systematic differences among firms is related to their decision to invest in innovation (Aw, Roberts, and Xu 2011; Bernard and Jensen 2004; Cassiman, Golovko, and Martinez-Ros 2010; Costantini and Melitz 2008). Cassiman, Golovko, and Martinez-Ros (2010) link innovation and exporting through increased productivity. More specifically, a firm engaged in innovation exhibits higher levels of productivity, which enable it to grow faster than non-innovative firms and therefore to start exporting. The authors explain the superior productivity level of exporting firms with exposure to new technological knowledge and tougher competition. Investments in R&D or new technology, which both raise productivity and increase the pay-off to exporting, have a positive effect on the firm’s future productivity, which reinforces the selection effect of the most productive firms into the export market (Aw, Roberts, and Xu 2011; Bernard and Jensen 2004). The results of Cassiman, Golovko, and Martinez-Ros (2010) highlight the importance of investments in innovation. The authors concentrate on product innovation within firms without taking into account other potential sources of innovation. The work and research environment plays a key role when it comes to R&D, in particular for small firms, which often lack the resources to individually invest in innovation. The present paper specifically introduces environmental parameters, like the presence of innovation clusters near the firm, to account for possible synergies among them.

Much of the literature has also acknowledged that differences between exporters and non-exporters result from differences in firms’ inputs. For instance, Yeaple (2005) explains firm heterogeneity with the choice of different production technologies as well as the choice to hire different types of workers. In this setting, firms are born identical and heterogeneity among them arises based on their choices of technologies and workers. Firms possessing the newest technologies and skilled workers can overcome the fixed costs associated with exporting. In the presence of such fixed costs, only firms that use the low unit cost technology, and hence are able to sell a large quantity profitably, enter the export market, so that firms that export are larger, use more advanced technology, and pay higher wages than those that do not.

While productivity and other factors internal to the firm are important determinants of a firm’s participation in global markets, a number of external factors also play a key role. As shown by Chaney (2008), firms’ entry decisions in the export sectors and export intensity are correlated with country characteristics. Indeed, the quality of legislation, regulations and bureaucracy has a significant impact on the costs incurred by firms and may constitute an incentive or a barrier to investment and trade. For instance, a country’s financial development exerts a significant and positive impact on bilateral flows on both the intensive and extensive margin (Manova 2012). In particular, the lack of, or weak access to, finance can strongly inhibit a firm’s development, regardless of the level of per capita income of countries.

Berman and Héricourt (2010) highlight the importance of the impact of a firm’s access to finance on its entry decision in the export market, suggesting that heterogeneity in terms of access to finance may be an important determinant of exporting behaviour at the firm level. For the authors, productivity is only significant in explaining the export decision if the firm has external access to finance. Once the firm has entered global markets, however, the state of the financial sector is not a prerequisite for firms to remain active internationally. Their work is motivated by the fact that, contrary to the predictions of international trade models, some low productivity firms do export while some high productivity firms do not participate in international trade. In their paper the authors further show that productivity becomes significant for export decisions but only once a given threshold of access to finance is reached. Among the existing literature, Berman and Héricourt’s (2010) paper shows the most similarities with the present work in its attempt to better comprehend heterogeneity among firms and their decisions to export beyond productivity.

The rest of this paper is organized as follows: Section 2 presents the data of the study and the descriptive statistics. Section 3 develops the empirical strategy at length. Results are presented in Section 4 and finally, Section 5 concludes.

3.2 Data

To investigate determinants of the margins of trade we use the microeconomic dataset Enterprise Survey (ES) from the World Bank. It contains information on more than 75,500 enterprises in 132 countries and spans from 2006 to 2014.⁴ This dataset covers firm-level information on firm characteristics, trade, finance, regulations, taxes and business licensing, corruption, crime and informality, innovation, labor, and perceptions about obstacles for conducting business. The Enterprise Survey selects firms randomly within each sector and the data is hence representative of a country’s firm population (Berman and Héricourt 2010). The data is in local currency and is therefore converted into US dollars using yearly rates from the International Financial Statistics. The analysis is refined using country- and sector-level data from the World Development Indicators (World Bank) and the Global Competitiveness Index from the World Economic Forum (WEF) to account for the national and sectoral environment of firms.

⁴ Downloaded in May 2015.

The set of predictors is described below and summarized in Table 1. As noted by Bernard et al. (2007), exporting is a relatively rare activity. In this dataset, exporting firms represent 19.4 per cent of the sample. Exports account for both direct and indirect exports in US dollars. We define total factor productivity (TFP) as our firm productivity measure. It is computed as the residual of a Solow growth model, which captures firms’ production efficiency. It is the share of output not explained by the amount of inputs (capital, labor and equipment) used in production.⁵ In other words, it expresses how efficiently and intensely the inputs are utilized in production.
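A residual-based productivity measure of this kind can be sketched as follows. The paper's exact specification is not given in this section, so the sketch assumes a log-linear (Cobb-Douglas) production function estimated by ordinary least squares on simulated data; the elasticities, the sample size and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
log_inputs = rng.normal(size=(n, 3))             # simulated log capital, labor, equipment
true_log_tfp = 0.3 * rng.normal(size=n)          # simulated efficiency term
elasticities = np.array([0.3, 0.5, 0.2])         # assumed output elasticities
log_output = 1.0 + log_inputs @ elasticities + true_log_tfp

# Regress log output on log inputs; what the inputs do not explain is log TFP.
design = np.column_stack([np.ones(n), log_inputs])
coef = np.linalg.lstsq(design, log_output, rcond=None)[0]
log_tfp_hat = log_output - design @ coef         # residual = estimated log TFP

corr = float(np.corrcoef(log_tfp_hat, true_log_tfp)[0, 1])
print(round(corr, 3))
```

In this simulation the inputs are drawn independently of the efficiency term, which is why the residual tracks it closely. With real firm data, input choices typically respond to productivity, which is one reason the paper treats productivity as endogenous.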

As noted by Lendle et al. (2016), access to telecommunication infrastructure is essential to reduce information and distribution costs, foster trade, improve market efficiency and increase traders’ income. We consider a firm’s access to ICT through communications with clients via a website and an email account. Both variables take the value one when the firm possesses a website or when it communicates with clients by email, and zero

