III. The Large International Technology Enterprises (LITE) Database

Academic year: 2021

SUMMARY

This chapter presents the LITE database constructed for the purposes of the thesis. The main characteristic of this database is its international dimension.

The core of the database consists of an unbalanced panel of 2676 manufacturing firms worldwide that reported positive R&D expenditures over the period 1980 to 1995. The database is representative of more than 50% of the business enterprise R&D performed in OECD countries over the last decade. Besides R&D expenses, information has been collected on variables such as European patent applications by technological field, net sales, the number of employees, capital expenditures, raw material expenses and sales by geographic segment. The last section of this chapter describes three datasets that have been extracted from the LITE database and on which the empirical analysis of the subsequent chapters is based.

“We ourselves do not put enough emphasis on the value of data and data collection in our training of graduate students and in the reward structure of our profession. It is the preparation skill of the econometric chef that catches the professional eye, not the quality of the raw materials in the meal, or the effort that went into procuring them.”

Zvi Griliches (1986)

3.1. INTRODUCTION

The LITE database [34] has been constructed with the objective of setting up a representative international sample of the largest firms that reported positive R&D expenditures over the period 1980-1995. The current version of this database consists of an unbalanced panel of 2676 firms in 19 broad manufacturing sectors and 29 countries. Several variables are currently available for each firm and for each year. These variables include firm identifiers, number of employees, net sales, capital expenditures and technological information on Research & Development (R&D) expenses as well as patent applications by technological field. The source for all this information is the Disclosure/Worldscope database, except for patent applications, which come from the European Patent Office. OECD National Accounts and the Penn World Tables provide the price deflators for output and capital investments as well as the exchange rates. The STI, ANBERD and STAN databases from the OECD were also used to construct price deflators at the industry sector level and to measure the representativeness of the LITE database.

Among the evaluation instruments available to economists, quantitative methods, and econometrics in particular, have been used extensively in the empirical literature on R&D. The purpose of the LITE database is to provide reliable firm-level indicators of economic and technological activities that are comparable across countries, so that the links between the economic and technological performance of firms can be assessed quantitatively with appropriate econometric tools. In our increasingly globalized economies, the question of whether firms from different parts of the world differ in their technological activities and economic performance merits closer attention. The international dimension of the database is expected to help clarify this topical debate.

[34] The Large International Technology Enterprises (LITE) database was constructed during my stay at the University of California, Berkeley, from October to December 1995. I would like to thank Bronwyn Hall for her helpful suggestions and support throughout this project.

Ch. III: LITE database

The chapter is organized as follows. Section 3.2 describes all the variables included in the LITE database, and then presents the database structure together with the availability, comparability, reliability and representativeness of the data. Section 3.3 discusses patent data, with an emphasis on the procedure used to match patents collected from the European Patent Office to firms in the LITE database. Section 3.4 describes the construction of the variables. Finally, Section 3.5 shows the main features of three datasets extracted from the LITE database for the empirical analysis of the following chapters. Some concluding remarks follow.

3.2. DESCRIPTION OF THE LITE DATABASE

3.2.1. Database content and variables definition

Table 3.1 displays all the variables currently available in the database [35]. These variables are sorted by category of information and alphabetically. All numeric variables, except the number of employees, the current exchange rates and the total number of patent applications, are in current prices and national currency.

Table 3.1
Variable list sorted by category of information (a)

Type   Description                                                    Data frequency     Data availability (b)

Firm identifier
Char.  Cusip number for firm                                          Last fiscal year   2676
Char.  Firm’s address                                                 Last fiscal year   2676
Char.  Firm’s trade name                                              Last fiscal year   2676
Char.  Ticker symbol for firm                                         Last fiscal year   2676

Financial information
Num.   Capital expenditures                                           Yearly             1207
Num.   Depreciation                                                   Yearly             1246
Num.   Firm’s sales by geographic segment                             Last fiscal year   2676
Num.   Firm’s sales by product segment                                Last fiscal year   2676
Num.   Gross property, plant and equipment                            Yearly             1371
Num.   Net property, plant and equipment                              Yearly             1333
Num.   Net sales                                                      Yearly             1436
Num.   Raw materials                                                  Yearly             662

Technological information
Num.   Investments in associated companies                            Yearly             720
Num.   Number of patent applications (c)                              Yearly             181
Num.   Patent applications by technological field (IPC-2 digits) (c)  Period 1978-1994   1618
Num.   Research & Development expenditures                            Yearly             1190

Supplementary information
Num.   Number of employees                                            Yearly             1634
Char.  Footnotes                                                      Last fiscal year   2676
Char.  Major industry group                                           Last fiscal year   2676
Num.   Major SIC code (4 digit)                                       Last fiscal year   2676
Num.   SIC codes (4 digit)                                            Last fiscal year   2554

Aggregated variables
Num.   Gross domestic expenditures on R&D (d)                         Yearly             29
Num.   Gross domestic product (d)                                     Yearly             29
Num.   Industry R&D (SIC-2 digits) (e)                                Yearly             15 (37)
Num.   Industry sales (SIC-2 digits) (e)                              Yearly             15 (37)
Num.   Industry value added (SIC-2 digits) (e)                        Yearly             15 (37)
Num.   GDP deflator (f)                                               Yearly             29
Num.   Investment deflator (f)                                        Yearly             29
Num.   Exchange rates (national currency/$) (f)                       Yearly             29

notes:
a) all variables come from Worldscope (1995) unless otherwise specified,
b) average number of firms for which information is available for each year of the period 1980-1995; for aggregated data, number of countries (and industry sectors),
c) European Patent Office (1996),
d) OECD’s STI database (1996a),
e) OECD’s ANBERD and STAN databases (1996b, 1996c),
f) OECD (1996d), National Accounts; Penn World Tables (1998); and Summers and Heston (1991).

[35] Additional financial information on assets, liabilities, income statement, funds flows statement, stock prices and per share data has been collected but not implemented in the database yet. Appendix A.3.1 lists this supplementary information.

Table 3.2 defines all variables present in the LITE database. For the most part, the definitions are based on the Worldscope data definition guide (1994), where more details can be found.

Table 3.2
Definitions of variables in the LITE database (a)

Variable: Definition

Capital expenditures: funds used to acquire fixed assets other than those associated with acquisitions.

Cusip number for firm: national security identification number for US and Canadian companies.

Depreciation: process of allocating the costs of a depreciable asset to the accounting periods covered during its expected useful life to a business. It is a non cash charge for use and obsolescence of an asset.

Firm’s address: location of the corporate offices of a company. It includes building of the company’s offices, street address, city, state, province, county or district, zip code and nation.

Firm’s sales by geographic segment: total revenues of a company by geographic segment.

Firm’s sales by product segment: total revenues of a company by product segment.

Firm’s trade name: legal name of the company.

Footnotes: allow the user to provide additional information about the variables.

Gross property, plant and equipment: tangible assets with an expected useful life of over one year which are expected to be used to produce goods for sale or for distribution of services.

Investments in associated companies: long term investments and advances in unconsolidated subsidiaries and affiliates in which the company has a business relationship or exercises control. It includes joint ventures.

Major industry group: firm’s assignment within its major industrial activity. The WORLDSCOPE database distinguishes 25 industry groups which are: aerospace; apparel; automotive; beverages; chemicals; construction; (general) diversified; drugs, cosmetics & health care; electrical; electronics; financial; food; machinery & equipment; metal producers; metal product manufacturers; oil, gas, coal & related services; paper; printing & publishing; recreation; retailers; textiles; tobacco; transportation and utilities.

Net property, plant and equipment: property, plant and equipment less accumulated reserves for depreciation, depletion and amortization.

Net sales: gross sales and other operating revenue less discounts, returns and allowances.

Number of employees: number of both full and part time employees of the company.

Number of patent applications: number of patents applied for in a given year.

Number of patent applications by patent classes: 118 technological classes according to the International Patent Classification.

Raw material: inventory of raw materials or supplies which indirectly or directly enter into the production of finished goods.

Research and development expenditures: all direct and indirect costs related to the creation and development of new processes, techniques, applications and products with commercial possibilities. These costs are usually classified into three main categories which are fundamental or basic research, applied research and development of new products, services or processes. These costs exclude the subsidies or contributions of Government, customers or other firms.

SIC code: standard industry classification which covers all economic activities.

Ticker symbol for firm: symbol used to identify the company on the stock exchanges where it is listed. For non US firms, Reuters’ symbols are used except for Japan where Quick Code is used.

note: a) Worldscope (1994)

3.2.2. Database structure, reliability and representativeness

Table 3.3 shows the data availability for each variable and for each year. Over the period 1986 to 1994, information is available for 69.1% of firms on average. Table 3.4 displays the sample breakdown by country and industry sector. The United States is the most represented country with 1190 firms out of 2676 (44.5% of the sample). The United Kingdom follows with 607 firms (22.7%). Japan ranks third with 340 firms (12.7%). Besides British firms, other European firms account for 14.9% of the sample (400 firms). Finally, the ‘rest of the world’ is mostly composed of Canadian and Australian firms (105 firms) and represents 5.2% of the sample. In terms of industry sectors, Table 3.4 indicates that Electronics is the most important industry with about a quarter of the sample (632 firms). Machinery and equipment, Chemicals and Drugs together represent another 28% of the sample.

One of the main features of the LITE database is its international dimension. Comparing firms from different countries raises issues that do not arise when only firms from the same country are considered. The most questionable element in such an exercise probably lies in the different accounting practices followed in the different countries.

Indeed, the typical multinational is itself an amalgam of several firms in several countries operating simultaneously under different sets of accounting conventions and tax reporting laws. Since data in the LITE database are based on firms’ consolidated accounts, it is not obvious how subsidiaries and affiliates of a parent company should be aggregated. For instance, factors like the terminology of financial statements, the legal system, the nature of corporate management, ownership and financing, or taxation might differ substantially from one country (firm, subsidiary) to another. Words like ‘short term’, ‘long term’, ‘current’, ‘operating’ or ‘extraordinary’ can be defined differently from company to company and across accounting systems. These differences can be reduced to a great extent by carefully examining the terminology used in reported financial information. This is precisely what the analysts of the Disclosure/Worldscope database do. For instance, because each Worldscope data item is precisely defined in a generic, standard way, any reported item which deviates from this definition is standardized to increase comparability.

Besides this data comparability problem, a second important question weakening the accuracy of any analysis concerns measurement errors in the data. This risk of mismeasurement is somewhat attenuated to the extent that most of the information comes from a single statistical source, the Disclosure/Worldscope database. Moreover, in order to ensure maximal consistency, all data items entered in this database are checked systematically by means of more than 600 computer tests.

Tables 3.5 and 3.6 show the representativeness of firms in their national economies in terms of net sales and R&D. Representativeness is low in the early 1980s and is higher in some countries, e.g. the four largest economies of the European Union, Japan and the United States, than in others, e.g. Australia and Belgium.

Table 3.3
Data availability: % of firms in the LITE database on which information is available

Variable                      1980  1981  1982  1983  1984  1985  1986  1987  1988  1989  1990  1991  1992  1993  1994  1995  Average
# employees                   10.1  10.9  12.8  14.5  22.5  96.5  97.6  95.7  93.8  91.4  89.9  88.6  85.7  85.4  77.7   3.9     61.1
Net sales                      8.5   9.5  11.0  12.4  17.1  64.6  68.7  75.9  81.6  83.5  86.4  87.4  85.2  85.3  77.7   3.9     53.7
Gross property, plant & equip. 7.8   8.5  10.2  11.4  14.8  57.9  62.7  71.4  77.1  79.6  83.7  85.2  83.5  84.3  77.5   3.9     51.2
Net property, plant & equip.   7.1   8.5  10.0  11.4  12.7  17.3  65.7  75.1  80.8  82.8  86.2  87.3  85.2  85.3  77.7   3.9     49.8
Depreciation                   8.2   9.0  10.8  12.1  16.6  55.9  58.0  65.5  70.4  71.7  74.6  76.0  73.9  73.7  64.8   3.9     46.6
Capital exp.                   8.1   9.0  10.6  12.0  16.3  55.8  57.9  65.6  73.1  71.6  70.8  70.4  68.6  68.5  59.9   3.8     45.1
R&D exp.                       6.8   7.4   9.0  10.2  13.5  43.8  45.9  52.0  59.0  65.3  75.1  79.4  79.8  83.1  77.6   3.9     44.5
Investm. in ass. companies     1.4   1.9   2.4   2.9   5.2  29.5  33.5  36.7  39.5  42.8  47.3  48.5  47.5  46.9  43.1   1.4     26.9
Raw materials                  0.0   0.0   0.0   0.0   0.1   1.5   4.3   9.9  16.1  40.0  64.8  66.7  65.2  65.6  58.2   3.4     24.7
Average                        6.4   7.2   8.5   9.7  13.2  47.0  54.9  60.9  65.7  69.9  75.4  76.6  75.0  75.3  68.2   3.6     44.8
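The 69.1% average availability quoted in Section 3.2.2 for 1986 to 1994 can be checked against the ‘average’ row of Table 3.3. The short Python sketch below is only an illustration of that arithmetic; the function name `mean_availability` is introduced here and is not part of the database.

```python
# Yearly "average" row of Table 3.3 (% of firms with data), 1980-1995.
years = list(range(1980, 1996))
average_row = [6.4, 7.2, 8.5, 9.7, 13.2, 47.0, 54.9, 60.9,
               65.7, 69.9, 75.4, 76.6, 75.0, 75.3, 68.2, 3.6]


def mean_availability(first_year, last_year):
    """Mean of the yearly averages over an inclusive range of years."""
    values = [v for y, v in zip(years, average_row)
              if first_year <= y <= last_year]
    return sum(values) / len(values)


# Sub-period quoted in the text, then the whole period.
print(round(mean_availability(1986, 1994), 1))  # 69.1
print(round(mean_availability(1980, 1995), 1))  # 44.8
```

The second figure reproduces the overall average reported in the bottom right cell of the table.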

Table 3.4

Geographic and sectorial breakdown of firms in the LITE database

Sector (a) / Country: US JP UK FR DE CA AU CH IT FI SE NL NO SK DK SA AT IR MA BE NZ CL SP GR BR LU HK SI TW TOTAL

Aerospace 22 2 9 4 2 1 1 41

Apparel 6 2 2 10

Automotive 35 27 10 8 9 1 2 2 3 2 2 101

Bev., food & tob. 24 23 14 2 1 3 2 3 5 6 1 2 2 4 2 1 1 96

Chemicals 78 89 24 6 14 7 5 5 2 1 2 5 3 3 2 2 1 1 2 1 1 1 1 1 1 258

Construction 25 82 22 3 3 3 3 1 1 2 1 3 1 2 2 154

Diversified 59 5 16 6 6 5 3 3 4 1 2 1 1 5 1 1 119

Drugs, & health 116 44 25 8 4 2 1 3 2 2 4 1 3 3 1 1 220

Electrical 33 29 15 4 5 1 1 3 3 1 2 97

Electronics 413 84 68 9 9 11 7 7 1 2 2 7 5 4 1 1 1 632

Machinery & equipm. 94 90 34 3 17 4 2 9 5 4 4 5 3 2 1 2 1 280

Metal producers 23 14 3 4 2 7 11 1 2 1 1 1 1 71

Metal products 34 42 17 6 3 2 5 1 2 2 1 1 2 118

Miscellaneous 128 26 46 12 2 7 1 1 4 1 1 1 1 1 1 1 234

Oil, gas, & coal 28 8 8 3 3 6 4 1 1 1 1 3 1 1 1 1 1 72

Paper, publ. & print. 28 14 9 1 4 1 3 7 2 1 2 2 1 1 76

Recreation 30 11 7 1 3 52

Retailers 2 1 2 5

Textiles 12 14 9 1 1 1 1 1 40

TOTAL 1190 607 340 80 80 60 45 40 36 31 22 22 22 20 17 15 10 10 7 4 4 3 3 2 2 1 1 1 1 2676

note: a) see Worldscope (1995) for a definition

Table 3.5
Representativeness of the LITE database: net sales as percent of GDP (a)

Country           1981  1982  1983  1984  1985  1986  1987  1988  1989  1990  1991  1992  1993  1994
Australia          0.1   0.7   0.6   0.8  11.8  13.2  13.8  14.7  15.6  17.6  17.8  16.5  16.9  16.8
Austria            0.5   0.7   4.9   4.7  10.6   7.3  10.1   7.7   7.1   8.6   6.9   7.4   6.0   5.2
Belgium            0.0   0.0   0.0   0.0   5.7   5.2   5.3   5.6   5.5   5.1   4.8   4.6   4.3   4.2
Canada             3.1   3.2   3.0   2.5  11.4  10.4  10.5   9.5   8.9   9.1   8.6   8.9   9.0   9.7
Denmark            1.6   1.8   1.8   2.2   5.2   5.0   4.8   5.1   4.6   5.9   5.9   6.0   6.0   5.3
Finland            4.6   5.0   5.2   6.1  44.5  42.6  43.1  44.4  46.4  47.1  46.5  52.5  57.7  53.6
France             1.8   1.5   2.5   8.4  24.3  21.9  23.9  25.9  28.5  27.5  28.1  27.0  25.3  19.1
Germany            1.0   1.5   1.7   8.0  36.6  32.8  32.3  33.8  34.6  33.1  30.4  28.5  26.6  21.5
Greece             0.0   0.0   0.0   0.0   0.1   0.1   0.1   0.2   0.3   0.3   0.4   0.4   0.5   1.0
Ireland            0.0   0.2   0.2   1.6   7.5   7.8   8.9  10.5  12.1  13.3  11.5  12.3  12.9  12.2
Italy              0.2   0.3   0.4   4.6   5.0   5.1   7.0   9.9   9.4  12.8  12.0  11.9  11.8   1.0
Japan              0.1   0.2   0.2   0.3  28.0  30.0  26.2  27.3  30.2  32.2  34.1  34.5  33.5  31.9
Netherlands        1.3   1.5   1.5   2.2  81.0  57.9  54.8  55.2  58.9  57.2  55.6  51.5  51.4  49.6
Norway             2.5   2.6   2.7   3.9  25.3  26.7  27.3  25.7  26.9  27.3  27.4  26.7  26.9  27.0
Spain              2.9   2.6   2.7   2.9   2.3   1.9   0.1   0.1   0.0   0.0   0.0   0.0   0.0   0.0
Sweden             2.6   5.8   5.8   5.9  35.5  34.6  35.9  36.3  36.0  32.0  27.9  28.9  35.2  40.4
Switzerland        0.3   0.3   0.3   0.7  44.6  38.3  38.3  41.6  45.1  40.9  41.6  43.8  45.0  41.6
United Kingdom     2.2   3.7   3.8   6.5  46.4  47.1  46.9  47.7  51.3  50.6  47.3  45.0  46.1  43.5
United States      4.9   4.7   4.7   5.3  35.0  32.5  32.8  33.8  33.1  34.1  32.7  32.0  31.5  31.4

note: a) OECD (1996c) and LITE database

Table 3.6
Representativeness of the LITE database: R&D expenses as percent of total domestic R&D expenditures (a)

1981 1982 1983 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994

Australia 0.0 0.0 1.5 4.5 5.9 8.6
Austria 0.0 0.0 0.0 0.0 5.0 4.7 4.7 3.7 4.4 6.1 6.4 9.5 7.6 7.5
Belgium 0.0 0.0 10.9 11.5 11.8 13.2 13.0 11.4
Canada 2.1 2.1 0.6 0.6 14.5 15.5 18.6 17.0 16.7 20.7 19.2 20.4 20.4 23.1
Denmark 0.0 2.3 2.6 3.2 12.3 11.3 11.0 12.5 13.5 15.9 15.9
Finland 0.0 3.4 5.9 31.3 31.3 36.8 40.7 40.2 37.7 33.3 37.6 43.0
France 0.2 0.6 0.9 18.4 45.3 43.2 40.6 40.8 38.6 36.4 44.9 44.3 42.6
Germany 0.2 1.1 2.1 6.8 42.7 43.3 46.0 55.5 58.0 65.4 63.4 60.5 56.5
Greece 0.0 0.0 0.0 0.0 0.0 0.0 0.3 1.6 1.8
Ireland 0.0 0.0 0.0 0.0 0.3 1.7 2.7 3.0 5.5 5.4 5.7 6.7
Italy 1.2 2.0 2.3 16.9 15.6 16.2 20.8 27.5 27.8 35.6 33.9 27.7 28.4 2.8
Japan 0.2 0.2 0.2 0.3 22.7 29.9 28.1 30.0 33.0 33.9 36.7 39.4 40.0
Netherlands 0.2 0.2 0.2 0.9 70.9 66.2 64.5 73.9 84.7 82.0 78.4 74.9
Norway 1.8 2.2 5.3 8.1 16.4 20.1 21.4 22.1 26.7
Spain 0.0 0.0 0.0 0.0 0.3 1.1 0.3 0.2 0.0 0.0 0.0 0.0 0.1 0.1
Sweden 3.9 13.3 49.1 47.6 55.3 54.4 60.0
Switzerland 0.9 1.0 57.4 72.9 91.1
United Kingdom 1.6 3.0 25.5 26.9 31.4 31.4 45.2 52.1 47.7 49.4 48.2
United States 4.5 4.9 4.9 5.7 41.2 40.9 41.0 42.9 43.7 44.2 45.3 45.5 46.7 47.3

note: a) OECD (1996b) and LITE database

3.3. PATENTS

3.3.1. Source, strengths and weaknesses of patent data

Besides the Worldscope/Disclosure database, the European Patent Office (EPO) is the second major source of information. This patent institution has existed since 1978. A single patent application provides protection in up to 17 European countries. In 1993, more than 53,000 European applications were filed. Firm patent applications across technological fields, according to the International Patent Classification (IPC), have been collected for the entire period from 1978 to 1994 from the ESPACE-BULLETIN database published by the EPO and available on CD-ROM. Furthermore, for a subsample of firms in the LITE database, the yearly number of patent applications has also been compiled. It should be noted that although all 2676 firms in the LITE database reported R&D expenditures during the period under investigation, no fewer than 1058 firms (mainly non European firms) did not apply for any patent at the EPO over the period 1978-1994. Because of this ‘zero issue’, it has not been possible to build for these firms the index of technological closeness that is associated with the construction of the spillover variables [36]. Hence, for about 39.5% of firms in the LITE database, the spillover variable could not be constructed.

Among the several indicators of S&T activities available to economists, patent statistics have probably been the most used. Through several economic studies related to the measurement of technological activities, patent statistics have proved their economic meaningfulness [37]. Nevertheless, like other technological indicators, patent statistics have their own weaknesses. An important drawback of simple patent counts is that they give the same weight to every patent, whereas the technical content as well as the intrinsic economic value of a patent may vary widely. Not all inventions are patented, nor are all patentable, and other methods of appropriating an innovation, such as industrial secrecy, may be preferred. The propensity to patent may change substantially over time and across countries, not to mention among technological sectors. For example, it is generally recognized that the propensity to patent is high in sectors such as machinery or chemicals but very low in aerospace and in software, since in the latter industries innovations are easier to copy.

[36] See Section 6.3.2 of Chapter 6 for more details regarding this variable.

[37] For the relevance of patent statistics as an indicator of Science and Technology activities, see for instance Bound et al. (1984), Basberg (1987), Glismann and Horn (1988), Griliches (1990) or Archibugi and Pianta (1992).

Most studies consider patent statistics coming from the US Patent Office. This office has often been described as the most adequate since the United States is the most important market for inventions at the international level. Yet, an important methodological issue when using patent statistics as a technological indicator is their comparability at the international level. For instance, patenting regulations differ among national and international patent offices and over time, making comparisons more difficult. This comparability issue arises in the thesis since the analysis is based on the European Patent Office, that is, European patent data are also used for non European firms. Indeed, aggregate data suggest that American and Japanese firms apply for and obtain far fewer patents from the European Patent Office than from their own domestic patent offices. Using European patents to infer the technological performance or technological proximity of American and Japanese firms may therefore give a distorted and incomplete picture. It would thus be interesting to look at other patent offices and to see how much we are ‘missing’ by considering European patents only. Yet, in the absence of a ‘global’ patent office, we have no choice but to use national or regional patent data.

On the other hand, ‘foreign patents’ are sometimes viewed as better indicators of technological activities. Archibugi and Pianta (1992), discussing a ‘domestic’ market effect [38], showed that using patent statistics from different patenting institutions to measure a country’s technological pattern can lead to substantially different conclusions. The authors computed correlation coefficients between indexes of technological specialization of OECD countries based on patents at the US Patent Office versus other patenting institutions. Their main conclusion is that “domestic patenting is an unreliable indicator of a country’s specialization, as it is distorted by a large number of inventions of lesser significance, which are not extended abroad, and are aimed only at protecting the domestic market from foreign competition”. For Basberg (1987) too, foreign patents are, on average, expected to be of higher quality than domestic patents to the extent that “it is reasonable to assume that only inventions with significant profit expectations in a larger market will be patented abroad because of time and costs involved in such processes”.

[38] i.e. an American inventor will tend to apply for more patents at the US Patent Office, rather than at a foreign patent institution like the EPO.

Hence, one advantage of EPO data is that they describe the technological pattern of US firms more accurately than US Patent Office data do, because of the domestic market effect discussed above. By the same argument, however, care should be taken when interpreting the technological pattern of European firms. Another advantage of European patent statistics is that they are classified by date of application rather than by date of issue. According to Jaffe (1986) and Tong and Frame (1994), patents classified by date of application are preferable because they reflect the moment when a firm realizes it has generated an innovation and because of the long lags in the patent application process [39].

3.3.2. Matching of patents to firms and IPC classes

A major task in assembling the LITE database has been the matching of patents from the ESPACE-BULLETIN database to firms of the WORLDSCOPE/DISCLOSURE database. Two difficulties have been encountered. First, patents are assigned to firms on the basis of their names, which are not always the same in the EPO and Worldscope data sources (for instance ‘Co.’ instead of ‘Corporation’, ‘Incorporated’ or ‘INC.’ and other such changes or abbreviations). For Japanese firms, this issue is even more severe since their patents are sometimes taken out under the official English name, e.g. “Honda Motor Co. LTD.”, and sometimes under the Romanized transliteration of the Japanese name, e.g. “Honda Giken Kogyo Kabushiki Kaisha”.

Second, many large firms have several R&D performing subsidiaries in several countries and it is not straightforward to link the patents applied for by these subsidiaries to the parent company. Ideally, one would have a ‘mapping’ of parent companies to their subsidiaries and affiliates. However, it is not easy to construct an accurate mapping since, by its nature, this mapping changes over time through mergers and acquisitions.

Thanks to the software provided with the ESPACE-BULLETIN database, it has been possible to minimize these issues to a great extent. In a first step, patents were assigned to firms on the basis of their generic name. For instance, for “E.I. Du Pont De Nemours and Company”, a search for the words ‘Du Pont’ retrieved 4599 patent documents (from 1978 up to 1994). A closer examination of the full applicant names reported in these documents showed that 4191 patents were assigned to “E.I. Du Pont De Nemours and Company” while 123 documents were attributed to “Du Pont Canada”, 120 to “Du Pont Merck Pharmaceutical Company”, 77 to “Du Pont Deutschland”, 21 to “Du Pont UK” and 67 to other inventors [40]. These last companies (except “Du Pont Merck Pharmaceutical Company”) are clearly foreign subsidiaries of the US parent company. Hence, the patent applications of these firms have been consolidated with the ones filed by “E.I. Du Pont De Nemours and Company”.

[39] On average, according to the EPO, it takes just over three years between the filing of the patent application and the patent grant.

In a second step, this procedure was repeated for each firm of the sample. For about four fifths of the sample, there was only one firm name in the retrieved documents. For the rest, firm names which could be identified without any doubt as subsidiaries were included in the matching process of generic names.
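The consolidation step can be sketched in a few lines of Python using the Du Pont figures quoted above. This is only an illustration of the logic, not the ESPACE-BULLETIN software itself: the dictionary `parent_of` is a hypothetical hand-built mapping, and applicants absent from it (the joint venture with Merck and the individual inventors named Du Pont) are simply left out.

```python
from collections import Counter

PARENT = "E.I. Du Pont De Nemours and Company"

# Patent documents retrieved for the search term 'Du Pont', 1978-1994
# (counts taken from the text).
retrieved = {
    "E.I. Du Pont De Nemours and Company": 4191,
    "Du Pont Canada": 123,
    "Du Pont Merck Pharmaceutical Company": 120,
    "Du Pont Deutschland": 77,
    "Du Pont UK": 21,
    "other applicants": 67,
}

# Hypothetical mapping: applicant names identified without doubt as the
# parent itself or one of its subsidiaries.
parent_of = {
    "E.I. Du Pont De Nemours and Company": PARENT,
    "Du Pont Canada": PARENT,
    "Du Pont Deutschland": PARENT,
    "Du Pont UK": PARENT,
}


def consolidate(counts, mapping):
    """Sum patent counts by parent firm, skipping unmapped applicants."""
    totals = Counter()
    for applicant, n in counts.items():
        parent = mapping.get(applicant)
        if parent is not None:
            totals[parent] += n
    return totals


totals = consolidate(retrieved, parent_of)
print(sum(retrieved.values()))  # 4599 documents retrieved
print(totals[PARENT])           # 4412 patents consolidated under the parent
```

Keeping the mapping explicit makes it easy to revise when a subsidiary is acquired or divested, which is the difficulty raised in the text.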

Thanks to the classification of patent data by patent fields or technological classes, it is possible to measure the technological proximity between firms by characterizing their positions in the technological space. The two digit International Patent Classification (118 classes) allows one to identify the technological fields of patent applications. In order to ease calculations, the 118 IPC classes were grouped into 50 broader classes (see appendix A.3.2).

The methodological framework for constructing the technological proximities of firms is the subject of Chapter 6. Appendix A.3.3 gives a picture of how patents by IPC field are distributed and concentrated (Herfindahl index) across geographic areas as well as industry sectors.
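For reference, the Herfindahl index mentioned above is the sum of squared shares. The sketch below applies it to a vector of patent counts by technological class; the counts are made up for illustration and the function name `herfindahl` is an assumption, not something taken from Appendix A.3.3.

```python
def herfindahl(counts):
    """Herfindahl concentration index of a list of patent counts by class.

    Returns the sum of squared shares: 1.0 when all patents fall in a
    single class, 1/n when they are spread evenly over n classes.
    """
    total = sum(counts)
    if total == 0:
        # A firm with no patent application has no defined distribution.
        raise ValueError("no patents: the index is undefined")
    return sum((c / total) ** 2 for c in counts)


# Hypothetical firm with 100 applications spread over three classes.
print(herfindahl([50, 30, 20]))

# Even spread over the 50 broad classes gives the minimum value 1/50.
print(herfindahl([1] * 50))
```

The `ValueError` branch mirrors the ‘zero issue’ discussed in Section 3.3.1: for firms without any EPO application, measures based on the distribution of patents across classes cannot be computed.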

3.4. CONSTRUCTION OF VARIABLES

Several variables have been constructed for the purpose of the subsequent empirical analysis. These variables are the firm’s own R&D capital, the R&D intensity, the stock of physical capital and the R&D spillover stock available to the firm. Furthermore, in order to proxy industry and geographic specific effects as well as technological opportunity factors, several sets of dummy variables have also been constructed. Finally, in order to allow for a comparison of all these variables across industries, countries and over time, several deflators have been considered and all nominal variables have been converted into constant 1990 dollars. The purpose of this section is to describe the construction of these variables.
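One possible way to carry out the conversion into constant 1990 dollars is sketched below. The chapter does not spell out the exact procedure, so the order of operations is an assumption: the nominal value in national currency is first deflated with a price index normalised to 1.0 in 1990, and the result is then converted at the 1990 exchange rate (national currency per dollar). The function name and the figures are hypothetical.

```python
def to_constant_1990_usd(nominal_local, deflator_1990_base, xrate_1990):
    """Convert a nominal value in national currency to constant 1990 dollars.

    nominal_local      -- value in current prices and national currency
    deflator_1990_base -- price deflator for the same year, 1990 = 1.0
    xrate_1990         -- 1990 exchange rate, national currency per dollar
    """
    real_local = nominal_local / deflator_1990_base  # 1990 prices, local currency
    return real_local / xrate_1990                   # 1990 prices, US dollars


# Hypothetical firm: sales of 1200 (national currency) in a year when prices
# stand 20% above their 1990 level, with 2 units of currency per dollar in 1990.
print(to_constant_1990_usd(1200.0, 1.2, 2.0))
```

Using a fixed 1990 exchange rate keeps year-to-year currency movements out of the real series; converting at current exchange rates and then applying a US deflator would be an alternative design.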

[40] These applicants are, for the most part, individuals whose last name is ‘Du Pont’.

(i) firms’ R&D and physical capitals

The stock of R&D capital has been built using the perpetual inventory method originally proposed by Griliches (1979), which is the method most commonly used for constructing a firm’s knowledge capital. It assumes that the current state of knowledge is the result of present and past R&D expenditures:

\[
\begin{aligned}
K_{it} &= (1-\delta)\,K_{i,t-1} + R_{it} \\
       &= R_{it} + (1-\delta)\,R_{i,t-1} + (1-\delta)^{2}\,R_{i,t-2} + \dots \\
       &= \sum_{\tau=0}^{\infty} (1-\delta)^{\tau}\,R_{i,t-\tau}
\end{aligned}
\tag{3.1}
\]

where $K_{it}$ is the knowledge capital or own R&D stock of firm $i$ at time $t$, $R_{it}$ is the R&D expenditure, and $\delta$ represents the rate of depreciation of the knowledge capital.

This formulation raises at least three questions. First, we have very little idea about the magnitude of the depreciation rate, or about whether it should be constant across firms and time periods. Hence, it is not clear which value to retain. Second, since the available history of R&D is usually not very long, we need a way to construct the initial knowledge stock. Finally, constructing the knowledge stock as in equation 3.1 supposes a particular distribution of the R&D effects over time. Regarding the value of the depreciation rate, Bosworth (1978) has estimated, on the basis of patent renewal data, a value ranging from 0.10 to 0.15. Indeed, most studies assume a depreciation rate of 15%. Moreover, several authors (for instance Griliches and Mairesse, 1983, 1984; Hall and Mairesse, 1995) have experimented with different values of δ and report very small changes, if any, in the estimated effects of R&D capital [41]. The initial knowledge capital is constructed as in equation 3.1, assuming a growth rate of presample R&D equal to g:

\[
K_{i0} = R_{i0} \sum_{\tau=0}^{\infty} \left( \frac{1-\delta}{1+g} \right)^{\tau} \approx \frac{R_{i0}}{g+\delta}
\tag{3.2}
\]

41 This arises from the log-log functional form of the Cobb-Douglas function used in these studies. Indeed, the log of K varies as the log of R in the cross section when the depreciation rate and growth rate are roughly constant over time at the firm level. In that case, $\log K_{it} \approx \log [R_{it}/(g+\delta)] = \log R_{it} - c$, where $c = \log (g+\delta)$. This will not be true if one does not take the log of K.


Here also, a presample growth rate of 5% is usually assumed. As Hall and Mairesse (1995) point out, the precise choice of growth rate only affects the initial stock, which in turn declines in importance as time passes. Regarding the timing of R&D effects, R&D activities are not expected to have an immediate impact on firms’ economic performance. Pakes and Schankerman (1984), for instance, find a mean gestation lag of R&D of between 1.2 and 2.5 years. Ravenscraft and Scherer (1982) estimate a longer lag of about 4 to 6 years.
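The construction described above can be sketched in a few lines of Python. This is only an illustration under the stated assumptions (δ = 15%, g = 5%); the function name `rd_capital_stock`, the flag `exact_initial` and the example flows are hypothetical and do not come from the LITE database.

```python
def rd_capital_stock(rd_flows, delta=0.15, g=0.05, exact_initial=True):
    """Return the knowledge stocks K_0, ..., K_T implied by R&D flows R_0, ..., R_T."""
    if not rd_flows:
        return []
    # Initial stock: presample R&D is assumed to have grown at rate g forever,
    # so the infinite sum of equation 3.1 becomes a geometric series.
    if exact_initial:
        initial = rd_flows[0] * (1.0 + g) / (g + delta)
    else:
        # Common shortcut R_0 / (g + delta), close to the exact value for small g.
        initial = rd_flows[0] / (g + delta)
    stocks = [initial]
    # Recursion of equation 3.1: K_t = (1 - delta) * K_{t-1} + R_t.
    for flow in rd_flows[1:]:
        stocks.append((1.0 - delta) * stocks[-1] + flow)
    return stocks


# Hypothetical firm whose R&D grows by exactly 5% a year.
example_flows = [100.0, 105.0, 110.25]
example_stocks = rd_capital_stock(example_flows)
```

When R&D keeps growing at the assumed presample rate, the stock grows at that same rate, so the ratio of the stock to the flow stays constant over time. Any other growth path makes the weight of the assumed initial stock shrink by a factor (1 − δ) each year.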

In order to get around the issues associated with the construction of the R&D stock of knowledge, an alternative approach, suggested by Griliches (1973) and Terleckyj (1974), is sometimes used. This approach directly estimates the rate of return to R&D instead of its elasticity. To this end, the firm’s own R&D capital is replaced by the firm’s R&D intensity, measured as the ratio of R&D expenditures to the firm’s output, i.e. net sales or value added.
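The link between the two specifications can be sketched as follows, under assumptions that are not spelled out in the text: a Cobb-Douglas production function in which output $Y_{it}$ has an elasticity $\gamma$ with respect to R&D capital, a marginal product of R&D capital $\rho = \partial Y / \partial K$ so that $\gamma = \rho\,K_{it}/Y_{it}$, and a depreciation rate $\delta$ close to zero. The symbols $Y$, $\gamma$ and $\rho$ are introduced here for illustration only.

$$
\gamma\,\Delta \log K_{it}
= \rho\,\frac{K_{it}}{Y_{it}}\,\Delta \log K_{it}
\approx \rho\,\frac{\Delta K_{it}}{Y_{it}}
= \rho\,\frac{R_{it} - \delta K_{i,t-1}}{Y_{it}}
\approx \rho\,\frac{R_{it}}{Y_{it}}
$$

Under these assumptions, the coefficient on R&D intensity in a growth-rate regression can be read as a gross rate of return $\rho$ rather than as an elasticity, and no R&D stock needs to be constructed.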

The capital stock measure corresponds to the net property, plant and equipment of firms. ‘Net’ means that accumulated reserves for depreciation, depletion and amortization have been deducted. Information on annual capital expenditures is available as well. Hence, it is in principle possible to construct a capital stock according to the perpetual inventory method. However, this approach requires knowledge of the rates of depreciation of physical capital, which vary across firms and over time. Since this information is unavailable and capital expenditures are missing for some firms and years, this approach has been used only to assess the sensitivity of the results to this alternative construction of physical capital.

(ii) geographic areas, industry and technological sectors, and dummy variables

In order to pick up market factors or industry specific effects as well as technological opportunity and geographic effects, three sets of dummy variables have been constructed by assigning each firm to its main industrial sector, its technological cluster and the geographic area in which the firm is domiciled. As far as the industry dummies are concerned, the industry sectors corresponding to these variables have been chosen so as to allow for a concordance between the SIC (the classification retained in the Worldscope database) and the ISIC (the classification retained in OECD databases, see Appendix A.3.4). The procedure allocating firms to technological clusters is described in Chapter 6.

More specifically, the market dummies (MD), the technological dummies (TD), and the geographic dummies (GD) have been constructed as follows:

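$$
MD_i = \begin{cases} 1 & \text{if } i \in M \\ 0 & \text{else} \end{cases} \qquad (3.3)
$$

$$
TD_i = \begin{cases} 1 & \text{if } i \in T \\ 0 & \text{else} \end{cases} \qquad (3.4)
$$

$$
GD_i = \begin{cases} 1 & \text{if } i \in G1 \\ 0 & \text{else} \end{cases} \qquad (3.5)
$$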

where: M = industry sector {AIRC, CHEM, COMP, CONS, DRUG, ELEC, ETRO, FABR, FOOD, INST, MACH, META, MISC, OTHE, PAPE, PETR, RUBB, SOFT, STON, TEXT, TRAN, VEHI, WOOD}⁴²,

T = technological cluster {CT1,...,CT18}⁴³, and

G1 = geographic region {Australia, Austria, Belgium, Canada, Chile, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Japan, Luxembourg, Malaysia, Netherlands, New Zealand, Norway, Singapore, South Africa, South Korea, Spain, Sweden, Switzerland, Taiwan, United Kingdom, United States}
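As a minimal sketch of equations 3.3 to 3.5, the dummies can be generated from a firm-to-category assignment. The function `make_dummies` and the firm identifiers are hypothetical; only the sector codes are taken from the list above.

```python
def make_dummies(assignments, categories):
    """Map each firm to a dict of 0/1 indicators, one per category."""
    return {
        firm: {category: int(value == category) for category in categories}
        for firm, value in assignments.items()
    }


# Hypothetical firms assigned to three of the industry sectors listed above.
sector_of_firm = {"firm_a": "CHEM", "firm_b": "DRUG", "firm_c": "CHEM"}
industry_dummies = make_dummies(sector_of_firm, ["CHEM", "DRUG", "FOOD"])
```

In a regression with an intercept, one category per set is normally dropped to avoid perfect collinearity among the dummies.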

(iii) R&D spillovers

Several approaches have been developed in the literature in order to measure the potential pool of R&D spillovers. The alternative ways of formalizing these approaches, as well as a discussion of their strengths and drawbacks, are provided in Section 6.2.1 of Chapter 6. To sum up, these approaches assume that the R&D spillover stock for a firm can be proxied by an unweighted or a weighted sum of the R&D undertaken by all other firms. In the second case, the weights represent a measure of ‘technological proximity’, that is, the degree of similarity of R&D activities between each pair of firms. Furthermore, it is possible to ‘split’ the total stock of spillovers into different components in order to investigate the impact on firms’ technological and economic performance of spillovers generated by firms belonging to the same industry sector, technological field of specialization or geographic area.

42 See Appendix A.3.4 for a definition.

In the present work, two main approaches have been considered. In the first approach, the spillover variable is constructed as the manufacturing sector-based amount of R&D reported in the ANBERD database less the firm’s own R&D investment. It should be noted that this approach gives an identical weight to the R&D of all other firms operating in the same industry sector and that it only considers intra-sector spillovers. Jaffe (1986, 1988) developed a more sophisticated methodology in which the R&D spillovers are constructed by positioning the firms in the technological space. The distribution of each firm’s patents over patent classes is used to characterize its position in the technological space. The closer two firms are in such a space, the larger the potential spillovers between them are assumed to be. This second approach has been implemented and further developed in Chapter 6.

More specifically, the total unweighted and weighted stocks of R&D spillovers (TUS and TS respectively) have been computed as follows:

$$
TUS_i = R_{\_i} - R_i \qquad (3.6)
$$

and

$$
TS_i = \sum_{j \neq i} P_{ij}\,K_j \qquad (3.7)
$$

where: i and j are firms of the sample,

$R_{\_i}$ is the total amount of R&D performed in firm i’s industry,

$R_i$ is firm i’s R&D expenditure,

$P_{ij}$ is the technological proximity between firms i and j, and

$K_j$ is firm j’s own R&D capital.
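Equations 3.6 and 3.7 can be illustrated with the short sketch below. It assumes that the technological proximity is the uncentered correlation between two firms’ patent-share vectors, as proposed by Jaffe (1986); the exact definition retained in the thesis is given in Chapter 6. All function names and the three-firm data are hypothetical.

```python
import math


def proximity(shares_i, shares_j):
    """Uncentered correlation between two patent-share vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(shares_i, shares_j))
    norm_i = math.sqrt(sum(a * a for a in shares_i))
    norm_j = math.sqrt(sum(b * b for b in shares_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # a firm without patents has no defined position
    return dot / (norm_i * norm_j)


def total_spillover(firm, patent_shares, rd_capital):
    """Equation 3.7: sum over all other firms j of P_ij * K_j."""
    return sum(
        proximity(patent_shares[firm], patent_shares[other]) * rd_capital[other]
        for other in patent_shares
        if other != firm
    )


def unweighted_spillover(industry_rd_total, own_rd):
    """Equation 3.6: industry R&D total minus the firm's own R&D."""
    return industry_rd_total - own_rd


# Hypothetical patent shares over two patent classes and R&D capital stocks.
patent_shares = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.0, 1.0]}
rd_capital = {"a": 10.0, "b": 20.0, "c": 30.0}
ts_a = total_spillover("a", patent_shares, rd_capital)
```

Firm "c" contributes nothing to the spillover pool of firm "a" in this example because the two firms patent in disjoint classes, so their proximity is zero.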

The different components of the total stock of R&D spillovers, i.e. the local national stock (LNS), the local international stock (LIS), the external national stock (ENS) and the external international stock (EIS), have been computed in the same manner as in equation 3.7,⁴⁴ with the sum restricted as follows:

LNS: i and j ∈ G2, i and j ∈ T,

LIS: i ∈ G2, j ∉ G2, i and j ∈ T,

ENS: i and j ∈ G2, i ∈ T, j ∉ T,

EIS: i ∈ G2, j ∉ G2, i ∈ T, j ∉ T.

43 See Section 6.3.3 of Chapter 6 for a definition.

44 By local national stock, we mean the spillovers generated by firms which are specialized in similar technological activities and which are located in the same country as the recipient firm.


where: G2 = geographic area {Europe, Japan, United States, Rest of the World}⁴⁵, and

T = technological cluster {CT1,...,CT18}
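The split into four components can be sketched as follows: each other firm j is classified as local or external according to its technological cluster, and as national or international according to its geographic area, and its weighted R&D capital is added to the matching component. The function `split_spillovers`, the field names and the five-firm example are hypothetical, and a constant proximity of one is used only to keep the example easy to check.

```python
def split_spillovers(firm, firms, proximity):
    """Split the weighted spillover pool of `firm` into LNS, LIS, ENS and EIS."""
    parts = {"LNS": 0.0, "LIS": 0.0, "ENS": 0.0, "EIS": 0.0}
    own = firms[firm]
    for other_id, other in firms.items():
        if other_id == firm:
            continue
        local = other["cluster"] == own["cluster"]  # same technological cluster
        national = other["area"] == own["area"]     # same geographic area
        key = ("L" if local else "E") + ("N" if national else "I") + "S"
        parts[key] += proximity(firm, other_id) * other["K"]
    return parts


# Hypothetical firms; K is the R&D capital stock.
firms = {
    "f1": {"area": "Europe", "cluster": "CT1", "K": 10.0},
    "f2": {"area": "Europe", "cluster": "CT1", "K": 20.0},
    "f3": {"area": "Japan", "cluster": "CT1", "K": 30.0},
    "f4": {"area": "Europe", "cluster": "CT2", "K": 40.0},
    "f5": {"area": "Japan", "cluster": "CT2", "K": 50.0},
}
components = split_spillovers("f1", firms, lambda i, j: 1.0)
```

By construction the four components add up to the total stock of equation 3.7, since every other firm falls into exactly one of the four cells.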

(iv) common currency and deflators

In order to allow for comparison, all variables have been converted to constant 1990 dollars. R&D expenditures have been deflated using the GDP deflators of the respective countries, while the deflator of physical capital has been used for the capital stock. Regarding net sales, four different deflators have been considered. The first three are the price index of GDP, the price index of total manufacturing value added, and price indexes at the industry level (ISIC-2 digits). However, a substantial number of firms in the sample have more than one product line at the SIC two-digit level and are multinational, that is, a large share of their sales is made outside the domestic market. Since these firms are ‘multiproduct’ and have subsidiaries in several countries, domestic output price indexes at the two-digit industry level may not be a relevant basis for deflating net sales. Instead, since the shares of sales made in the home country and abroad are available, a fourth, more general price index has been computed using the following formula:

$$
WPI_i = d_i\,PI_i + (1-d_i) \sum_{j \neq i}^{n} w_j\,PI_j
\quad \text{and} \quad
w_j = \frac{VA_j}{\sum_{k \neq i}^{n} VA_k}
\qquad (3.8)
$$

where: $WPI_i$ is the weighted manufacturing price index used to deflate the net sales of firm i,

$d_i$ is the share of sales of firm i made in its home country,

$PI_i$ is the price index of value added for the whole manufacturing sector in firm i’s home country, and $PI_j$ is the corresponding index for country j,

j = 1,...,n, where n is the number of countries considered in the LITE database and the sums run over the (n-1) countries other than firm i’s home country,

$w_j$ is the share of country j in the total manufacturing value added (VA) of these (n-1) countries.
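A small numerical sketch of equation 3.8 is given below. The function `weighted_price_index`, the three countries and all figures are hypothetical and serve only to show how the domestic and foreign components are combined.

```python
def weighted_price_index(home, domestic_share, price_index, value_added):
    """Equation 3.8: domestic index plus a value-added-weighted foreign index."""
    foreign = [country for country in price_index if country != home]
    total_va = sum(value_added[country] for country in foreign)
    # w_j: share of country j in the value added of all countries except home.
    foreign_index = sum(
        value_added[country] / total_va * price_index[country]
        for country in foreign
    )
    return domestic_share * price_index[home] + (1.0 - domestic_share) * foreign_index


# Hypothetical manufacturing price indexes (base year = 1.00) and value added.
price_index = {"A": 1.10, "B": 1.00, "C": 1.20}
value_added = {"A": 50.0, "B": 30.0, "C": 10.0}
wpi = weighted_price_index("A", 0.6, price_index, value_added)
```

With 60% of sales made at home, the index is a blend of the home index (1.10) and a foreign index in which country B weighs three times as much as country C.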

Figure 3.1 depicts the industry sector based distribution of the share of sales performed in the firms’ home country.⁴⁶ On the whole, firms operating in the textile and aircraft industries appear to be more oriented towards their domestic market while firms selling electronic equipment and petroleum products are more ‘open’ to foreign trade.

45 Regarding the split of R&D spillovers into national and international components, a second set of variables has been constructed by considering European countries as specific geographic regions.

46 Firms of the S625 dataset described in Section 3.5.2.

Figure 3.1

Industry sector based distribution of the share of sales performed in the firms’ home country

[Bar chart: share of sales performed in the home country (axis from 0 to 100) by industry sector. Sectors shown: Petroleum, Electronics, Fabricated metal, Drugs, Chemicals, Food, Software, Instruments, Stone, Machinery, Computer, Rubber, Motor vehicles, Primary metal, Paper, Electrical, Aircraft, Other, Textile]

source: sample S625

Regarding the geographic distribution of net sales, the average share of sales made by Swedish firms in their domestic market is 13.7%, while for Japanese firms this figure is 79.6%. For Europe, the Rest of the world and the United States, the share is 46.2%, 49% and 63.2% respectively. Hence, Japan and the United States appear to be relatively closed economies, unlike the European countries, which are smaller in size, population and total output. As a result, the revenues of European firms are more likely to come from markets outside the home country.

Finally, Table 3.7 shows the sampling distribution of the shares of sales made in the home country according to firm size as measured by net sales. The main conclusion is that large multinational firms sell more abroad than smaller firms: the home-country share falls from 81.2% in the first size decile to 57.6% in the tenth.

Table 3.7

Sampling distribution (according to firm size as measured by sales) of the shares of sales performed in the home country ($d_i$)

| deciles (small firms ---> large firms) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $d_i$ | 81.2 | 77.2 | 80.5 | 73.2 | 76.3 | 71.8 | 68.6 | 61.7 | 65.2 | 57.6 |

source: sample S625


(v) list of variables used in the empirical analysis

Table 3.8 lists all the constructed variables and indicates in which chapter they are used.

Table 3.8

List of constructed variables (a)

| Variable name | Description | Data frequency | Data availability (b) | Data used in chapter (4, 5, 6) |
|---|---|---|---|---|
| aggregated variables | | | | |
| D1 | GDP deflator | Yearly | 15 | X X X |
| D2 | Total manufacturing sector added value deflator | Yearly | 15 | X |
| D3 | Industry value added deflators (SIC-2 digits) | Yearly | 15 (20) | X |
| D4 | Weighted manufacturing sector value added deflator | Yearly | 15 | X |
| DC | Physical capital deflator | Yearly | 15 | |
| ER | Exchange rate | 1990 | 15 | X X X |
| variables at the firm level | | | | |
| Sd1 | Net sales deflated by D1 | Yearly | 2445 | X X |
| Sd2 | Net sales deflated by D2 | Yearly | 2445 | X |
| Sd3 | Net sales deflated by D3 | Yearly | 2445 | X |
| Sd4 | Net sales deflated by D4 | Yearly | 2445 | X X |
| L | Number of employees (c) | Yearly | 2445 | X X |
| C | Net property, plant & equipment deflated by DC | Yearly | 2445 | X X |
| C2 | Stock of physical capital deflated by DC | Yearly | 537 | X |
| R | Annual R&D expenditures deflated by D1 | Yearly | 2445 | X X X |
| R/S | R&D intensity (R/Sd1) | Yearly | 2445 | X X |
| K | Stock of R&D (d) | Yearly | 2445 | X X |
| TUS | Total unweighted R&D spillovers (e) | Yearly | 181 | X X |
| TS | Total R&D spillovers (f) | Yearly | 625 | X |
| LS | Local R&D spillovers (f) | Yearly | 625 | X |
| ES | External R&D spillovers (f) | Yearly | 625 | X |
| NS | Domestic R&D spillovers (f) | Yearly | 625 | X |
| IS | Foreign R&D spillovers (f) | Yearly | 625 | X |
| LNS | Local domestic R&D spillovers (f) | Yearly | 625 | X |
| LIS | Local foreign R&D spillovers (f) | Yearly | 625 | X |
| EIS | External foreign R&D spillovers (f) | Yearly | 625 | X |
| ENS | External domestic R&D spillovers (f) | Yearly | 625 | X |
| P | Number of patent applications | Yearly | 181 | X |
| M | Major industrial activity (23 ISIC constructed industry sectors) (g) | Last fiscal year | 2445 | X X X |
| T | Technological clusters (18 clusters) (h) | Last fiscal year | 625 | X |
| G1 | Countries | Last fiscal year | 2445 | X X X |
| G2 | Geographic areas | Last fiscal year | 625 | X |
| MD | Industry dummies | Last fiscal year | 2445 | X X X |
| TD | Technological dummies | Last fiscal year | 625 | X |
| GD | Geographic dummies | Last fiscal year | 2445 | X X X |

notes: a) all variables (except L) in constant 1990 $,
b) average # of firms for which information is available for each year of the period analyzed; for aggregated data, # of countries (and industry sectors),
c) not corrected for the double counting of R&D personnel,
d) various stocks constructed assuming different depreciation rates: δ=0%, 5%, ..., 100%,
e) unweighted R&D spillover variable,
f) R&D spillover variable constructed according to Jaffe’s methodology (1986, 1988),
g) see Appendix A.3.4,
h) see Section 6.3.3 in Chapter 6


3.5. COMPOSITION OF SUB-SAMPLES

This section describes three datasets that have been constructed for the purposes of the analyses performed in the subsequent chapters of the thesis. All these datasets are subsamples of the LITE database.

The first dataset, called the balanced sample (S625), consists of a balanced panel of 625 firms, i.e. 5000 observations, over the period 1987 to 1994. This sample is the reference dataset from which most of the findings discussed throughout the thesis have been obtained. Using a single dataset has the advantage of eliminating sample variability when estimates are compared across model specifications, econometric methods or different constructions of variables. Yet the main reason why this dataset has been retained is that its balanced structure makes it possible to apply more sophisticated econometric techniques in order to control for simultaneity issues among the covariates. However, balancing a dataset has the drawback of throwing away available information, which may lead to efficiency losses or to biases due to attrition or selection problems. In order to circumvent these issues, a second sample has been constructed as well. This so-called large sample (S2445) consists of an unbalanced panel over the longest possible series length (15 years) and the largest number of firms (2445). This dataset covers the period 1980-1994 and has 13421 observations. Finally, Chapter 4 addresses the question of the timing of the effects of R&D activities and technological spillovers on firms’ patenting. In order to explore this question, a long sample (S181) has been derived. This balanced panel dataset contains yearly R&D and patent information on 181 firms over the period 1983 to 1991.
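The difference between the balanced and unbalanced samples can be illustrated with a minimal sketch: a firm enters the balanced panel only if it has complete data for every year of the window. The function `balanced_subpanel` and the three firms are hypothetical and do not reproduce the actual selection code used for the LITE database.

```python
def balanced_subpanel(years_by_firm, first_year, last_year):
    """Return the firms observed in every year from first_year to last_year."""
    window = set(range(first_year, last_year + 1))
    return sorted(
        firm for firm, years in years_by_firm.items() if window <= set(years)
    )


# Hypothetical availability of complete records, by firm.
years_by_firm = {
    "f1": range(1980, 1995),                   # full 1980-1994 history
    "f2": range(1987, 1995),                   # exactly the 1987-1994 window
    "f3": [1987, 1988, 1990, 1991, 1992],      # gaps: dropped when balancing
}
balanced_firms = balanced_subpanel(years_by_firm, 1987, 1994)
observations = len(balanced_firms) * (1994 - 1987 + 1)
```

The same arithmetic gives the 5000 observations of S625: 625 firms times the 8 years from 1987 to 1994.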

3.5.1. Cleaning procedure

Due to the presence of outliers, a cleaning procedure has been applied to each of the three datasets in order to reject firms whose variables displayed very large and often implausible variations. The most likely reason for the presence of such outliers is the process of mergers and acquisitions of firms over time. The cleaning procedure is similar to the one used by Hall and Mairesse (1995). It is based on the following criteria:

• Criterion 1: any observation for which R&D intensity is less than 0.2% or greater than 50% has been removed;

• Criterion 2: any observation for which net sales per worker, capital stock per worker or R&D capital per worker lies more than three times the interdecile range above or below the median has been removed;

• Criterion 3: any observation for which the growth rate of net sales is less than minus 90% or greater than 300%, or for which the growth rate of labor, capital or R&D stocks is less than minus 60% or greater than 240%, has been removed.
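The three criteria can be sketched with the Python standard library as shown below. This is an illustration under assumptions: the helper names are hypothetical, the ratios are taken in levels, and the interdecile range is computed as the 90th minus the 10th percentile with the default method of `statistics.quantiles`, none of which is specified in the text.

```python
import statistics


def passes_intensity(rd, sales, low=0.002, high=0.50):
    """Criterion 1: keep observations with R&D intensity between 0.2% and 50%."""
    return low <= rd / sales <= high


def interdecile_bounds(values, k=3.0):
    """Criterion 2: median plus or minus k times the interdecile range."""
    deciles = statistics.quantiles(values, n=10)  # nine cut points
    spread = deciles[8] - deciles[0]              # 90th minus 10th percentile
    median = statistics.median(values)
    return median - k * spread, median + k * spread


def passes_growth(current, previous, low, high):
    """Criterion 3: keep observations whose growth rate lies in [low, high]."""
    growth = current / previous - 1.0
    return low <= growth <= high


# Hypothetical checks: net sales use (-90%, +300%), other stocks (-60%, +240%).
keep_sales = passes_growth(450.0, 100.0, -0.90, 3.00)    # +350% growth
keep_labor = passes_growth(150.0, 100.0, -0.60, 2.40)    # +50% growth
lower, upper = interdecile_bounds(list(range(1, 12)))
```

In practice each ratio of criterion 2 (net sales, capital stock and R&D capital per worker) would get its own bounds, and an observation would be dropped as soon as one ratio falls outside them.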

Table 3.9 gives the number of firms in each of the three datasets before and after the application of these three cleaning criteria.

Table 3.9

Number (#) of firms removed by application of the cleaning criteria

| dataset | S625 | S2445 | S181 (a) |
|---|---|---|---|
| # of firms before cleaning | 673 | 2676 | 181 |
| # of firms removed by criterion 1 | 6 | 87 | - |
| # of firms removed by criterion 2 | 15 | 42 | - |
| # of firms removed by criterion 3 | 27 | 103 | 0 |
| Total # of firms removed | 48 | 232 | 0 |
| % of initial sample | 7.1% | 8.7% | 0% |
| # of firms after cleaning | 625 | 2445 | 181 |

note: a) for annual R&D expenditures, only criterion 3 has been applied

3.5.2. Balanced sample

The balanced sample (S625) of 625 firms includes information on net sales, number of employees, net plant, property and equipment and annual R&D expenditures for each firm. Additional variables have been constructed: the major industry sector of the firms according to the International Standard Industrial Classification (ISIC) at the two-digit level, the R&D capital of firms constructed following the perpetual inventory method, and the spillover stocks constructed according to Jaffe’s methodology (1986, 1988). Finally, alternative price deflators have been used to deflate all these variables.


The first columns in Table 3.10 give a picture of the geographical and sectorial composition of the balanced sample. With sixty percent of the firms, the United States is largely over-represented in the sample. When looking at the sectorial distribution of firms, we observe that the weight of American firms is particularly important in some sectors: computer & office equipment, instruments and software. European firms account for only sixteen percent of the sample, while Japanese firms account for twenty-one percent. The smaller number of European firms retained is mainly due to missing data for the first years covered by the sample. Although data are available for a larger sample of European firms over a shorter period, the panel has been built so as to jointly optimize the number of firms and the number of periods.

Table 3.10

Sectorial and geographical characteristics of variables - Sample S625 (average over the period 1987-1994)

| | Firms EU | Firms JP | Firms RW (a) | Firms US | Sales (b) | Employment (c) | Physical capital (b) | R&D capital (b) | R&D (b) | Spillovers (b) | R&D intensity (d) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Aircraft | 5 | 2 | 0 | 10 | 5897 | 46 | 1355 | 1879 | 302 | 130589 | 5.1 |
| Chemicals | 15 | 25 | 0 | 37 | 3034 | 16 | 1123 | 776 | 127 | 83415 | 4.2 |
| Computer | 3 | 6 | 0 | 35 | 5598 | 32 | 1498 | 2275 | 426 | 106502 | 7.6 |
| Construction | 2 | 5 | 0 | 2 | 9003 | 65 | 1696 | 3162 | 619 | 78252 | 6.9 |
| Drugs | 13 | 19 | 3 | 19 | 3474 | 19 | 1198 | 1362 | 281 | 114782 | 8.1 |
| Electrical | 1 | 7 | 0 | 15 | 3993 | 26 | 1120 | 608 | 113 | 96669 | 2.8 |
| Electronics | 9 | 15 | 2 | 52 | 3003 | 22 | 798 | 1108 | 218 | 134997 | 7.3 |
| Fabricated metal | 4 | 6 | 1 | 16 | 2046 | 12 | 650 | 172 | 34 | 68370 | 1.7 |
| Food | 5 | 2 | 0 | 12 | 5564 | 38 | 1669 | 394 | 74 | 39326 | 1.3 |
| Instruments | 7 | 5 | 0 | 55 | 1452 | 11 | 430 | 495 | 94 | 109744 | 6.4 |
| Machinery | 9 | 11 | 0 | 38 | 2211 | 14 | 560 | 358 | 60 | 73479 | 2.7 |
| Motor vehicles | 11 | 6 | 0 | 14 | 15528 | 83 | 4156 | 3318 | 673 | 114262 | 4.3 |
| Paper | 1 | 2 | 0 | 16 | 2330 | 13 | 1632 | 153 | 27 | 54551 | 1.1 |
| Petroleum | 4 | 0 | 1 | 11 | 20614 | 40 | 1248 | 1598 | 255 | 106463 | 1.2 |
| Primary metal | 8 | 11 | 5 | 12 | 3383 | 15 | 1843 | 274 | 53 | 84054 | 1.6 |
| Rubber | 2 | 4 | 0 | 6 | 1500 | 12 | 542 | 227 | 39 | 76222 | 2.6 |
| Software | 0 | 0 | 0 | 14 | 519 | 3 | 101 | 232 | 62 | 99824 | 11.9 |
| Stone | 2 | 4 | 0 | 4 | 2200 | 16 | 1048 | 323 | 58 | 85220 | 2.6 |
| Textiles | 0 | 3 | 0 | 4 | 1364 | 5 | 398 | 100 | 22 | 69395 | 1.6 |
| Wood | 0 | 0 | 1 | 6 | 1533 | 13 | 486 | 87 | 17 | 24470 | 1.1 |
| Average | | | | | 4198 | 23 | 1477 | 979 | 187 | 98618 | 4.5 |
| Europe | 101 | | | | 7373 | 53 | 2404 | 1959 | 375 | 104076 | 5.1 |
| Japan | | 133 | | | 3444 | 13 | 1023 | 808 | 157 | 111562 | 4.5 |
| Rest of the world (a) | | | 13 | | 2113 | 12 | 1342 | 444 | 90 | 80712 | 4.3 |
| United-States | | | | 378 | 3687 | 20 | 1394 | 796 | 151 | 93221 | 4.1 |

notes: a) Australia and Canada b) millions US dollars 1990 c) in thousands d) %


It should be noted, however, that European firms retained in the sample are relatively large in terms of R&D activities (average R&D expenditures of 375 million $ against 187 million $ for the whole sample). Consequently, as shown in Table 3.11, the representativeness of European firms is not under-estimated.

The last columns in Table 3.10 show the characteristics of the sample. The R&D intensity of industries ranges from 1.1 percent in the wood and paper industries to 11.9 percent in the software industry. Regarding the R&D intensity of the different geographical areas, European and Japanese firms included in the sample appear to be more R&D intensive than US firms.

Table 3.11 shows the representativeness of the sample relative to the OECD aggregate business enterprise R&D expenditures of the different geographical areas. These percentages have to be interpreted cautiously since the data come from different sources. Nevertheless, they show that although only 625 firms are retained in the analysis, these firms account for around 35 to 55 percent of the R&D outlays in the three main geographical areas.

Finally, some descriptive statistics of the variables in S625 are reported in Table 3.12. On the whole, the correlation coefficients between net sales, labor, physical capital and R&D capital are high, ranging from .82 to .96, which is quite common with this kind of micro-data.

Table 3.11

Representativeness of S625: Proportion of Annual National Business Enterprise R&D (OECD, 1996b) realized by firms of the sample (in %)

| | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 |
|---|---|---|---|---|---|---|
| Europe | 40.1 | 43.0 | 43.4 | 44.7 | 46.2 | 47.6 |
| Japan | 34.8 | 33.6 | 33.7 | 32.2 | 34.2 | 37.5 |
| Rest of the world | 17.5 | 16.9 | 16.6 | 20.7 | 18.3 | 18.3 |
| United-States | 46.4 | 48.8 | 51.7 | 52.6 | 52.8 | 53.4 |


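Table 3.12

Descriptive statistics of S625 (logarithm of variables (a))

| | C | K | L | R | Sd1 | Sd2 | Sd3 | Sd4 | TS | LS | ES | NS | IS | LNS | LIS | ENS | EIS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mean (b) | 12.5 | 12.0 | 8.7 | 10.4 | 13.8 | 13.8 | 13.8 | 13.8 | 18.2 | 16.4 | 17.8 | 16.5 | 17.7 | 13.8 | 15.1 | 16.1 | 17.3 |
| s.d. (c) | 2.0 | 1.9 | 1.6 | 1.9 | 1.8 | 1.8 | 1.8 | 1.8 | 0.8 | 1.4 | 0.8 | 2.1 | 0.8 | 4.5 | 3.6 | 2.3 | 0.9 |
| min (d) | 3.1 | 7.4 | 4.0 | 4.4 | 5.3 | 5.3 | 5.4 | 5.2 | 14.1 | 12.6 | 13.7 | 0.0 | 14.1 | 0.0 | 0.0 | 0.0 | 13.0 |
| max (e) | 18.0 | 17.5 | 13.6 | 15.8 | 18.7 | 18.8 | 18.6 | 18.9 | 19.5 | 18.8 | 19.3 | 18.7 | 19.2 | 18.0 | 18.8 | 18.6 | 19.2 |

correlation matrix

| | C | K | L | R | Sd1 | Sd2 | Sd3 | Sd4 | TS | LS | ES | NS | IS | LNS | LIS | ENS | EIS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| C | 1 | | | | | | | | | | | | | | | | |
| K | .82 | 1 | | | | | | | | | | | | | | | |
| L | .90 | .85 | 1 | | | | | | | | | | | | | | |
| R | .81 | .98 | .83 | 1 | | | | | | | | | | | | | |
| Sd1 | .96 | .86 | .94 | .85 | 1 | | | | | | | | | | | | |
| Sd2 | .96 | .86 | .94 | .85 | 1 | 1 | | | | | | | | | | | |
| Sd3 | .95 | .86 | .94 | .85 | 1 | 1 | 1 | | | | | | | | | | |
| Sd4 | .96 | .86 | .94 | .85 | 1 | 1 | 1 | 1 | | | | | | | | | |
| TS | .24 | .45 | .25 | .44 | .24 | .24 | .24 | .24 | 1 | | | | | | | | |
| LS | .27 | .43 | .25 | .43 | .28 | .28 | .28 | .28 | .65 | 1 | | | | | | | |
| ES | .18 | .34 | .18 | .32 | .17 | .17 | .18 | .18 | .85 | .24 | 1 | | | | | | |
| NS | -.05 | .09 | .01 | .09 | -.01 | .00 | .00 | .00 | .34 | .19 | .34 | 1 | | | | | |
| IS | .30 | .45 | .26 | .44 | .28 | .28 | .29 | .29 | .93 | .62 | .78 | .08 | 1 | | | | |
| LNS | -.07 | .00 | -.07 | .00 | -.05 | -.05 | -.04 | -.04 | .15 | .30 | .06 | .73 | -.06 | 1 | | | |
| LIS | .23 | .15 | .19 | .14 | .21 | .21 | .20 | .21 | .18 | .37 | -.03 | -.13 | .35 | -.06 | 1 | | |
| ENS | -.08 | .06 | -.02 | .05 | -.03 | -.03 | -.03 | -.03 | .31 | .07 | .38 | .92 | .06 | .58 | -.17 | 1 | |
| EIS | .27 | .37 | .22 | .35 | .25 | .25 | .25 | .26 | .78 | .22 | .93 | .09 | .82 | -.14 | .09 | .13 | 1 |

notes: a) see Table 3.8 for a definition b) mean value c) standard deviation d) minimum value e) maximum value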

3.5.3. Large sample

The second dataset (large sample, S2445) makes the most of the LITE database. Indeed, this sample consists of an unbalanced panel of 2445 firms covering the largest available cross section of firms and the longest possible series length, i.e. 1980-1994 or 15 years. This sample contains the same variables as the previous one except the spillover variables. Table 3.13 gives a view of the sectorial and geographical composition of the unbalanced sample.

Table 3.13

Sectorial and geographical characteristics of variables - Sample S2445 (average over the period 1980-1994)

| | Firms EU | Firms JP | Firms RW (a) | Firms US | Average # of years (b) | Sales (c) | Employment (d) | Physical capital (c) | R&D capital (c) | R&D (c) | R&D intensity (e) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Aircraft | 16 | 3 | 3 | 25 | 6.2 | 2965 | 24 | 673 | 858 | 163 | 5.5 |
| Chemicals | 59 | 76 | 11 | 66 | 6 | 1896 | 11 | 696 | 369 | 69 | 3.6 |
| Computer | 19 | 18 | 3 | 105 | 6.1 | 2089 | 14 | 544 | 830 | 163 | 7.8 |
| Construction | 12 | 47 | 4 | 3 | 4.2 | 1512 | 8 | 245 | 141 | 28 | 1.8 |
| Drugs | 41 | 38 | 7 | 49 | 6.6 | 2220 | 15 | 701 | 817 | 170 | 7.7 |
| Electrical | 37 | 27 | 4 | 52 | 5 | 2148 | 17 | 495 | 516 | 98 | 4.6 |
| Electronics | 52 | 54 | 8 | 170 | 5.7 | 1560 | 13 | 369 | 523 | 101 | 6.5 |
| Fabricated metal | 33 | 25 | 6 | 33 | 5.4 | 882 | 5 | 259 | 71 | 14 | 1.6 |
| Food | 43 | 20 | 7 | 24 | 4.9 | 3365 | 22 | 1025 | 195 | 37 | 1.1 |
| Instruments | 49 | 31 | 3 | 130 | 5.8 | 719 | 6 | 190 | 270 | 51 | 7.0 |
| Machinery | 78 | 69 | 6 | 99 | 5.2 | 1148 | 9 | 279 | 178 | 32 | 2.8 |
| Miscellaneous | 34 | 12 | 2 | 57 | 4.1 | 1840 | 18 | 550 | 234 | 40 | 2.2 |
| Motor vehicles | 36 | 20 | 0 | 32 | 6 | 6791 | 45 | 1872 | 1674 | 316 | 4.7 |
| Other manufact. | 4 | 6 | 1 | 18 | 5.7 | 525 | 3 | 62 | 56 | 15 | 2.8 |
| Paper | 22 | 11 | 3 | 28 | 5.3 | 1311 | 9 | 755 | 84 | 16 | 1.2 |
| Petroleum | 18 | 6 | 5 | 21 | 4.5 | 8617 | 28 | 4973 | 699 | 119 | 1.4 |
| Primary metal | 25 | 33 | 16 | 28 | 5.7 | 2261 | 13 | 1044 | 207 | 48 | 2.1 |
| Rubber | 22 | 11 | 3 | 20 | 5.5 | 852 | 8 | 317 | 105 | 19 | 2.2 |
| Software | 22 | 8 | 5 | 93 | 5.2 | 252 | 2 | 53 | 99 | 24 | 9.5 |
| Stone | 24 | 18 | 2 | 17 | 5.5 | 1359 | 12 | 593 | 176 | 33 | 2.4 |
| Textiles | 13 | 16 | 0 | 15 | 5 | 828 | 7 | 255 | 134 | 25 | 3.0 |
| Transport | 8 | 6 | 2 | 11 | 4.7 | 1827 | 18 | 331 | 278 | 58 | 3.2 |
| Wood | 7 | 2 | 2 | 15 | 5.4 | 667 | 5 | 300 | 38 | 8 | 1.2 |
| Average | | | | | 5.5 | 1951 | 14 | 616 | 430 | 84 | 4.3 |
| Europe | 674 | | | | 5 | 3231 | 24 | 1054 | 743 | 147 | 4.6 |
| Japan | | 557 | | | 5.2 | 2254 | 9 | 575 | 435 | 94 | 4.2 |
| Rest of the world | | | 103 | | 4.5 | 976 | 8 | 520 | 116 | 23 | 2.4 |
| USA | | | | 1111 | 6.1 | 1616 | 12 | 503 | 357 | 69 | 4.3 |

notes: a) Australia, Brazil, Canada, Chile, Hong Kong, Malaysia, New Zealand, Singapore, South Africa, South Korea, Taiwan
b) average number of years for which information is available c) millions US dollars 1990 d) in thousands e) %

The R&D intensity of industries varies from about 1 percent in the food, paper and wood industries to 9.5 percent in the software industry. In terms of R&D intensity, European firms (4.6 percent) are slightly more R&D-intensive than US firms (4.3 percent) and Japanese firms (4.2 percent), while firms from the rest of the world are clearly less so (2.4 percent). The representativeness of the sample as compared to the business enterprise R&D expenditures in the different geographical areas is more than 65% on average. As always, this percentage has to be interpreted with caution since the data come from different sources. Furthermore, it is worth recalling that many firms have R&D centers in different geographical areas.

Table 3.14

Representativeness of S2445:
