Data processing: Working Group on Organization, Content and Methodology of Household Surveys

LIMITED

E/CN.14/SM/26

September 1979

Original: ENGLISH

UNITED NATIONS
ECONOMIC AND SOCIAL COUNCIL

ECONOMIC COMMISSION FOR AFRICA

Working Group on Organization, Content and Methodology of Household Surveys
Addis Ababa, 15-19 October 1979

DATA PROCESSING

CONTENTS

                                   Paragraphs

INTRODUCTION                       1 - 3

MANUAL PROCESSING                  4 - 9

DATA PREPARATION                   10 - 12

DATA ENTRY                         13

MACHINE EDITING                    14

MACHINE TABULATIONS                15 - 20

QUALITY CONTROL                    21 - 23

STATISTICAL ANALYSIS OF DATA       24 - 25

CONCLUSION                         26

Introduction

1. The processing of data collected in household surveys constitutes an important feature of the survey plan. Not only must the survey calendar make adequate provision for all the elements of the data processing stage but the survey plan must take into account the requirements of data processing. For example, the questionnaire format should take account of the mode of transforming the data collected in the field into machine readable form. If it is intended to key the data directly from the questionnaires, then the design of the questionnaire should be such as to make it possible for the operators to key the information directly from it. Otherwise, an intermediate step of coding the information onto coding sheets may be necessary. It should however be stated that this intermediate stage introduces another component of error, transcription errors, into the survey process and should be avoided if possible.

2. In this paper, the various stages of the usual data processing operation will be considered, namely manual processing (i.e., manual editing, coding and manual tabulations), data preparation, data entry, machine editing, and machine tabulation. In addition, such aspects as quality control of data processing and operational support to analysis will be discussed.

3. In preparing the survey budget, all these elements of cost should be taken into account.

Manual Processing

4. It has now become a recognised fact that some clerical editing of data should start before data preparation. Three methods of achieving this end have been used in African surveys: editing in the field followed by immediate reference to households where necessary, editing by specially recruited clerks in the office with or without any reference to the field, and a combination of the first two approaches. There are many ways of editing questionnaires in the field. The interviewer at the end of each working day can screen questionnaires completed by him for missing, inconsistent and inadmissible entries. His supervisor can also check a random sample of his questionnaires for these errors. Finally, a specially trained editor attached to the team would check all questionnaires for errors. All reparable errors, i.e., errors which can be corrected by the interviewer without returning to the households, will then be corrected by him. But all irreparable errors will have to be corrected by revisiting the households. The main advantage of field editing is that it is done while the interviewer is still in the field and still remembers parts of the interviews. Actual reinterviews are also inexpensive because the distances from the households to be re-interviewed are short. The second approach requires that all questionnaires be sent to a central place or, in a large country, to a number of centres to be edited. There is usually a considerable time lag between interview and editing and this rules out effectively any reference to the field for corrections. The main advantage, however, is that by concentrating editors in one place, it is possible for them to work more effectively and to ensure more uniform standards of editing than would otherwise be the case if a large number of field editors was being used. The third approach of using both field and office editors combines the advantages of both methods but does not appear from experience to be cost effective, if machine editing is contemplated. In certain countries, the office editing is combined with coding in order to reduce costs.

5. With respect to coding, there seems to be now a bias towards pre-coded boxes for closed and semi-closed questions. This cuts down the time spent on coding and hence the cost of data processing. In the case where pre-coding is used extensively, one coder can be expected to code one questionnaire completely. However, when this is not the case, there are two distinct schools of thought on what should be done. One viewpoint is that coders should specialise on different sections of the questionnaires. The other school of thought maintains that if the same coder codes each questionnaire throughout he is able to discover obvious inconsistencies and correct them.

6. The next question to be considered in relation to the AHSCP is whether some or all of the interviewers should be used as coders or whether a special team of coders should be recruited. It has been argued that interviewers by their training and experience can become good coders and that, in the context of the AHSCP which envisages a permanent field staff, one of the tasks of the field staff should be coding. This would ensure that, especially in those years where single visit surveys are conducted, the field staff would be effectively occupied throughout the year. On the other hand, it has been suggested that by using interviewers as coders, especially in inquiries like income, consumption and expenditure surveys which usually cover 12 months, coding will be delayed. The use of a special team of coders also ensures that independent checks of the work of interviewers can be carried out and that there are no residual effects of interviewers' bias and training defects carried over to the coding stage, especially when coders are also expected to do editing. In the context of the AHSCP, cost considerations would appear to rule out the use of a special team of coders. For most surveys, coding can follow quickly after interviewing. For the several-visit surveys, e.g., household income, consumption and expenditure surveys (HICES) and labour force surveys modelled along the lines of the U.S. Current Population Survey (CPS), some solutions would have to be found. One suggestion relating to the HICES is a month's break after every two months' interviewing. However, there are operational and technical objections to this approach. A more practical approach is to use a completely pre-coded questionnaire which can then be transmitted for coding after the normal checks by the interviewer, supervisor and field editor.

7. If coding is however used as a distinct stage in the data processing phase, it is important to establish production standards, i.e., the number of questionnaires (or persons or items) to be coded per day and the number of permissible errors. The question of permissible errors is discussed again under the section on Quality Control.

8. Before coding begins, it is necessary to prepare all coding lists and instructions. It is especially important to indicate to coders how to treat "not stated" and "not applicable". Failure to do this in a number of African surveys has resulted in the two categories being grouped together and has presented problems to the data analysts.
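The distinction between "not stated" and "not applicable" can be illustrated with a minimal sketch. The reserved code values 8 and 9, the eligibility age, the coding list and the sample persons are all hypothetical and chosen only for illustration; an actual survey would take them from its own coding lists and instructions.

```python
from collections import Counter

# Hypothetical reserved codes; a real survey defines these in its coding lists.
NOT_APPLICABLE = 8   # question correctly skipped (e.g. occupation of a young child)
NOT_STATED = 9       # question applied but no answer was recorded

MIN_AGE = 10         # assumed minimum age for the occupation question


def code_occupation(age, answer, code_list):
    """Return the occupation code for one person.

    Persons below MIN_AGE get NOT_APPLICABLE; eligible persons
    with a blank answer get NOT_STATED.
    """
    if age < MIN_AGE:
        return NOT_APPLICABLE
    if answer is None or not answer.strip():
        return NOT_STATED
    return code_list[answer.strip().lower()]


code_list = {"farmer": 1, "trader": 2, "clerk": 3}   # illustrative coding list
persons = [(34, "Farmer"), (6, None), (41, ""), (29, "trader"), (3, None)]

codes = [code_occupation(age, answer, code_list) for age, answer in persons]
tally = Counter(codes)   # the two categories stay separate in the tally
```

Keeping two separate reserved codes lets the analyst compute non-response rates on eligible persons only, which is not possible once the two categories have been merged.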

9. Because of the time lag between interviewing and completion of machine processing, it has now become almost part of conventional wisdom to prepare manual summaries of the more important survey results and to use these preliminary results, pending the release of the final tabulation figures. In at least three countries where this suggestion was made but not followed, it has become clear that blind faith in the computer yielding fast results has proved to be a mirage. Even in large-scale operations like population censuses, these manual totals have proved useful as interim results. It is therefore recommended that survey organizers should make provision for manual summaries of the more important results, especially in surveys where the design is self-weighting. For the field checking of HICES records, the use of sections in the questionnaires and totals and balances helps not only to ensure accurate returns but also to provide the required preliminary summaries.
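The reason manual summaries are practical in a self-weighting design can be shown with a short sketch: every sample unit carries the same weight, so a preliminary estimate is the sample total multiplied by a single raising factor. The sampling fraction of 1/200 and the batch totals below are hypothetical.

```python
from fractions import Fraction


def preliminary_total(batch_totals, sampling_fraction):
    """Estimate a population total from manually tallied batch totals.

    In a self-weighting design every unit has the same weight
    (1 / sampling_fraction), so the estimate is the sample total
    multiplied by that single raising factor.
    """
    raising_factor = 1 / Fraction(sampling_fraction)
    return sum(batch_totals) * raising_factor


# Hypothetical manual tallies of household members from four enumeration areas.
batch_totals = [412, 388, 455, 397]
estimate = preliminary_total(batch_totals, Fraction(1, 200))
```

With unequal weights, each batch would need its own factor before summing, which is where manual work becomes error-prone.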

Data Preparation

10. After coding, the next stage is the conversion of the data into a machine readable form. There are three main options in Africa for doing this: punch cards, key to tape or key to diskette. Other techniques such as direct data entry, mark sensing and optical readers have been used but are not popular.

11. For many countries, the choice of data preparation equipment to be used for the survey is already predetermined by the existing installations. For countries establishing new installations or about to change existing ones, expert advice is necessary. Although the keypunch is becoming almost obsolete, it is advisable to compare any suggested equipment with the performance of the keypunch with respect to both time and cost. If desired, the services of the ECA Regional Adviser in Data Processing or one of the UN Technical Advisers in Computer Methods can be made available to assist countries to make appropriate choices.

12. Whatever choice is made, the cost implications of the data preparation element of the total survey budget should be made clear from the beginning. In one country, for the HICES it was decided to key the data for each week, instead of the monthly summary. This has been estimated to require about 104 million key depressions or approximately 2 million punch cards. At the observed rate of key depressions per hour by operator in that country, it is estimated that this would take 20 key operators about ?60 days to punch and verify the data. Since the weekly information is not required for any analysis, it would seem more economical to use monthly summaries which can be obtained easily by using calculators and keying the data. It is reckoned that such faith in computer processing has not only delayed the results of surveys in a number of countries but in some the results were not even published.
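The kind of workload arithmetic behind such an estimate can be sketched as follows. Only the 104 million key depressions and the 20 operators come from the text; the keying rate, the productive hours per day, the assumption that verification means re-keying every record, and the assumption that a monthly summary has the same layout as a weekly record are all hypothetical, so the resulting day counts are illustrative and do not reproduce the estimate quoted above.

```python
def keying_days(key_depressions, operators, rate_per_hour, hours_per_day, verify=True):
    """Working days the team needs to key (and optionally verify) a data set.

    Verification is assumed to mean re-keying every record, which
    doubles the keystroke volume.
    """
    volume = key_depressions * (2 if verify else 1)
    daily_capacity = operators * rate_per_hour * hours_per_day
    return volume / daily_capacity


WEEKLY_KEY_DEPRESSIONS = 104_000_000   # figure quoted in the text
OPERATORS = 20                         # figure quoted in the text
RATE_PER_HOUR = 6_000                  # hypothetical keying rate per operator
HOURS_PER_DAY = 6                      # hypothetical productive hours per day

# Assumption: 12 monthly records replace 52 weekly records per household.
monthly_key_depressions = WEEKLY_KEY_DEPRESSIONS * 12 / 52

days_weekly = keying_days(WEEKLY_KEY_DEPRESSIONS, OPERATORS, RATE_PER_HOUR, HOURS_PER_DAY)
days_monthly = keying_days(monthly_key_depressions, OPERATORS, RATE_PER_HOUR, HOURS_PER_DAY)
print(f"weekly records: {days_weekly:.0f} days, monthly summaries: {days_monthly:.0f} days")
```

Whatever rate is observed locally, the ratio between the two options stays the same under these assumptions, so the comparison can be made before the rate is known precisely.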

Data Entry

13. When direct data entry or key to tape or similar facilities do not exist in the country, the next stage of the data processing phase is the entering of the keyed data onto the computer. This can be done by a card-to-tape operation, in the case of a punch card installation, or in the case of key to diskette equipment by the collation of diskettes onto tapes.

Machine Editing

14. Some years ago, this aspect of data processing was the most difficult. Most programmes for editing had to be custom built and in certain cases defects in edit specifications by subject-matter specialists resulted in generally unsatisfactory results. In recent years, there have been attempts to develop soft-ware packages mainly for censuses and demographic surveys but these can also be applied to the subjects covered under the AHSCP. CONCOR, developed by CELADE for use in demographic surveys, was adapted for use in the World Fertility Survey (WFS) and has been adapted for general editing in censuses by the U.S. Bureau of the Census. Some of the countries which used CONCOR in the WFS had problems, suggesting that the package is not yet easy to use. In recent times, the UN Statistical Office has developed UNEDIT for use in censuses. Because of the nature of the subjects covered in the AHSCP, the package can be adapted for use in these surveys. However, UNEDIT is currently written only in RPG II, which limits its usefulness. For many countries participating in the AHSCP, recourse would have to be made to custom built programmes, which would seem to indicate the necessity for the availability of a high level of programming expertise in the statistical office.
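What a custom built edit programme does can be sketched in a few lines: the subject-matter specialists' edit specifications are held as a list of rules, and each record is tested against every rule. This is a minimal illustration only and does not represent how CONCOR or UNEDIT work; the field names, code values and limits are hypothetical.

```python
# Each edit rule is (name, predicate); the predicate returns True when
# the record FAILS the check. Field names and limits are hypothetical.
EDIT_RULES = [
    ("age missing",
     lambda r: r.get("age") is None),
    ("age out of range",
     lambda r: r.get("age") is not None and not 0 <= r["age"] <= 98),
    ("sex code inadmissible",
     lambda r: r.get("sex") not in (1, 2)),
    ("married under 12",
     lambda r: r.get("age") is not None and r["age"] < 12
     and r.get("marital_status") == 2),
]


def edit_record(record):
    """Return the names of all edit rules that the record fails."""
    return [name for name, fails in EDIT_RULES if fails(record)]


def edit_file(records):
    """Map record identifier to its list of failed edits (failing records only)."""
    report = {}
    for rec in records:
        errors = edit_record(rec)
        if errors:
            report[rec["id"]] = errors
    return report


records = [
    {"id": "001-01", "age": 35, "sex": 1, "marital_status": 2},
    {"id": "001-02", "age": 8, "sex": 2, "marital_status": 2},
    {"id": "002-01", "age": None, "sex": 3, "marital_status": 1},
    {"id": "002-02", "age": 120, "sex": 2, "marital_status": 1},
]
report = edit_file(records)
```

Keeping the rules in a table separate from the checking logic means that a defect in the edit specifications can be corrected without rewriting the programme itself.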

Machine Tabulations

15. In the field of machine tabulations, the position is more satisfactory. There are at least five soft-ware packages which are being or have been used for censuses and surveys in the region recently. These are: CENTS, COCENTS, TPL, TAB 68 and XTALLY.

16. In this paper, no attempt will be made to spell out the relative merits of each package. A project undertaken by the International Association of Survey Computers (IASC) seeks to provide the relevant information. In terms of popularity, however, COCENTS appears to have been used by more African countries than any other soft-ware package.

17. In evaluating soft-ware packages for use in a particular country, the following considerations should be taken into account: compatibility of the soft-ware with existing computer hard-ware, operation time and other capabilities of the package.

18. Even where soft-ware packages have been available, experience has shown that at least three programmers need to be assigned to the survey programme. The number is determined not only by the great variety of tasks required but also by the high turnover of high level data processing staff in this region. In at least one country work on a survey [...] projects the appointment of a number of national counterpart staff is essential.

19. Most of the countries participating in the first phase of the

20. The time and cost of the tabulation stage of the data processing phase have to be carefully worked out during the preparatory stages of the survey. If advance tabulations are required, this should be planned from the beginning.

Quality Control

21. As is well known, the primary objective of the AHSCP is to assist countries to collect, process, analyse, publish and disseminate integrated data on the demographic, social and economic characteristics of households and household members. At each stage of this undertaking, error is likely to creep in and the purpose of a good quality control (Q.C.) system is to minimise the error in the final product, the published tables. The Q.C. system should therefore allocate its given resources among the different phases of the undertaking in order to achieve the desired objective. This concept of a Q.C. system is largely lost in African surveys where the general view seems to be that 100 per cent verification at every stage ensures 100 per cent accuracy in the final product. However, the results in published tables show how false this assumption is.

22. For the data processing phase of the survey operation, the required resources have to be provided for the implementation of a Q.C. system. Also, acceptable standards of performance by interviewers, coders and key operators as to quantity and quality of output have to be defined, as well as actions to be taken when these standards are not met. The standards set should be realistic.
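One alternative to 100 per cent verification is sample verification of each coder's batch against a permissible number of errors. The sketch below is only an illustration of that idea: the sample size, the permissible error count, the coding list and the batches are hypothetical, and the action on rejection (for example full re-verification of the batch) is left to the survey organizers.

```python
import random


def verify_batch(batch, recode, sample_size, max_errors, seed=None):
    """Sample-verify one coder's batch.

    batch       : list of (answer, code assigned by the coder)
    recode      : function giving the verifier's independent code
    sample_size : number of items to re-code
    max_errors  : permissible number of discrepancies in the sample
    Returns (accepted, errors_found).
    """
    rng = random.Random(seed)
    sample = rng.sample(batch, min(sample_size, len(batch)))
    errors = sum(1 for answer, code in sample if recode(answer) != code)
    return errors <= max_errors, errors


CODE_LIST = {"farmer": 1, "trader": 2, "clerk": 3}   # illustrative coding list


def verifier_code(answer):
    """Independent code assigned by the verifier."""
    return CODE_LIST[answer]


good_batch = [("farmer", 1), ("trader", 2), ("clerk", 3)] * 20
bad_batch = [(answer, 0) for answer, _ in good_batch]   # every code wrong

good_result = verify_batch(good_batch, verifier_code, sample_size=10, max_errors=1, seed=1)
bad_result = verify_batch(bad_batch, verifier_code, sample_size=10, max_errors=1, seed=1)
```

The verification effort saved on accepted batches can then be reallocated to the phases of the survey where error is known to be larger.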

23. In this paper therefore a plea is being made for African countries to use modern Q.C. systems in their data processing as well as in the rest of the survey operation. Data which have not been subjected to proper controls and checks are usually not worth using.

Statistical Analysis of Data

24. In document E/CN.14/SM/27, the subject of data evaluation and analysis is dealt with. In this paper consideration is being given only to the data processing aspects of the subject. The computer has proved an invaluable aid to data analysts. Regression, multivariate and similar analysis which used to take considerable time to complete can now be accomplished in a short period with the aid of the electronic computer. In addition there are soft-ware packages such as SPSS (Statistical Package for the Social Sciences) and other university-based packages which make these calculations routine even for the analysts unfamiliar with any of the sophisticated computer languages.

25. The main point which has to be borne in mind here is that the advent of the electronic computer and the easy availability of suitable soft-ware packages has enlarged the scope of possible analytical work that can be done in connexion with the AHSCP.

Conclusion

26. In this paper consideration has been given to the relevant aspects of data processing as far as they affect (or are likely to affect) the AHSCP. It has not been possible to deal with each component element of the topic in depth but the main message being given is that data processing, both manual and machine, is an important phase of the survey plan and satisfactory financial and personnel provisions should be made for it if the whole survey is not to get bogged down at that phase.
