

From the document Forum putation (pages 43-46)

Some Elementary Machine Problems in the Sampling Work of the Census

A. ROSS ECKLER Bureau of the Census

PERHAPS I have a definite advantage over most of you in being able to recognize the value of this kind of meeting. I do not come here as a mathematician nor as an expert in machine accounting; so perhaps I am peculiarly able to see the advantages of bringing together these two types of people. In my opinion the International Business Machines Corporation is to be commended for its vision in making possible meetings of this kind. The advantages for both groups are very great, and I have been much impressed with the gains from this sort of meeting even though much of the material is highly technical.

Most of you are familiar with the long-run interest of the Census in large scale accounting equipment. We are very proud of the fact that in the early years men like Hollerith and Powers were employees of the Bureau of the Census, and we have for many years used equipment specially developed for our needs. We have used that as well as very large quantities of the different types of IBM equipment.

I shall speak primarily of our work in the field of sampling, which involves certain applications of equipment somewhat different from what we get in our complete tabulations, and which illustrates some areas in which the present equipment fails to meet the requirements that we would like to see met.

It is unnecessary to inform this group about the advantages of sampling. Most of you are familiar with the theoretical work to a far greater degree than I am. You doubtless know that through the application of sampling we have been able to save very large sums of money in our tabulating work. Moreover, we have been able to speed up results so that we have been able to carry out many types of detailed tabulations which would be far too expensive to carry out on the basis of complete coverage.

There are several directions in which we apply sampling.

One is the use of sampling to serve as a supplement to a complete census, asking certain questions on a sample basis only. In this way, we have been able to increase very greatly the number of subjects covered.


The second way in which we use sampling is to carry out independent field surveys based upon a sample of the population from which we can estimate the total population of the country and the population in various economic and social groups.

The third way in which we use sampling is in connection with measuring or controlling the quality of statistical operations. I will refer to each of these uses very briefly in some of the applications I will mention.

First of all, I should like to refer to an application of the machines which is a very happy one. This use is in connection with drawing samples of blocks for certain types of surveys. We want to determine certain blocks in which we are going to collect information. We have put in punched cards certain facts relating to each block in all of our cities.

That information, among other subjects, includes the number of dwelling units, the number of stores, and the number of various types of institutions. As we take our population sample, we want to select certain blocks in which we will do our sampling.

We have determined that under many circumstances an efficient procedure of drawing the sample of blocks is to draw it on the basis of probability proportionate to size, i.e., the number of dwelling units, or number of stores.

We have been able to develop a procedure for selecting blocks by the use of the Type 405 whereby we run through the cards, accumulate, and select every nth dwelling unit.

The machine can be wired so that if in a very large block there are two or more units which are to be included in the sample, this fact will be indicated by the machine. If any of you are interested in that, we would be glad to have you write to inquire about the method.
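The selection procedure just described can be sketched in modern terms. This is only an illustration of systematic selection with probability proportionate to size, not the actual Type 405 wiring; the function name `select_blocks_pps`, the `start` parameter, and the sample data are assumptions introduced for the example.

```python
import random


def select_blocks_pps(blocks, interval, start=None):
    """Select blocks with probability proportionate to size.

    blocks   -- list of (block_id, dwelling_units) pairs, in card order
    interval -- n, so that every nth dwelling unit on the running total is a hit
    start    -- position of the first hit (1..interval); random if omitted
                (hypothetical parameter, not described in the talk)
    Returns a list of (block_id, hits); hits > 1 marks a large block that
    contains two or more sampled dwelling units.
    """
    if start is None:
        start = random.randint(1, interval)
    selected = []
    running_total = 0      # dwelling units accumulated so far
    next_hit = start       # cumulative position of the next sampled unit
    for block_id, units in blocks:
        running_total += units
        hits = 0
        while next_hit <= running_total:
            hits += 1
            next_hit += interval
        if hits:
            selected.append((block_id, hits))
    return selected


# Made-up block sizes, for illustration only.
blocks = [("A", 12), ("B", 40), ("C", 7), ("D", 130), ("E", 25)]
print(select_blocks_pps(blocks, interval=50, start=20))
# → [('B', 1), ('D', 3)]
```

Because the hits fall at fixed intervals along the accumulated count of dwelling units, a block's chance of selection is proportional to its size, and a block larger than the interval is reported with the number of hits it contains.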

Another area in which we have made use of sampling is in connection with the processing of data. We are particularly interested in the development of better equipment to handle sample materials because it will give us a possibility of increasing the use of sampling, thereby taking greater advantage of the benefits it offers. We are anxious to extend the use of this tool as far as possible, and in certain areas the mechanical equipment is a limiting factor. We could go further with it if we had equipment which fitted the needs more precisely than the present equipment does.

Just as we depend upon equipment to expand the use of sampling, we also use sampling to improve the use of equipment. We are carrying out a great many of our processing operations on the basis of sample verification.

This takes place in a great many fields; one example is our foreign trade statistics, which involve tabulations of information on imports and exports by country of origin, country of destination, etc. We have developed a system of sample verification, which usually provides for a sample of one card in fifty. We continue with that sample as long as the operator is making fewer than sixteen errors per 400 cards sample verified. When she exceeds that rate of error, we shift over to 100 per cent verification for a short time. Then, when the evidence is available to show that the person has come back down to a lower error rate, we shift back to a five per cent sample and after a period of that, [...] considerable saving in the verification operation, and still provides control of the work so that we can be sure that [...] statistics for retail and service firms, and generally similar work in government statistics, where we collect employment data for state and local government units.
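The switching rule for sample verification can be written as a small state machine. This is a sketch under stated assumptions rather than the Bureau's procedure: the stage names, the function `next_stage`, the treatment of exactly sixteen errors as exceeding the limit, and the step from the five per cent sample back to one card in fifty are all assumptions, since the surviving text does not state them.

```python
from fractions import Fraction

# Verification rates named in the talk.
RATES = {
    "routine": Fraction(1, 50),       # one card in fifty
    "intermediate": Fraction(1, 20),  # five per cent sample
    "full": Fraction(1, 1),           # 100 per cent verification
}

ERROR_LIMIT = 16   # errors ...
LOT_SIZE = 400     # ... per 400 cards sample verified


def next_stage(stage, errors, cards_verified):
    """Return the verification stage to use for the next batch of work.

    Assumption: an operator is within the limit when the observed error
    rate is below 16 per 400 verified cards; 16 or more counts as exceeding.
    """
    within_limit = errors * LOT_SIZE < ERROR_LIMIT * cards_verified
    if not within_limit:
        return "full"
    if stage == "full":
        # Evidence of a lower error rate: drop to the five per cent sample.
        return "intermediate"
    # Assumption: after a period at five per cent with acceptable work,
    # the operator returns to the routine one-in-fifty sample.
    return "routine"


stage = "routine"
for errors in (10, 16, 4, 3):   # made-up error counts per 400 verified cards
    stage = next_stage(stage, errors, 400)
    print(stage, RATES[stage])
```

Keeping the rule as a pure function of the current stage and the latest error count makes it easy to test each transition on its own.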

In the first field I mentioned, our current population surveys, we interview a sample of about 25,000 households once a month. We get information from them on the number of people who are employed, the number unemployed, the hours worked, the occupation, industry, and so forth. The households are selected by the use of area sampling, a method probably familiar to most of the people in this room. It is based upon units which are selected from sixty-eight different sample areas scattered around the country, scientifically determined so as to give a good cross section of the country as a whole. We insist that all of our samples have measurable accuracy; in other words, that the design be such that we can determine the degree of error in the results.

In this current survey of 25,000 households we estimate that the figure on the total labor force will be within one per cent of that which would be obtained from a complete census nineteen times out of twenty. We achieve that very high degree of accuracy partly by virtue of the fact we have control totals for various groups to which we can adjust the sample results. Obviously, in a sample survey of this sort giving monthly information, speed is of greatest importance. These data are highly perishable and it is important we make them available as rapidly as possible because they are widely used. The information we get for these 25,000 households is punched in about 65,000 cards for individuals and those cards are weighted according to the sampling ratios that were used. As each card gets a [...] each type of card. Even after considerable experience, we found ourselves unable to do the whole job in less than [...]

This procedure has certain disadvantages, however; in order to use this machine we are forced to use a less precise weighting system than we could use on the 405. We accomplish the weighting by classifying our cards in 144 different groups and then rejecting some cards by random methods and duplicating some others by random methods so as to get our results weighted according to the [...]

One other area of work I will mention briefly in closing. In connection with our sample work, we attempt to establish a very careful measure of the degree of accuracy of the results so that we have to compute large numbers of variances and, as you well know, that involves calculating very large numbers of sums of squares and sums of products. In fact, for the measurement of accuracy of just one item, it is necessary to get the squares of more than three thousand numbers, to weight them by certain factors, and then to combine them. It is possible to get these sums of squares and sums of products through a rather complicated series of operations, but the time required for that is considerable. It is not a very efficient procedure. Some consideration has been given to the extent to which some of the new high-speed multipliers will meet the problem, but study of the situation so far indicates that we are still considerably handicapped in the direction of getting these measures of the accuracy of the results of sample data.
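The arithmetic burden described above, weighted sums of squares and sums of products, can be illustrated briefly. This sketch shows only the accumulation step; the function name `weighted_sums` and the sample figures are assumptions, and it does not reproduce the Bureau's variance formulas, which the talk does not give.

```python
def weighted_sums(xs, ys, weights):
    """Accumulate weighted sums of squares and of cross products.

    xs, ys  -- paired observations (for example, two items tallied per unit)
    weights -- one weighting factor per pair (hypothetical values here)
    Returns (sum of w*x*x, sum of w*x*y, sum of w*y*y).
    """
    if not (len(xs) == len(ys) == len(weights)):
        raise ValueError("xs, ys and weights must have the same length")
    sum_xx = sum(w * x * x for x, w in zip(xs, weights))
    sum_xy = sum(w * x * y for x, y, w in zip(xs, ys, weights))
    sum_yy = sum(w * y * y for y, w in zip(ys, weights))
    return sum_xx, sum_xy, sum_yy


# Three made-up pairs; the talk mentions more than three thousand per item.
print(weighted_sums([1, 2, 3], [2, 0, 1], [1, 2, 1]))
# → (18, 5, 5)
```

Each term needs a multiplication before it can be accumulated, which is why the speed of the multiplying equipment governs how quickly such accuracy measures can be produced.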

There is a need for further development which will increase the use of sampling by making it possible to measure the accuracy of the results more speedily.

I have brought several types of exhibits in which some of you may be interested, a number of pamphlets and bulletins which show cases in which we have made direct use of machine tabulation sheets for publication: some of our housing reports, some of our foreign trade reports, and a job we did for the Air Forces, on all of which we printed the 405 sheets directly. If there are questions you want to ask about them, feel free to write in. There are also several copies of the booklet on work of the Census Bureau, Fact-Finder for the Nation. If any of you want a copy of that, I would be very glad to furnish it.

DISCUSSION

Mr. Tillitt: At one time the Bureau of the Census turned out a little sheet called "Tab Tips." Is that distributed any more?

Mr. Eckler: I think I have the first forty. If anything of that sort is being distributed now, it has not come to my attention.

