Data Sets - OS PL/I

In the IBM System/360 Operating System, a data set is any collection of data that can be created by a program and accessed by the same or another program. A data set may be a deck of punched cards; it may be a series of items recorded on magnetic tape or paper tape; or it may be recorded on a direct-access device such as a magnetic disk or drum. A printed listing produced by a program is a data set, but it cannot be accessed by a program.

A data set resides on one or more volumes. A volume is a standard physical unit of auxiliary storage (for example, a reel of magnetic tape or a disk pack) that can be written on or read by an input/output device; a serial number identifies each volume (other than a punched-card or paper-tape volume or a magnetic-tape volume either without labels or with nonstandard labels).

A magnetic-tape or direct-access volume can contain more than one data set; conversely, a single data set can span two or more magnetic-tape or direct-access volumes.

DATA SET NAMES

A data set on a direct-access device must have a name so that the operating system can refer to it. If you do not supply a name, the operating system will supply a temporary one. A data set on a magnetic-tape device must have a name if the magnetic tape has standard labels (see "Labels," later in this chapter). A name consists of up to eight characters, the first of which must be alphabetic. Data sets on punched cards, paper tape, unlabeled magnetic tape, or nonstandard labeled magnetic tape do not have names.

Chapter 6: Data Sets and Files

You can place the name of a data set, with information identifying the volume on which it resides, in a catalog that exists on the volume containing the operating system. Such a data set is termed a cataloged data set. To catalog a data set, use the CATLG subparameter of the DISP parameter of the DD statement. To retrieve a cataloged data set, you do not need to give the volume serial number or identify the type of device; you need only specify the name of the data set and its disposition. The operating system searches the catalog for information associated with the name and uses this information to request the operator to mount the volume containing your data set.

If you have a set of related data sets, you can increase the efficiency of the search for a particular data set by establishing a hierarchy of indexes in the catalog. For example, consider an installation that groups its data sets under four headings: ENGRNG, SCIENCE, ACCNTS, and INVNTRY, as shown in Figure 6-1. In turn, each of these groups is subdivided; for example, the SCIENCE group has subgroups called PHYSICS, CHEM, MATH, and BIOLOGY. The MATH subgroup itself contains three subgroups: ALGEBRA, CALCULUS, and BOOL.

    ENGRNG
    SCIENCE
        PHYSICS
        CHEM
        MATH
            ALGEBRA
            CALCULUS
            BOOL
        BIOLOGY
    ACCNTS
    INVNTRY

Figure 6-1. A hierarchy of indexes

To find the data set BOOL, the names of all the indexes of which it is part must be specified, beginning with the largest group SCIENCE, followed by the subgroup name MATH and finally the data set name BOOL. The names are separated by periods. The complete identification needed to find the data set BOOL is

SCIENCE.MATH.BOOL

Such an identifier is termed a qualified name. The maximum length of a qualified name is 44 characters, including the separating periods; each component name has a maximum length of eight characters. (Do ... statement that includes the keywords VTOC and SYS.)

... unique parenthesized generation number (for example, STOCK(0), STOCK(-1), STOCK(-2)).
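The naming rules and the index search described in this section can be sketched in a few lines of Python. This is only an illustrative model, not an operating system interface: the `CATALOG` dictionary, the volume serials in it, and the function names are all hypothetical, and "alphabetic" is assumed to mean the letters A to Z.

```python
MAX_QUALIFIED_LENGTH = 44   # includes the separating periods
MAX_COMPONENT_LENGTH = 8


def is_valid_component(name):
    """Check one simple name: 1 to 8 characters, the first alphabetic.

    Assumption: "alphabetic" means A-Z, and the remaining characters
    are not restricted further in this sketch.
    """
    return 1 <= len(name) <= MAX_COMPONENT_LENGTH and "A" <= name[0] <= "Z"


def is_valid_qualified_name(qualified):
    """Check a qualified name such as SCIENCE.MATH.BOOL."""
    if len(qualified) > MAX_QUALIFIED_LENGTH:
        return False
    return all(is_valid_component(part) for part in qualified.split("."))


# Hypothetical in-memory model of the index hierarchy in Figure 6-1.
# The volume serials are invented for illustration.
CATALOG = {
    "ENGRNG": {},
    "SCIENCE": {
        "PHYSICS": {},
        "CHEM": {},
        "MATH": {"ALGEBRA": "VOL001", "CALCULUS": "VOL001", "BOOL": "VOL002"},
        "BIOLOGY": {},
    },
    "ACCNTS": {},
    "INVNTRY": {},
}


def locate(qualified, catalog=CATALOG):
    """Walk the indexes from the largest group down to the data set."""
    if not is_valid_qualified_name(qualified):
        raise ValueError(f"invalid qualified name: {qualified!r}")
    node = catalog
    for part in qualified.split("."):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(qualified)
        node = node[part]
    return node


print(locate("SCIENCE.MATH.BOOL"))
```

Each component of the name selects one index level, so the search never has to scan groups that the name does not mention.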

The most recently cataloged data set is generation 0, and the preceding generations are -1, -2, and so on. You specify the number of generations to be saved when you establish the generation data group.
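The relative numbering of generations can be modeled with a bounded queue. The sketch below is a toy model under stated assumptions: the class `GenerationDataGroup`, its methods, and the sample contents are hypothetical and do not correspond to any operating system interface.

```python
from collections import deque


class GenerationDataGroup:
    """Toy model of a generation data group (not an operating system API)."""

    def __init__(self, name, generations_to_save):
        self.name = name
        # The newest generation is kept at index 0. Once the limit is
        # reached, the deque drops the oldest entry from the other end.
        self._generations = deque(maxlen=generations_to_save)

    def catalog(self, data_set):
        """Catalog a new data set; it becomes generation 0."""
        self._generations.appendleft(data_set)

    def get(self, relative_number):
        """Return the data set for generation 0, -1, -2, and so on."""
        if relative_number > 0 or -relative_number >= len(self._generations):
            raise LookupError(f"{self.name}({relative_number}) is not cataloged")
        return self._generations[-relative_number]


stock = GenerationDataGroup("STOCK", generations_to_save=3)
for contents in ("week 1", "week 2", "week 3", "week 4"):
    stock.catalog(contents)

print(stock.get(0), "|", stock.get(-1), "|", stock.get(-2))
```

With three generations saved, cataloging the fourth data set pushes the first one out of the group, so a request for generation -3 fails.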

For example, consider a generation data group that contains a series of data sets used for weather reporting and forecasting; the name of the data sets is WEATHER. The ... language and utilities publications.

BLOCKS AND RECORDS

The items of data in a data set are arranged in blocks separated by interblock gaps (IBG).¹ ... input/output operations required to process a data set. Records are blocked and deblocked automatically by the data management routines.

Specify the record length in the LRECL parameter of the DD statement or in the RECSIZE option of the ENVIRONMENT attribute.

Data Codes: The normal code in which data is recorded in System/360 is the Extended Binary Coded Decimal Interchange Code (EBCDIC), although source input can optionally be coded in BCD (Binary Coded Decimal). However, for magnetic tape only, System/360 will accept data recorded in the American Standard Code for Information Interchange (ASCII). Use the ASCII and BUFOFF options of the ENVIRONMENT attribute if you are reading or writing data sets recorded in ASCII. For the options used for ASCII data sets, see the language reference manual for this compiler.

1. Although the term "interrecord gap" is widely used in operating system manuals, it is not used here; it has been replaced by the more accurate term "interblock gap."

74 OS PL/I Optimizing Compiler: Programmer's Guide

RECORD FORMATS

The records in a data set must have one of the following formats:

• F   fixed length

• V   variable length (D- or V-format)

• U   undefined length

All formats can be blocked if required, but only fixed-length and variable-length records are deblocked automatically by the system; undefined-length records must be deblocked by your program.

Fixed-length Records (F-format Records)

You can specify the following formats for fixed-length records:

F     Fixed-length, unblocked
FB    Fixed-length, blocked
FS    Fixed-length, unblocked, standard
FBS   Fixed-length, blocked, standard

In a data set with fixed-length records, as shown in Figure 6-2, all records have the same length. If the records are blocked, each block contains an equal number of fixed-length records (although the last block may be truncated if there are insufficient records to fill it). If the records are unblocked, each record constitutes a block.
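Blocking and deblocking of fixed-length records can be illustrated with a short Python sketch. The function names and the sample records are hypothetical; in practice the data management routines do this work automatically.

```python
def block_records(records, lrecl, blocking_factor):
    """Group fixed-length records into blocks of `blocking_factor` records."""
    for record in records:
        if len(record) != lrecl:
            raise ValueError("every F-format record must have the same length")
    return [
        b"".join(records[i:i + blocking_factor])
        for i in range(0, len(records), blocking_factor)
    ]


def deblock(block, lrecl):
    """Split one block back into records using the constant record length."""
    if len(block) % lrecl != 0:
        raise ValueError("block length is not a multiple of the record length")
    return [block[i:i + lrecl] for i in range(0, len(block), lrecl)]


# Seven 8-byte records, blocked three to a block.
records = [f"REC{n:05d}".encode("ascii") for n in range(7)]
blocks = block_records(records, lrecl=8, blocking_factor=3)
print([len(b) for b in blocks])   # [24, 24, 8]
```

The last block is truncated because only one record remains, and deblocking needs nothing more than the constant record length.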

Unblocked records (F-format):

    | Record | IBG | Record | IBG | Record |

Blocked records (FB-format):

    |<---------- Block --------->|
    | Record | Record | Record | IBG | Record | ...

Figure 6-2. Fixed-length records

Because it can base blocking and deblocking on a constant record length, the operating system can process fixed-length records faster than variable-length records. The use of "standard" (FS-format and FBS-format) records further optimizes the sequential processing of a data set on a direct-access device. A standard format data set must contain fixed-length records and must have no embedded empty tracks or short blocks (apart from the last block).

With a standard format data set, the operating system can predict whether the next block of data will be on a new track and, if necessary, can select a new read/write head in anticipation of the transmission of that block. A PL/I program never places embedded short blocks in a data set with fixed-length records. A data set containing fixed-length records can be processed as a standard data set even if it is not created as such, providing it contains no embedded short blocks or empty tracks.

Variable-length Records (D- or V-format Records)

You can specify the following formats for variable-length records:

V     Variable-length, unblocked
VB    Variable-length, blocked
VS    Variable-length, unblocked, spanned
VBS   Variable-length, blocked, spanned
D     Variable-length, unblocked, ASCII
DB    Variable-length, blocked, ASCII

V-format permits both variable-length records and variable-length blocks. The first four bytes of each record and of each block contain control information for use by the operating system (including the length in bytes of the record or block).

Because of these control fields, variable-length records cannot be read backwards.

Illustrations of variable-length records are shown in Figure 6-3.

V-format signifies unblocked variable-length records. Each record is treated as a block containing only one record; the first four bytes of the block contain block control information, and the next four contain record control information.

VB-format signifies blocked variable-length records. Each block contains as many complete records as it can accommodate. The first four bytes of the block contain block control information, and the first four bytes of each record contain record control information.
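The role of the two control fields can be shown with a small Python sketch that builds and splits a VB-format block. The text states only that each four-byte field includes the length; the exact layout used here (a two-byte big-endian length that counts the control field itself, followed by two reserved zero bytes) is an assumption made for illustration, as are the function names.

```python
import struct

# Assumed layout of a control field: length (2 bytes), reserved (2 bytes).
CONTROL = struct.Struct(">HH")


def build_vb_block(records):
    """Build one VB-format block from a list of byte-string records."""
    body = b"".join(
        CONTROL.pack(CONTROL.size + len(data), 0) + data   # C2, then the record
        for data in records
    )
    return CONTROL.pack(CONTROL.size + len(body), 0) + body   # C1, then records


def split_vb_block(block):
    """Recover the records from a VB-format block."""
    block_length, _ = CONTROL.unpack_from(block, 0)
    records = []
    offset = CONTROL.size
    while offset < block_length:
        record_length, _ = CONTROL.unpack_from(block, offset)
        if record_length < CONTROL.size:
            raise ValueError("corrupt record control field")
        records.append(block[offset + CONTROL.size:offset + record_length])
        offset += record_length
    return records


block = build_vb_block([b"A", b"BCD", b"EFGHIJ"])
print(len(block), split_vb_block(block))
```

A V-format block is the special case in which the list holds exactly one record, so its first eight bytes are control information.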

Spanned Records: A spanned record is a variable-length record in which the length of the record can exceed the size of a block. If this occurs, the record is divided into segments and accommodated in two or more consecutive blocks by specifying the record format as either VS or VBS. Segmentation and reassembly are handled automatically. The use of spanned records allows you to select a block size, independently of record length, that will combine optimum use of auxiliary storage with maximum efficiency of transmission.

V-format:

    | C1 | C2 | Record 1 | IBG | C1 | C2 | Record 2 |

VB-format:

    | C1 | C2 | Record 1 | C2 | Record 2 | IBG | C1 | C2 | Record 3 |

VS-format:

    | C1 | C2 | Record 1 (entire) | IBG | C1 | C2 | Record 2 (first segment) | IBG | C1 | C2 | Record 2 (last segment) |

VBS-format:

    | C1 | C2 | Record 1 (entire) | C2 | Record 2 (first segment) | IBG | C1 | C2 | Record 2 (last segment) | C2 | Record 3 |

C1: Block control information
C2: Record or segment control information

Figure 6-3. Variable-length records

VS-format is similar to V-format. Each block contains only one record or segment of a record. The first four bytes of the block contain block control information, and the next four contain record or segment control information (including an indication of whether the record is complete or is a first, intermediate, or last segment).

With REGIONAL(3) organization, the use of VS-format removes the limitations on block size imposed by the physical characteristics of the direct-access device. If the record length exceeds the size of a track, or if there is no room left on the current track for the record, the record will be spanned over one or more tracks.

VBS-format differs from VS-format in that each block contains as many complete records or segments as it can accommodate; each block is, therefore, approximately the same size (although there can be a variation of up to four bytes, since each segment must contain at least one byte of data).
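The way VBS-format fills blocks, and why block lengths can vary by up to four bytes, can be seen in a simplified Python model. The function names and the representation of a block as a list of (kind, data) pairs are hypothetical; only the four-byte control fields and the rule that a segment holds at least one byte of data come from the text.

```python
CONTROL_BYTES = 4   # size of each control field (C1 or C2)


def span_records(records, blksize):
    """Pack records into VBS-style blocks, splitting them where necessary.

    Returns a list of blocks; each block is a list of (kind, data)
    segments, where kind is "complete", "first", "intermediate", or "last".
    """
    if blksize < 2 * CONTROL_BYTES + 1:
        raise ValueError("block too small for even a one-byte segment")
    blocks, current = [], []
    free = blksize - CONTROL_BYTES            # space left after C1
    for record in records:
        if not record:
            raise ValueError("empty records are not handled by this sketch")
        remaining, started = record, False
        while remaining:
            # A segment needs its C2 plus at least one byte of data.
            if free < CONTROL_BYTES + 1:
                blocks.append(current)
                current, free = [], blksize - CONTROL_BYTES
            chunk = remaining[:free - CONTROL_BYTES]
            remaining = remaining[len(chunk):]
            if not started:
                kind = "complete" if not remaining else "first"
            else:
                kind = "last" if not remaining else "intermediate"
            started = True
            current.append((kind, chunk))
            free -= CONTROL_BYTES + len(chunk)
    if current:
        blocks.append(current)
    return blocks


def block_length(block):
    """Length in bytes of a block, counting C1 and one C2 per segment."""
    return CONTROL_BYTES + sum(CONTROL_BYTES + len(data) for _, data in block)


def reassemble(blocks):
    """Rebuild the original records from their segments."""
    records, parts = [], []
    for block in blocks:
        for kind, data in block:
            parts.append(data)
            if kind in ("complete", "last"):
                records.append(b"".join(parts))
                parts = []
    return records


records = [b"A" * 10, b"B" * 50, b"C" * 7]
blocks = span_records(records, blksize=32)
print([block_length(b) for b in blocks])   # [32, 32, 32, 11]
```

A block is closed early only when fewer than five bytes remain, which is too little for a control field plus one byte of data; this is the source of the four-byte variation.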

ASCII Records: For data sets that are recorded in ASCII, use D-format as follows:

D-format records are similar to V-format except that the data they contain is recorded in ASCII.

DB-format records are similar to VB-format except that the data they contain is recorded in ASCII.
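The difference between the two data codes is easy to see by encoding the same characters both ways. In the Python sketch below, code page 037 is used as a stand-in for EBCDIC; that choice is an assumption, since EBCDIC exists in several code pages.

```python
text = "PL/I"

# Assumption: Python's "cp037" codec stands in for EBCDIC.
ebcdic_bytes = text.encode("cp037")
ascii_bytes = text.encode("ascii")

print(ebcdic_bytes.hex())   # d7d361c9
print(ascii_bytes.hex())    # 504c2f49

# Converting an EBCDIC record to ASCII means decoding and re-encoding.
converted = ebcdic_bytes.decode("cp037").encode("ascii")
print(converted == ascii_bytes)
```

The record length is unchanged by the conversion; only the bit patterns that represent each character differ.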

Undefined-length Records (U-format Records)

U-format permits the processing of records that do not conform to F- and V-formats. The operating system and the compiler treat each block as a record; your program must perform any required blocking or deblocking.

DATA SET ORGANIZATION

The data management routines of the ... corresponding keywords describing their PL/I organization¹ are given in Figure 6-4. ... corresponding PL/I organization.

In a sequential (or CONSECUTIVE) data set, ... direct-access devices. Paper tape, punched cards, and printed output are sequentially organized.

An indexed sequential (or INDEXED) data set must reside on a direct-access volume. Records are arranged in collating sequence, according to a key that is associated with every record. An index or set of indexes maintained by the operating system gives the location of certain principal records. This permits direct retrieval, replacement, addition, and deletion of records, as well as sequential processing.
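The idea of an index that locates certain principal records can be modeled in Python as follows. This is a toy model, not the operating system's implementation: the class and method names are hypothetical, and the choice of the highest key on each track as the "principal record" is an assumption made for the sketch.

```python
import bisect


class IndexedDataSet:
    """Toy model: records in key order, grouped into tracks, plus an index."""

    def __init__(self, records, records_per_track):
        ordered = sorted(records)             # collating sequence by key
        self.tracks = [
            ordered[i:i + records_per_track]
            for i in range(0, len(ordered), records_per_track)
        ]
        # Assumed index content: the highest key on each track.
        self.index = [track[-1][0] for track in self.tracks]

    def read(self, key):
        """Direct retrieval: consult the index, then search one track."""
        track_number = bisect.bisect_left(self.index, key)
        if track_number == len(self.tracks):
            raise KeyError(key)
        for record_key, data in self.tracks[track_number]:
            if record_key == key:
                return data
        raise KeyError(key)

    def read_all(self):
        """Sequential processing: every record in key order."""
        for track in self.tracks:
            yield from track


data_set = IndexedDataSet(
    [("D", "delta"), ("A", "alpha"), ("C", "gamma"), ("B", "beta"), ("E", "epsilon")],
    records_per_track=2,
)
print(data_set.read("C"))
print([key for key, _ in data_set.read_all()])
```

Because the index narrows a search to a single track, a record can be retrieved directly without reading the records that precede it, while sequential processing simply follows the tracks in order.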

1 Do not confuse the terms "sequential" and "direct" with the PL/I file attributes SEQUENTIAL and DIRECT. The attributes refer to how the file is to be processed, ... for REGIONAL(2) and REGIONAL(3), identifies the record, permits direct access to any record; sequential processing is also possible.

A teleprocessing data set (associated with a TRANSIENT file in a PL/I program) ...

... groups of sequentially organized data, each called a member, reside on a direct-access volume. The data set includes a directory that lists the location of each member.

Partitioned data sets are often called libraries. The compiler includes no special facilities for creating and accessing partitioned data sets; however, this is not necessary since each member can be processed as a CONSECUTIVE data set by a PL/I program, and there is ready access to the operating system facilities for partitioned data sets through job control language. The use of partitioned data sets as libraries is described in Chapter 10.

LABELS

The operating system uses labels to identify magnetic-tape and direct-access volumes and the data sets they contain, and to store data set attributes (for example, record length and block size). The attribute information must originally come from a DD statement or from your program. Once the label is written you need not specify the information again.

Magnetic-tape volumes can have standard or nonstandard labels, or they can be unlabeled. ... labels contain system information, device-dependent information (for example, recording technique), and data set characteristics. Trailer labels are almost identical with header labels, and are used when magnetic tape is read backwards.

Direct-access volumes have standard labels. Each volume is identified by a
