Raster scanning 8 - Digital Video and HD

I introduced the pixel array on page 3. This chapter outlines the basics of this process of raster scanning, whereby the samples of the pixel array are sequenced uniformly in time to form scan lines, which are in turn sequenced in time throughout each frame interval. In Chapter 13, Introduction to component SD, on

page 129, I will present details on scanning in conven-tional “525-line” and “625-line” video. In Introduction to composite NTSC and PAL, on page 135, I will intro-duce the colour coding used in these systems. In Chapter 15, Introduction to HD, on page 141, I will introduce scanning in high-definition television.

Flicker, refresh rate, and frame rate

A sequence of still pictures, captured and displayed at a sufficiently high rate, can create the illusion of motion.

The historical CRT display used for television emits light for a small fraction of the frame time: The display has ashort duty cycle; it is black most of the time. If the flash rate– or refresh rate– is too low, flicker is

perceived. The flicker sensitivity of vision is dependent upon display and viewing conditions: The brighter the environment, and the larger the angle subtended by the picture, the higher the flash rate must be to avoid flicker. Because picture angle influences flicker, flicker depends upon viewing distance.

Most modern displays – including LCDs and plasma displays – do not flash, and cannot flicker. Nonetheless, they may be subject to various motion impairments.

In a “flashing” display, the brightness of the displayed image itself influences the flicker threshold to some

Flicker is sometimes redundantly called large-area flicker. Take care to distinguish flicker, described here, from twitter, to be described on page 89. See Fukuda, Tadahiko (1987), “Some Characteristics of Peripheral Vision,” NHK Tech. Mono-graph No. 36 (Tokyo: NHK Science and Technical Research Laboratories).

extent, so the brighter the image the higher the refresh rate must be. In a very dark environment, such as the cinema, flicker sensitivity is completely determined by the luminance of the image itself. Peripheral vision has higher temporal sensitivity than central (foveal) vision, so the flicker threshold increases to some extent with wider viewing angles. Table 8.1 summarizes refresh rates used in film, video, and computing.

In the darkness of a cinema, a flash rate of 48 Hz is sufficient to overcome flicker. In the early days of motion pictures, 24 frames per second were found to be sufficient for good motion portrayal. So, a conven-tional film projector uses a dual-bladed shutter,

depicted in Figure 8.1, to flash each frame twice. Higher realism can be obtained with material at 48 frames per second or higher displayed with single-bladed shutters, but such schemes are nonstandard.

Television refresh rates were originally chosen to match the local AC power line frequency.

See Frame, field, line, and sample rates, on page 389.

In the dim viewing environment typical of television, such as a living room, a flash rate of 60 Hz suffices. The interlace technique, to be described on page 88, provides for video a function comparable to the dual-bladed shutter of a film projector: Each frame is flashed as two fields. Refresh is established by the field rate (twice the frame rate). For a given data rate, interlace doubles the apparent flash rate, and provides improved motion portrayal by doubling the temporal sampling rate. Scanning without interlace is called progressive.

Farrell, Joyce E., et al. (1987),

“Predicting flicker thresholds for video display terminals,” in Proc.

Society for Information Display 28 (4): 449–453.

CRT computer displays used in office environments historically required refresh rates above 70 Hz to over-come flicker (see Farrell, 1987). CRTs have now been supplanted by LCD displays, which don’t flicker. High refresh rates are no longer needed to avoid flicker.

The fovea has a diameter of about 1.5 mm, and subtends a visual angle of about 5°.

Office 320 nt “Average”

~20%

various, e.g., 66, 72, 76

same as refresh rate Table 8.1 Refresh rate refers to the shortest interval over which the entire picture is updated. Flash rate refers to the rate at which the picture height is covered at the display. Different refresh rates and flash rates are used in different applications.

Figure 8.1 A dual-bladed shutter in a film projector flashes each frame twice. Rarely, three bladed shutters are used;

they flash each frame thrice.

Introduction to scanning

A moment ago, I outlined how refresh rate for television was chosen so as to minimize flicker. In Viewing distance and angle, on page 100, I will outline how spatial sampling determines the number of pixels in the image array. Video scanning represents pixels in sequential order, so as to acquire, convey, process, or display every pixel during the fixed time interval associated with each frame. In analog video, information in the image plane was scanned left to right at a uniform rate during a fixed, short interval of time – the active line time.

Scanning established a fixed relationship between a position in the image and a time instant in the signal.

Successive lines were scanned at a uniform rate from the top to the bottom of the image, so there was also a fixed relationship between vertical position and time.

The stationary pattern of parallel scanning lines disposed across the image area is the raster.

Line is a heavily overloaded term.

Lines may refer to the total number of raster lines: Figure 8.2 shows

“525-line” video, which has 525 total lines. Line may refer to a line containing picture, or to the total number of lines containing picture – in this example, 480. Line may denote the AC power line, whose frequency is very closely related to vertical scanning. Finally, lines is a measure of resolution, to be described in Resolution, on page 97.

Samples of a digital image matrix are usually

conveyed in the same order that the image was histori-cally conveyed in analog video: first the top image row (left to right), then successive rows. Scan line is an old-fashioned term; the term image row is now preferred.

Successive pixels lie in image columns.

In cameras and displays, a certain time interval is consumed in advancing the scanning operation– histor-ically, horizontal retracing– from one line to the next;

several line times are consumed by vertical retrace, from the bottom of one scan to the top of the next. A CRT’s electron gun had to be switched off (blanked) during these intervals, so these intervals were (and are) called blanking intervals. The horizontal blanking interval occurs between scan lines; the vertical blanking interval (VBI) occurs between frames (or fields). Figure 8.2 shows the blanking intervals of “525-line” video.

The word raster is derived from the Greek word rustum (rake), owing to the resemblance of a raster to the pattern left on newly raked earth.

VERTICAL

Figure 8.2 Blanking intervals for “525-line”video are indi-cated here by a dark region surrounding a light-shaded rectangle that represents the picture. The vertical blanking interval (VBI) consumes about 8% of each field time; hori-zontal blanking consumes about 15% of each line time.

The horizontal and vertical blanking intervals required for a CRT were large fractions of the line time and frame time: In SD, vertical blanking consumes roughly 8% of each frame period. In HD, that fraction is reduced to about 4%.

In a video interface, whether analog or digital, synchronization information (sync) is conveyed during the blanking intervals. In principle, a digital video inter-face transmit just the active pixels accompanied by the minimum necessary sync information. Instead, digital video interfaces use interface rates that match the blanking intervals of historical analog equipment. What would otherwise be excess data capacity is put to good use conveying audio signals, captions, test signals, error detection or correction information, or other data or metadata.

Scanning parameters

In progressive (or sequential) scanning, the image rows are scanned in order, from top to bottom, at a picture rate sufficient to portray motion. Figure 8.3 at the top of the facing page indicates four basic scanning parameters:

• Total lines (L_T) comprises all of the scan lines, including the vertical blanking interval and the picture lines.

• The image has N_R image rows, containing the picture.

(Historically, this was the active line count, L_A.) • Samples per total line (S_TL) comprises the sample intervals in the total line, including horizontal blanking.

• The image has N_C image columns, containing the picture. (Historically, some number of samples per active line (S_AL) were permitted to take values different from the blanking level.

The production aperture, sketched in Figure 8.3, comprises the array N_C columns by N_R rows. The samples in the production aperture comprise the pixel array; they are active. All other sample intervals comprise blanking; they are inactive or “blanked.”

VITS: Vertical interval test signal VITC: Vertical interval timecode

The vertical blanking interval in analog signals typi-cally carried vertical interval information such as VITS, VITC, and closed captions. Consumer display equip-ment must blank these lines. Both vertical and hori-zontal blanking intervals in a digital video interface may be used to convey ancillary (ANC) data such as audio.

The count of 480 picture lines in Figure 8.2 is a recent standard;

some people will quote numbers between 481 and 487. See Picture lines, on page 379.

All standard SD and HD image formats have N_C and N_R both even.

The horizontal center of the picture lies midway between the central two luma samples, and the vertical center of the picture lies vertically midway between the central two image rows.

Only pixels in the image array are represented in acquisition, processing, storage, and display. However, some processing operations (such as spatial filtering) use information in a small neighbourhood surrounding the subject pixel. In the absence of any better informa-tion, we take the pixel of the image array to lie on black. At the left-hand edge of the picture, if the video signal of the leftmost pixel has a value greatly different from black, an artifact called ringing is liable to result when that transition is processed through an analog or digital filter. A similar circumstance arises at the right-hand picture edge. In studio video, the signal builds to full amplitude, or decays to blanking level, over several transition samples ideally having a raised cosine

envelope.

See Transition samples, on page 378. Active samples encompass not only the picture, but also the transition samples; see Figure 8.4 above.

Studio equipment should maintain the widest picture possible within the production aperture, subject to appropriate blanking transitions.

I have treated the image array as an array of pixels, without regard for the spatial distribution of light

L_T N_R Figure 8.3 The Production

aper-ture comprises the image array, N_C columns by N_R rows.

Blanking intervals lie outside the production aperture; here, blanking intervals are darkly shaded. The product of N_C and N_R yields the active pixel count per frame. Sampling rate (f_S) is Figure 8.4 The Clean aperture

should remain subjectively free from artifacts arising from filtering. The clean aperture excludes blanking transition samples, indicated here by black bands outside the left and right edges of the picture width, defined by the count of samples per picture width (S_PW).

power across each pixel – the pixel’s spot profile, or more technically, point spread function (PSF). If the spot profile is such that there is a significant gap between the intensity distributions of adjacent image rows (scan lines), then image structure will be visible to viewers closer than a certain distance. The gap between scan lines is a function of image row (scan-line) pitch and spot profile. Spot size was historically characterized by spot diameter at 50% power. For a given image row pitch, a smaller spot size will force viewers to be more distant from the display if scan lines are to be rendered invisible.

Interlaced format

Interlacing is a scheme which – for given viewing distance, flicker sensitivity, and data rate – offered some increase in static spatial resolution over progressive scanning in historical CRT displays, which exhibited flicker. The full height of the image is scanned leaving gaps in the vertical direction. Then, ¹⁄₅₀ or ¹⁄₆₀ s later, the full image height is scanned again, but offset verti-cally so as to fill in the gaps. A frame thereby comprises two fields, denoted first and second. The scanning mechanism is depicted in Figure 8.5. Historically, the same scanning standard was used across an entire tele-vision system, so interlace was used not only for display but for the whole chain, including acquisition,

recording, processing, distribution, and transmission.

Noninterlaced (progressive) scanning is universal in desktop computers and in computing; also, progressive scanning has been introduced for digital television and

To refer to fields as odd and even invites confusion. Use first field and second field instead. Some people refer to scanning first the odd lines then the even; however, scan lines in interlaced video were historically numbered in temporal order, not spatial order: Scan lines are not numbered as if they were rows in the frame’s image matrix.

Confusion on this point among computer engineers – and confu-sion regarding top and bottom fields– has led to lots of improp-erly encoded video where the top and bottom offsets are wrong.

Figure 8.5 Interlaced format represents a complete picture – the frame– from two fields, each containing half of the total number of image rows. The second field is delayed by half the frame time from the first. This example shows 10 image rows.

In analog scanning, interlace is effected by having an odd number of total scan lines (e.,g., 525, 625, or 1125).

HD. However, the interlace technique remains universal in SD, and is widely used in broadcast HD. Interlace-to-progressive (I-P) conversion, also called deinterlacing, is an unfortunate but necessary by-product of interlaced scanning.

CRTs are now obsolete. The dominant display tech-nologies now used for video – LCD and plasma panels – have relatively long duty cycles, and they don’t flicker.

The raison d’être for interlace has vanished. Nonethe-less, interlace remains in wide use.

Twitter

The flicker susceptibility of vision stems from a wide-area effect: In a display such as a CRT that flashes, as long as the complete height of the picture is flashed sufficiently rapidly to overcome flicker, small-scale picture detail, such as that in the alternate lines, can be transmitted at a lower rate. With progressive scanning, scan-line visibility limits the reduction of spot size. With interlaced scanning, this constraint is relaxed by a factor of two. However, interlace introduced a new constraint, that of twitter.

If an image has vertical detail at a scale comparable to the image row pitch– for example, if the fine pattern of horizontal line pairs in Figure 8.6 is scanned– then interlaced display causes the content of the first and the second fields to differ markedly. At usual field rates – 50 or 60 Hz – this causes twitter, a small-scale phenomenon that is perceived as a scintillation, or an extremely rapid up-and-down motion. If such image information occu-pies a large area, then flicker is perceived instead of twitter. Twitter is sometimes called interline flicker;

however, flicker is by definition a wide-area effect, so interline flicker is a poor term.

Twitter is produced not only from degenerate images such as the fine black-and-white lines of Figure 8.6, but also from high-contrast vertical detail in ordinary images. High-quality video cameras include optical spatial lowpass filters to attenuate vertical detail that would otherwise be liable to produce twitter. When computer-generated imagery (CGI) is interlaced, vertical detail must be filtered in order to avoid flicker. Signal processing to accomplish this is called atwitter filter.

FIRST FIELD

SCANNING TEST SCENE

SECOND FIELDImage row pitch

Figure 8.6 Twitter would result if this scene were scanned at the indicated line pitch by a camera without vertical filtering, then displayed using interlace on a short duty cycle display such as a CRT.

Interlace in analog systems

In analog video, interlace was historically achieved by scanning vertically at a constant rate, typically 50 or 60 Hz, and scanning horizontally at an odd multiple of half that rate. In SD in North America and Japan, the field rate is 59.94 Hz; the line rate (f_H) is ⁵²⁵⁄₂ (262¹⁄₂) times that rate. In Asia, Australia, and Europe, the field rate is 50 Hz; the line rate is ⁶²⁵⁄₂ (312¹⁄₂) times that rate.

Figure 8.7 shows the horizontal drive (HD) and vertical drive (VD) pulse signals that were once distributed in the studio to cause interlaced scanning in analog equip-ment. These signals have been superseded by a com-bined sync (or composite sync) signal; vertical scanning is triggered by broad pulses having total duration of 2¹⁄₂ or 3 lines. Sync is usually imposed onto the video signal, to avoid separate distribution circuits. Analog sync is coded at a level “blacker than black.”

Interlace and progressive

For a given viewing distance, sharpness is improved as spot size becomes smaller. However, if spot size is reduced beyond a certain point, depending upon the spot profile of the display, either scan lines or pixels will become visible, or aliasing will intrude. In principle, improvements in bandwidth or spot profile reduce potential viewing distance, enabling a wider picture angle. However, because consumers form expectations about viewing distance, we assume a constant viewing distance and say that resolution is improved instead.

A rough conceptual comparison of progressive and interlaced scanning is presented in Figure 8.8 at the top of the facing page. At first glance, an interlaced system offers twice the number of pixels– loosely, twice the

0_V Figure 8.7 Horizontal and vertical drive pulses histori-cally effected interlace in analog scanning. 0_V denotes the start of each field. The halfline offset of the second 0_V causes interlace. Here, 576i scanning is shown.

Details are presented in Chapter 2, Analog SD sync, genlock, and inter-face, in Composite NTSC and PAL:

Legacy Video Systems, and in the first edition of the present book.

We’ll take up resolution in inter-laced systems on Interlace revisited, on page 105.

spatial resolution– as a progressive system with the same data capacity and the same frame rate. Owing to twitter, spatial resolution in a practical interlaced system is not double that of a progressive system at the same data rate. Historically, cameras have been

designed to avoid producing so much vertical detail that twitter would be objectionable. However, resolu-tion is increased by a factor large enough that interlace has historically been considered worthwhile. The improvement comes at the expense of introducing some aliasing and some vertical motion artifacts. Also, interlace makes it difficult to process motion sequences, as will be explained on page 93.

Examine the interlaced (bottom) portion of Figure 8.8, and imagine an image element moving slowly down the picture at a rate of one row of the

STATIC DYNAMIC

tn (¹/60 s) tn+1 (¹/60 s)

ProgressiveInterlaced

2 1

2 3 1 0

2 3 1

4 3

Line number (first field) Line number (second field)

Image rowImage row

Figure 8.8 Progressive and interlaced scanning are compared. The top left sketch depicts an image of 4×3 pixels transmitted during an interval of ¹⁄60 s. The top center sketch shows image data from the same 12 locations transmitted in the following ¹⁄60 s interval. The top right sketch. shows the spatial arrangement of the 4×3 image, totalling 12 pixels; the data rate is 12 pixels per ¹⁄60 s. At the bottom left, 12 pixels comprising image rows 0 and 2 of a 6×4 image array are transmitted in 1⁄60 s. At the bottom center, the 12 pixels of image rows 1 and 3 are transmitted in the following

Dans le document Digital Video and HD (Page 124-148)