Image structure 7 - Digital Video and HD

A naïve approach to digital imaging treats an image as a matrix of independent pixels, disregarding the spatial distribution of light power across each pixel. You might think that optimum image quality is obtained when there is no overlap between the distributions of neighboring pixels; many computer engineers hold this view.

However, continuous-tone images are best reproduced with a certain degree of overlap between pixels; sharpness is reduced slightly, but pixel structure is made less visible and image quality is improved.

Don’t confuse point spread function (PSF) with progressive segmented-frame (PsF), to be described on page 94.

The distribution of intensity across a displayed pixel is referred to as its point spread function (PSF). A one-dimensional slice through the center of a PSF is colloquially called a spot profile. A display’s PSF influences the nature of the images it reproduces. The effects of a PSF can be analyzed using filter theory, discussed for one dimension in the chapter Filtering and sampling, on page 191, and for two dimensions in Image digitization and reconstruction, on page 237.

Historically, the PSFs of greyscale (“black-and-white”) CRTs were roughly Gaussian in shape: Intensity distribution peaked at the center of the pixel, fell off over a small distance, and overlapped neighboring pixels to some extent. The scanning spot of colour CRTs had this shape, too; but the PSF was influenced by the shadow mask or aperture grille. The introduction of direct-view colour CRTs shifted the requirement for spatial filtering to the viewer: The assumption was introduced that the viewers were sufficiently distant from the screens that the viewers’ visual systems would perform the spatial integration necessary to obscure the triad structure.

Modern direct-view fixed-pixel displays (FPDs) such as LCD and PDP displays have more or less uniform light emission over most of the area corresponding to each colour component (subpixel); their modulated light has a spatial structure comparable to that of a direct-view colour CRT, and similarly depends upon the viewers being located at a sufficient distance that their visual characteristics perform the spatial integration necessary to obscure the triad structure.

A pixel whose intensity distribution uniformly covers a small square area of the screen has a point spread function referred to as a “box.”

Image reconstruction

Figure 7.1 reproduces a portion of an idealized bitmapped (bilevel) graphic image, part of a computer’s desktop display. Each sample is either black or white. The element with horizontal “stripes” is part of a window’s titlebar; the checkerboard background is intended to integrate to grey. Figure 7.1 shows reconstruction of the image with a “box” distribution. Each pixel is uniformly shaded across its extent; there is no overlap between pixels. This figure exemplifies a raster-locked image as displayed on an LCD. By raster-locked, I refer to image data having the underlying image elements aligned with the pixel array.

A CRT’s electron gun produces an electron beam that illuminates a spot on the phosphor screen. The beam is deflected to form a raster pattern of scan lines that traces the entire screen, as I will describe in the following chapter. The beam is not perfectly focused when it is emitted from the CRT’s electron gun, and is dispersed further in transit to the phosphor screen. The intensity produced for each pixel at the face of the screen has a “bell-shaped” distribution resembling a two-dimensional Gaussian function. With a typical amount of spot overlap, the checkerboard area of this example will display as a nearly uniform grey as depicted in Figure 7.2. You might think that the blur caused by overlap between pixels would diminish image quality. However, for continuous-tone (“contone”) images, some degree of overlap is not only desirable but necessary, as you will see from the following examples.
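The effect of spot overlap on the checkerboard can be estimated numerically. The following Python sketch is an illustration under stated assumptions, not a model of any particular display: it works in one dimension, takes pixel values alternating 1, 0, 1, 0, and compares a box profile with unit-area Gaussian profiles. The `sigma` values (0.5 and 1.0 pixel pitch) are arbitrary choices for the example, and all function names are hypothetical.

```python
import math

def box(x):
    """Box spot profile: uniform over one pixel pitch, zero elsewhere."""
    return 1.0 if -0.5 <= x < 0.5 else 0.0

def gaussian(x, sigma):
    """Unit-area Gaussian spot profile; sigma is in units of pixel pitch (assumed)."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def reconstruct(x, profile, span=12):
    """Displayed intensity at position x for pixel values alternating 1, 0, 1, 0."""
    # Only even-numbered pixels are "on"; each contributes one copy of the profile.
    return sum(profile(x - k) for k in range(-span, span + 1) if k % 2 == 0)

def modulation(profile):
    """Contrast between the centre of an "on" pixel and the centre of an "off" pixel."""
    bright = reconstruct(0.0, profile)
    dark = reconstruct(1.0, profile)
    return (bright - dark) / (bright + dark)

box_contrast = modulation(box)
narrow_contrast = modulation(lambda x: gaussian(x, 0.5))
wide_contrast = modulation(lambda x: gaussian(x, 1.0))
print(box_contrast, narrow_contrast, wide_contrast)
```

With these assumed values, the box profile leaves the alternation at full contrast, a Gaussian with sigma of half a pixel pitch reduces it to roughly 57%, and a sigma of one pixel pitch reduces it to under 2%, which would read as nearly uniform grey.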

Figure 7.1 “Box” reconstruction of a bitmapped graphic image is shown.

Figure 7.2 Gaussian reconstruction is shown for the same bitmapped image as Figure 7.1. I will detail the one-dimensional Gaussian function on page 200.

Figure 7.3 shows a 16×20-pixel image of a dark line slightly more than one pixel wide, 7.2° off the vertical.

At the left, the image data is reconstructed using a box distribution; a jagged and “ropey” nature is evident. At the right, the image data is reconstructed using a Gaussian. It is blurry, but less jagged.

Figure 7.4 shows two ways to reconstruct the same 16×20 pixels (320 bytes) of continuous-tone greyscale image data. The left-hand image is reconstructed using a box function, and the right-hand image with a Gaussian. The example was constructed so that each image is 4 cm (1.6 inches) wide. At typical reading distance of 40 cm (16 inches), a pixel subtends 0.4°, where visual acuity is near its maximum. At this distance, when reconstructed with a box function, the pixel structure of each image is highly visible; visibility of the pixel structure overwhelms the perception of the image itself. The right image is reconstructed using a Gaussian distribution. It is blurry, but easily recognizable as an American cultural icon. This example shows that sharpness is not always good, and blurriness is not always bad!

Figure 7.3 Diagonal line reconstruction. At the left is a near-vertical line slightly more than 1 pixel wide, rendered as an array 20 pixels high that has been reconstructed using a box distribution. At the right, the line is reconstructed using a Gaussian distribution. Between the images I have placed a set of markers to indicate the vertical centers of the image rows.

Figure 7.4 Contone image reconstruction. At the left is a continuous-tone image of 16×20 pixels that has been reconstructed using a box distribution. The pictured individual cannot be recognized. At the right is exactly the same image data, but reconstructed by a Gaussian function. The reconstructed image is very blurry but recognizable. Which reconstruction function do you think is best for continuous-tone imaging?

Visual acuity is detailed in Contrast sensitivity function (CSF), on page 251.
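The pixel subtense quoted for Figure 7.4 follows from simple geometry. The sketch below assumes a flat image viewed on axis; the function name `pixel_subtense_deg` is hypothetical and is used only for this illustration.

```python
import math

def pixel_subtense_deg(image_width_mm, pixels_across, viewing_distance_mm):
    """Angle in degrees subtended by one pixel of a flat image viewed on axis."""
    pitch = image_width_mm / pixels_across  # pixel pitch in millimetres
    return math.degrees(2.0 * math.atan(pitch / (2.0 * viewing_distance_mm)))

# A 4 cm wide image, 16 pixels across, viewed from 40 cm.
angle = pixel_subtense_deg(40.0, 16, 400.0)
print(round(angle, 3))  # about 0.358 degrees, which rounds to the 0.4 degrees quoted
```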

Figure 7.5 in the margin shows a 16×20-pixel image comprising 20 copies of the top row of Figure 7.3 (left).

Consider a sequence of 20 animated frames, where each frame is formed from successive image rows of Figure 7.3. The animation would depict a narrow vertical line drifting rightward across the screen at a rate of 1 pixel every 8 frames. If image rows of Figure 7.3 (left) were used, the width of the moving line would appear to jitter frame-to-frame, and the minimum lightness would vary. With Gaussian reconstruction, as in Figure 7.3 (right), motion portrayal is much smoother.
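The jitter described above can be reproduced with a small calculation. This sketch is only an approximation of the example: it assumes a dark line 1.2 pixels wide on a white field, sampled by averaging over a one-pixel box aperture, drifting 1/8 pixel per frame. The source does not state how Figure 7.3 was generated, so the width and the sampling method are assumptions, and the function names are hypothetical.

```python
def row_samples(center, width=1.2, n_pixels=16):
    """Lightness per pixel for a dark line on white, averaged over each pixel."""
    left, right = center - width / 2.0, center + width / 2.0
    samples = []
    for i in range(n_pixels):
        # Length of the line's cross-section that falls inside pixel i.
        overlap = max(0.0, min(right, i + 0.5) - max(left, i - 0.5))
        samples.append(1.0 - overlap)
    return samples

def frame_stats(center):
    """Return (number of darkened pixels, minimum lightness) for one frame."""
    row = row_samples(center)
    darkened = sum(1 for v in row if v < 1.0)
    return darkened, min(row)

# Twenty frames; the line centre advances 1 pixel every 8 frames.
stats = [frame_stats(6.0 + frame / 8.0) for frame in range(20)]
for frame, (darkened, lightest) in enumerate(stats):
    print(frame, darkened, round(lightest, 3))
```

Under these assumptions the line occupies three pixels with a fully dark core when it is centred on a pixel, and two pixels with a minimum lightness of about 0.4 when it straddles a pixel boundary. With box reconstruction those differences are displayed directly, which is the frame-to-frame jitter in width and lightness.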

Sampling aperture

In a practical image sensor, each element acquires information from a finite region of the image plane; the value of each pixel is a function of the distribution of intensity over that region. The distribution of sensitivity across a pixel of an image capture device is referred to as its sampling aperture, sort of a PSF in reverse – you could call it a point “collection” function.

The sampling aperture influences the nature of the image signal originated by a sensor. Sampling apertures used in continuous-tone imaging systems usually peak at the center of each pixel, fall off over a small distance, and overlap neighboring pixels to some extent.

In 1928, Harry Nyquist published a landmark paper stating that a sampled analog signal cannot be reconstructed accurately unless all of its frequency components are contained strictly within half the sampling frequency. This condition subsequently became known as the Nyquist criterion; half the sampling rate became known as the Nyquist rate. Nyquist developed his theorem for one-dimensional signals, but it has been extended to two dimensions. In a digital system, it takes at least two elements – two pixels or two scanning lines – to represent a cycle. A cycle is equivalent to a line pair of film, or two “TV lines” (TVL).
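A short numerical check illustrates why components above half the sampling frequency cannot be recovered. In this sketch (the function name and the two frequencies are chosen only for illustration), a cosine at 0.7 cycles per sample yields exactly the same sample values as a cosine at 0.3 cycles per sample, so after sampling the two cannot be told apart.

```python
import math

def sample_cosine(cycles_per_sample, n_samples):
    """Point-sample a unit-amplitude cosine at integer sample positions."""
    return [math.cos(2.0 * math.pi * cycles_per_sample * n) for n in range(n_samples)]

below = sample_cosine(0.3, 16)  # within half the sampling frequency
above = sample_cosine(0.7, 16)  # beyond half the sampling frequency

# The two sample sequences agree to within floating-point rounding.
indistinguishable = all(abs(a - b) < 1e-9 for a, b in zip(below, above))
print(indistinguishable)
```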

In Figure 7.6, the black square punctured by a regular array of holes represents a grid of small sampling apertures. Behind the sampling grid is a set of a dozen black bars, tilted 14° off the vertical, representing image information. In the region where the image is sampled, you can see three wide dark bars tilted at 45°. Those bars represent spatial aliases that arise because the number of bars per inch (or mm) in the image is greater than half the number of apertures per inch (or mm) in the sampling lattice. Aliasing can be prevented – or at least minimized – by imposing a spatial filter in front of the sampling process, as I will describe for one-dimensional signals in Filtering and sampling, on page 191, and for two dimensions in Image presampling filters, on page 242.

Figure 7.5 One frame of an animated sequence, reconstructed with a “box” filter.

Figure 7.6 A Moiré pattern is a form of aliasing in two dimensions that results when a sampling pattern (here the perforated square) has a sampling density that is too low for the image content (here the dozen bars, 14° off-vertical). This figure is adapted from Fig. 3.12 of Wandell’s Foundations of Vision (cited on page 195).

Nyquist explained that an arbitrary signal can be reconstructed accurately only if more than two samples are taken of the highest-frequency component of the signal. Applied to an image, there must be at least twice as many samples per unit distance as there are image elements. The checkerboard pattern in Figure 7.1 (on page 76) doesn’t meet this criterion in either the vertical or horizontal dimensions. Furthermore, the titlebar element doesn’t meet the criterion vertically.

Such elements can be represented in a bilevel image only when they are in precise registration – “locked” – to the imaging system’s sampling grid. However, images captured from reality almost never have their elements precisely aligned with the grid!
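The dependence on registration can be shown with a one-dimensional version of the checkerboard. This sketch assumes an alternating pattern with a period of two pixels, captured by averaging over a one-pixel-wide box aperture whose centre is displaced from the pattern by `offset` pixels. The function names and the numerical averaging are illustrative choices, not a description of any particular sensor.

```python
import math

def stripe(x):
    """Alternating pattern, period two pixels: 1.0 (white) or 0.0 (black)."""
    return 1.0 if math.floor(x + 0.5) % 2 == 0 else 0.0

def aperture_sample(i, offset, steps=1000):
    """Average of the pattern over a one-pixel box aperture centred at i + offset."""
    start = i + offset - 0.5
    return sum(stripe(start + (s + 0.5) / steps) for s in range(steps)) / steps

def sampled_contrast(offset):
    """Difference between two adjacent samples for a given misregistration."""
    return abs(aperture_sample(0, offset) - aperture_sample(1, offset))

locked = sampled_contrast(0.0)        # pattern aligned with the sampling grid
quarter = sampled_contrast(0.25)      # displaced by a quarter pixel
half = sampled_contrast(0.5)          # displaced by half a pixel
print(locked, quarter, half)
```

Under these assumptions the alternation is captured at full contrast only when it is locked to the grid; a quarter-pixel displacement halves the contrast, and a half-pixel displacement makes every sample mid-grey.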

Point sampling refers to capture with an infinitesimal sampling aperture. This is undesirable in continuous-tone imaging. Figure 7.7 shows what would happen if a physical scene like that in Figure 7.1 were rotated 14°, captured with a point-sampled camera, and displayed with a box distribution. The alternating on-off elements are rendered with aliasing in both the checkerboard portion and the titlebar. (Aliasing would be evident even if this image were to be reconstructed with a Gaussian.) This example emphasizes that in digital imaging, we must represent arbitrary scenes, not just scenes whose elements have an intimate relationship with the sampling grid.

A suitable presampling filter would prevent (or at least minimize) the Moiré artifact of Figure 7.6, and prevent or minimize the aliasing of Figure 7.7. When image content such as the example titlebar and the desktop pattern of Figure 7.2 is presented to a presampling filter, blurring will occur. Considering only bitmapped images such as Figure 7.1, you might think the blurring to be detrimental, but to avoid spatial aliasing in capturing high-quality continuous-tone imagery, some overlap is necessary in the distribution of sensitivity across neighboring sensor elements.

Figure 7.7 Bitmapped graphic image, rotated.
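How much a sampling aperture suppresses content that would otherwise alias can be estimated from its frequency response. The sketch below uses the standard one-dimensional responses of a box and of a Gaussian; the aperture sizes (a box one pixel wide, a Gaussian with sigma of half a pixel) and the two test frequencies are assumptions made for illustration, and the function names are hypothetical. A point-sampling aperture, by comparison, passes every frequency at full amplitude.

```python
import math

def box_aperture_response(f, width=1.0):
    """Amplitude response of a box aperture `width` pixels wide at f cycles/pixel."""
    if f == 0.0:
        return 1.0
    x = math.pi * f * width
    return math.sin(x) / x

def gaussian_aperture_response(f, sigma=0.5):
    """Amplitude response of a Gaussian aperture; sigma in pixels (assumed value)."""
    return math.exp(-2.0 * (math.pi * f * sigma) ** 2)

in_band = 0.1       # cycles per pixel, well below half the sampling frequency
out_of_band = 0.7   # cycles per pixel, above half the sampling frequency

for name, response in (("box", box_aperture_response),
                       ("gaussian", gaussian_aperture_response)):
    print(name, round(response(in_band), 3), round(response(out_of_band), 3))
```

With these assumed apertures, the in-band component is passed at about 98% (box) and 95% (Gaussian) of its amplitude, while the component that would alias is reduced to about 37% and 9% respectively. The price of the overlap is the slight in-band loss, which is the blurring mentioned above.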

Having introduced the aliasing artifact that results from poor capture PSFs, we can now return to the display and discuss reconstruction PSFs (spot profiles).

Spot profile

The designer of a display system for continuous-tone images seeks to make a display that allows viewing at a wide picture angle, with minimal intrusion of artifacts such as aliasing or visible scan-line or pixel structure.

Picture size, viewing distance, spot profile, and scan-line or pixel visibility all interact. The display system designer cannot exert direct control over viewing distance; spot profile is the parameter available for optimization.

On page 77, I demonstrated the difference between a box profile and a Gaussian profile. Figures 7.3 and 7.4 showed that some overlap between neighboring distributions is desirable, even though blur is evident when the reproduced image is viewed closely.

When the images of Figure 7.3 or 7.4 are viewed from a distance of 10 m (33 feet), a pixel subtends a minute of arc (¹⁄₆₀°). At this distance, owing to the limited acuity of human vision, both pairs of images are apparently identical. Imagine placing beside these images an emissive display having an infinitesimal spot, producing the same total flux for a perfectly white pixel.

At 10 m, the pixel structure of the emissive display would be somewhat visible. At a great viewing distance – say at a pixel or scan-line subtense of less than ¹⁄₁₈₀°, corresponding to SD viewed at three times normal distance, or about 20·PH – the limited acuity of the human visual system causes all three displays to appear identical. As the viewer moves closer, different effects become apparent, depending upon spot profile.
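The relationship between scan-line subtense and viewing distance in picture heights can be checked with a few lines of Python. This sketch assumes an SD picture of 480 lines and takes "normal distance" to be the distance at which one line subtends ¹⁄₆₀°; both figures are assumptions made for the illustration, and the function name is hypothetical.

```python
import math

def distance_in_picture_heights(picture_lines, subtense_deg):
    """Viewing distance, in picture heights, at which one line subtends the given angle."""
    # Line pitch is 1/picture_lines of the picture height; distance = pitch / tan(angle).
    return 1.0 / (picture_lines * math.tan(math.radians(subtense_deg)))

normal = distance_in_picture_heights(480, 1.0 / 60.0)    # roughly 7 picture heights
distant = distance_in_picture_heights(480, 1.0 / 180.0)  # roughly 21 picture heights
print(round(normal, 2), round(distant, 2))
```

Under these assumptions the second distance is three times the first and comes out slightly above 21 picture heights, consistent with the figure of about 20·PH.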

I’ll discuss two cases: box distribution and Gaussian distribution.

Box distribution

A typical digital projector – such as an LCD or a DLP – has a spot profile resembling a box distribution covering nearly the entire width and nearly the entire height corresponding to the pixel pitch. There is no significant gap between image rows or image columns. Each pixel has three colour components, but the optics of the projection device are arranged to cause the distribution of light from these components to be overlaid. From a great distance, pixel structure will not be visible.

However, as viewing distance decreases, aliasing (“the jaggies”) will intrude. Limited performance of projection lenses mitigates aliasing somewhat; however, aliasing can be quite noticeable, as in the examples of Figures 7.3 and 7.4 on page 77.

In a typical direct-view digital display, such as an LCD or a PDP, each pixel comprises three colour components that occupy distinct regions of the area corresponding to each pixel. Ordinarily, these components are side-by-side. There is no significant gap between image rows. However, if one component (say green) is turned on and the others are off, there is a gap between columns. These systems rely upon the limited acuity of the viewer to integrate the components into a single coloured area. At a close viewing distance, the gap can be visible, and this can induce aliasing.

The viewing distance of a display using a box distribution, such as a direct-view LCD or PDP, is limited by the intrusion of aliasing.

Gaussian distribution

As I have mentioned, a CRT display has a spot profile resembling a Gaussian. The CRT designer’s choice of spot size involves a compromise illustrated by Figure 7.8.

• For a Gaussian distribution with a very small spot, say a spot width less than ¹⁄₂ the scan-line pitch, line structure will become evident even at a fairly large viewing distance.

• For a Gaussian distribution with medium-sized spot, say a spot width approximately equal to the scan-line pitch, the onset of scan-line visibility will occur at a closer distance than with a small spot.

• As spot size is increased beyond about twice the scan-line pitch, eventually the spot becomes so large that no further improvement in line-structure visibility is achieved by making it larger. However, there is a serious disadvantage to making the spot larger than necessary: Sharpness is reduced.

Figure 7.8 Gaussian spot size. Solid lines graph Gaussian distributions of intensity across two adjacent image rows, for three values of spot size. The areas under each curve are identical. The shaded areas indicate their sums. In progressive scanning, adjacent image rows correspond to consecutive scan lines. In interlaced scanning, the situation is more complex.
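The three cases above can be compared by summing Gaussian spots across a flat white field and measuring the residual ripple between scan lines. This sketch assumes that "spot width" means the full width of the Gaussian at half its maximum, an interpretation the text does not confirm, and it considers progressive scanning only. The function name is hypothetical.

```python
import math

def flat_field_ripple(spot_width, span=12):
    """Ripple across scan lines of a flat field; spot_width in scan-line pitches."""
    # Assumption: spot width is the full width at half maximum of the Gaussian.
    sigma = spot_width / (2.0 * math.sqrt(2.0 * math.log(2.0)))

    def summed(y):
        # Total intensity at vertical position y from scan lines at integer positions.
        return sum(math.exp(-((y - k) ** 2) / (2.0 * sigma ** 2))
                   for k in range(-span, span + 1))

    peak = summed(0.0)    # on a scan line
    trough = summed(0.5)  # midway between two scan lines
    return (peak - trough) / (peak + trough)

for width in (0.5, 1.0, 2.0):
    print(width, flat_field_ripple(width))
```

Under this assumption, a spot half the pitch wide leaves a ripple of nearly 80%, a spot equal to the pitch leaves about 6%, and a spot twice the pitch leaves a ripple of only about one part in a million, so enlarging it further can only cost sharpness.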

You saw at the beginning of this chapter that in order to avoid visible pixel structure in image display some overlap is necessary in the distributions of light produced by neighboring display elements. Such overlap reduces sharpness, but by how much? How much overlap is necessary? I will discuss these issues in the Chapter Resolution, on page 97. First, though, I will introduce the fundamentals of raster scanning.
