2.4 Frame-Buffer Operations

The fragments produced by rasterization are written into the frame buffer, a two-dimensional array of pixel attributes (color, alpha, depth) that corresponds to the final image. The color portion of the frame buffer is finally displayed on the video screen. When an incoming fragment is written, it modifies the values already contained in the frame buffer according to a number of parameters and conditions. The sequence of available tests and modifications is termed frame-buffer operations or fragment operations and comprises the following.

Alpha Test. The alpha test allows a fragment to be discarded conditional on the outcome of a comparison between the fragment's opacity α and a specified reference value. The alpha test can be useful in many ways, but the original idea was to discard fragments that are completely transparent. If the alpha test fails, the read and write operations from/to the frame buffer can be skipped.
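
As an illustration, the classic configuration that discards fully transparent fragments can be set up with the fixed-function OpenGL state as follows (a minimal sketch assuming an active OpenGL context; the reference value of 0.0 is an arbitrary choice for this example):

glEnable(GL_ALPHA_TEST);
// Pass only fragments whose alpha is strictly greater than 0.0,
// i.e. discard completely transparent fragments.
glAlphaFunc(GL_GREATER, 0.0f);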

Stencil Test. The stencil test allows a per-pixel mask to be applied to the visible frame buffer. The mask is contained in a separate portion of the frame buffer, called the stencil buffer, and is usually rendered in a pre-processing step. The stencil test conditionally drops a fragment if the stencil buffer is set for the corresponding pixel.
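
A typical setup, sketched below, passes fragments only where a previous pass has written the value 1 into the stencil buffer; the mask value and the choice to leave the stencil contents untouched are assumptions of this example:

glEnable(GL_STENCIL_TEST);
// Pass only where the stencil buffer contains the value 1.
glStencilFunc(GL_EQUAL, 1, 0xFF);
// Keep the stencil contents unchanged in all cases.
glStencilOp(GL_KEEP, GL_KEEP, GL_KEEP);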

Depth Test. Because primitives are generated in arbitrary sequence, the depth test is needed to provide an effective mechanism for correct depth ordering of partially occluded objects. The depth value of a fragment is therefore stored in a so-called depth buffer. The depth test checks if an incoming fragment is occluded by a fragment that has been previously written. The occlusion test compares the incoming depth value to the value already stored in the depth buffer. This test allows occluded fragments to be discarded immediately. Because this decision is made according to the z-value of a fragment in screen space, the depth test is often referred to as z-test or z-culling.
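
In OpenGL, the standard configuration keeps a fragment only if it is at least as close to the viewer as the value already stored in the depth buffer (a minimal sketch; choosing GL_LEQUAL rather than GL_LESS is merely a common convention):

glEnable(GL_DEPTH_TEST);
// Keep a fragment if its depth is less than or equal to the stored depth.
glDepthFunc(GL_LEQUAL);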

Alpha Blending. To allow for semi-transparent objects, alpha blending combines the color of the incoming fragment with the color of the corresponding pixel currently stored in the frame buffer. We will see different blending set-ups in Chapter 3.
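
As a preview of those set-ups, the familiar over operator used for back-to-front compositing can be configured like this (a minimal sketch):

glEnable(GL_BLEND);
// Weight the incoming color by its alpha and the frame-buffer color
// by (1 - alpha): standard back-to-front compositing.
glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA);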

After the scene description has completely passed through the graphics pipeline, the resulting raster image contained in the frame buffer can be displayed on the screen or read back into main memory and saved to disk.

2.4.1 Early Z-Test

As mentioned above, the depth test discards all fragments that are occluded by previously drawn fragments according to a comparison of their z-values. The depth test is part of the frame-buffer operations, which are performed after fragment processing. If the computation done in the fragment program, however, is rather expensive, it might be inefficient to perform fragment processing at all if we know in advance that the resulting fragment will be discarded afterwards.

In consequence, many modern GPUs allow the depth test to be performed before the fragment program execution. This concept is known as early z-test. The programmer, however, does not have explicit control over this feature. Instead, the graphics driver automatically decides whether an early z-test is feasible or not. The decision is made internally based on hardware-specific criteria. One basic condition for activating the early z-test is that the fragment program does not modify the z-value of the fragment. Some hardware architectures also decide to activate the early z-test only if some or all other fragment tests are disabled. For rendering scenes with a large overdraw due to a high depth complexity, the early z-test is an efficient means of increasing the rendering speed. For the early z-test to work most efficiently, however, it is important to draw the objects in front-to-back order as far as possible.
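
A typical application-side pattern, sketched below, sorts opaque objects roughly by their distance to the camera and draws them front to back with the depth test enabled, so that fragments of objects drawn later can be rejected before the expensive fragment program runs. The helpers sortFrontToBack and drawObject are hypothetical placeholders for application code:

glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_LESS);
// Hypothetical: sort opaque objects by increasing distance to the camera.
sortFrontToBack(objects, numObjects, cameraPosition);
for (int i = 0; i < numObjects; ++i)
    drawObject(&objects[i]);   // hypothetical per-object draw call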

2.4.2 Offscreen Buffers and Multiple Render Targets

For many advanced rendering algorithms, it is necessary to generate textures or intermediate images on-the-fly. These intermediate images are not directly displayed onscreen. Instead, they are used as texture images in successive rendering passes. Rendering intermediate results into a texture in OpenGL traditionally required copying the frame-buffer content to the texture using calls to glCopyTexImage2D. To circumvent resolution problems and performance penalties that arise from the copy operation, additional offscreen buffers in local video memory have been introduced. Such offscreen buffers can be used as alternative render targets to the visible frame buffer. Up until recently, the standard method for offscreen rendering was the pixel buffer, or pbuffer. In combination with the OpenGL extension WGL_ARB_render_texture (or similar extensions for Unix-style systems), which allows pbuffers to be bound directly as textures, this was an effective, yet heavyweight solution to generate texture images on-the-fly.

The main drawbacks of pbuffers are the inconvenient requirement of unique OpenGL contexts, expensive context switching, platform dependence, and limited flexibility. In response to these drawbacks, frame-buffer objects (FBOs) have been introduced with the OpenGL extension GL_EXT_framebuffer_object. FBOs are a more flexible and lightweight solution to platform-independent, offscreen render targets, and they do not require separate OpenGL contexts. For volume graphics, FBOs are of great interest, because they allow us to directly render into z-slices of 3D textures. We will utilize this feature for creating 3D textures on-the-fly in Chapter 12. FBOs also provide an interface to floating-point render targets, which do not clamp pixel colors to unit range. Although floating-point rendering buffers cannot directly be displayed on the screen, they are important for implementing tone-mapping techniques for high dynamic range rendering as we will see in Chapter 5.
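
A minimal sketch of offscreen rendering with GL_EXT_framebuffer_object, assuming colorTexture is an existing 2D texture of suitable size and format, might look like this:

GLuint fbo;
glGenFramebuffersEXT(1, &fbo);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);
// Attach the texture as color render target (glFramebufferTexture3DEXT
// would analogously attach a single z-slice of a 3D texture).
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                          GL_TEXTURE_2D, colorTexture, 0);
if (glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT) ==
    GL_FRAMEBUFFER_COMPLETE_EXT) {
    // ... render the intermediate image into the texture here ...
}
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, 0);  // back to the visible frame buffer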

Another important feature of modern GPUs is the support for multiple render targets (MRTs). They allow fragment shaders to output multiple color values at one time and write them into separate offscreen render targets of the same resolution. MRTs are implemented as a separate OpenGL extension GL_ARB_draw_buffers, and FBOs provide a flexible interface to them. They can be used to efficiently generate multiple renditions in a single rendering pass.
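
Assuming two textures have already been attached to the color attachment points of the currently bound FBO, a sketch of routing the fragment program's first and second color outputs into them could read:

// Route fragment outputs COLOR0/COLOR1 into the two color attachments.
GLenum buffers[2] = { GL_COLOR_ATTACHMENT0_EXT, GL_COLOR_ATTACHMENT1_EXT };
glDrawBuffersARB(2, buffers);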

2.4.3 Occlusion Queries

Another very useful and important feature of modern graphics hardware is the possibility to perform so-called occlusion queries. As we have seen in Section 2.4, not all of the fragments created during rasterization finally end up as pixels in the frame buffer. Depending on the configuration of the individual per-fragment tests, a significant number of fragments may be discarded. Occlusion queries allow an application to count the number of fragments that are actually passing all the tests.

The main purpose of this mechanism is to determine the visibility of a group of primitives. For example, an application might utilize an occlusion query to check whether or not the bounding box of a complex geometry is visible. If the rasterization of the bounding box returns an insignificant number of fragments, the application might decide to completely skip the rendering of the complex geometry.

Occlusion queries are implemented by the OpenGL extension GL_ARB_occlusion_query. A code example is given in Section 8.5 in the context of occlusion culling.
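
Ahead of that full example, a minimal sketch of the query mechanism is given below; drawBoundingBox and drawComplexGeometry are hypothetical application helpers, VISIBILITY_THRESHOLD is an application-defined constant, and reading the result back immediately blocks until the query has finished, which keeps the sketch simple but is not optimal:

GLuint query, sampleCount;
glGenQueriesARB(1, &query);
glBeginQueryARB(GL_SAMPLES_PASSED_ARB, query);
// Typically, color and depth writes would be disabled for this pass.
drawBoundingBox();                        // hypothetical: rasterize the bounding box
glEndQueryARB(GL_SAMPLES_PASSED_ARB);
glGetQueryObjectuivARB(query, GL_QUERY_RESULT_ARB, &sampleCount);
if (sampleCount > VISIBILITY_THRESHOLD)   // hypothetical threshold
    drawComplexGeometry();                // hypothetical: render the real geometry
glDeleteQueriesARB(1, &query);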

2.5 Further Reading

There are many excellent introductory texts on graphics and shader programming. If you are looking for a general source of information on real-time graphics, we recommend the book Real-Time Rendering by Akenine-Möller and Haines [2], which provides a practical overview on the current state of the art. For readers focusing more on game development, the first volume of 3D Games by Watt and Policarpo [283] might also be an alternative.

The OpenGL Programming Guide [240], commonly known as the Red Book, is a must-have for everybody concerned with graphics programming in OpenGL. Make sure you have an up-to-date edition on your shelf for reference. Another very recommendable book is Advanced Graphics Programming in OpenGL by McReynolds and Blythe [184]. They provide deep insights into OpenGL that go far beyond the programming manual.

The developer's toolkit for the high-level shading language Cg is freely available for Windows and Linux at NVIDIA's developer website [33]. As a developer's guide to Cg, we recommend The Cg Tutorial book by Fernando and Kilgard [71]. This is an excellent book for learning Cg in addition to the Cg User Manual included in the Cg Toolkit.

The Internet is a huge source of information on graphics and shader development in general. The official OpenGL website, http://www.opengl.org, is always a good starting point. Additionally, all major manufacturers of graphics boards maintain a developer website with software development kits, white papers, code samples, and demos. Everybody involved in GPU programming is well advised to regularly visit the developer sites at http://www.ati.com and http://www.nvidia.com to look for new hardware features and other improvements.
