5.4.4.3 Basic Blocks, Reducible Code, and Optimization

Writing great code that works synergistically with your compiler’s optimizer requires a basic understanding of the optimization process. In this section I will discuss how a compiler organizes the intermediate code it produces in order to do a good job during the later optimization phases. The way you write your HLL source code has a profound effect on the compiler’s ability to organize the intermediate code (to produce better machine code), so understanding how the compiler does this is very important if you want to be able to help control the operation of the compiler’s optimizer.

When it analyzes code, a compiler’s optimizer will keep track of variable values as control flows through the program. The process of tracking this information is known as data flow analysis. After careful data flow analysis, a compiler can determine where a variable is uninitialized, when the variable contains certain values, when the program no longer uses the variable, and (just as importantly) when the compiler simply doesn’t know anything about the variable’s value. For example, consider the following Pascal code:

path := 5;
if( i = 2 ) then begin
    writeln( 'Path = ', path );
end;
i := path + 1;
if( i < 20 ) then begin
    path := path + 1;
    i := 0;
end;

A good optimizer will replace this code with something like the following:

if( i = 2 ) then begin
    (* Because the compiler knows that path = 5 *)
    writeln( 'path = ', 5 );
end;
i := 0;     (* Because the compiler knows that path < 20 *)
path := 6;  (* Because the compiler knows that path < 20 *)

In fact, the compiler probably would not generate code for the last two statements; instead, it would substitute the value 0 for i and 6 for path in later references. If this seems impressive to you, just note that some compilers can track constant assignments and expressions through nested function calls and complex expressions.
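For illustration, consider the following small C program. With inlining and constant propagation enabled, an aggressive optimizer can reduce main to code that simply returns 42, tracking the constant 7 through two levels of function calls; whether a particular compiler actually manages this depends on its optimization settings, and the helper functions here are purely illustrative.

static int add( int a, int b )
{
    return a + b;
}

static int scale( int n )
{
    return add( n, n ) * 3;     // foldable once n is known to be a constant
}

int main( void )
{
    int i = 7;                  // a constant the optimizer can track...
    return scale( i );          // ...through the calls: (7 + 7) * 3 = 42
}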

Although a complete description of how a compiler achieves this is beyond the scope of this book, you should have a basic understanding of how compilers keep track of variables during the optimization phase because a sloppily written program can thwart the compiler’s optimization abilities.

Great code works synergistically with the compiler, not against it.

Some compilers can do some truly amazing things when it comes to optimizing high-level code. However, you should note one thing: optimization is an inherently slow process. As noted earlier, optimization is provably an intractable problem. Fortunately, most programs don’t require full optimization. A good approximation of the optimal program, even if it runs a little slower than the optimal program, is an acceptable compromise when compared to intractable compilation times.

The major concession to compilation time that compilers make during optimization is that they search for only so many possible improvements to a section of code before they move on. Therefore, if your programming style tends to confuse the compiler, it may not be able to generate an optimal (or even close to optimal) executable because the compiler has too many possibilities to consider. The trick is to learn how compilers optimize the source file so you can accommodate the compiler.

To analyze data flow, compilers divide the source code into sequences known as basic blocks. A basic block is a sequence of sequential machine instructions into and out of which there are no branches except at the beginning and end of the basic block. For example, consider the following C code:

x = 2;                  // Basic Block 1
j = 5;
i = f( &x, j );
j = i * 2 + j;
if( j < 10 )            // End of Basic Block 1
{
    j = 0;              // Basic Block 2
    i = i + 10;
    x = x + i;          // End of Basic Block 2
}
else
{
    temp = i;           // Basic Block 3
    i = j;
    j = j + x;
    x = temp;           // End of Basic Block 3
}
x = x * 2;              // Basic Block 4
++i;
--j;
printf( "i=%d, j=%d, x=%d\n", i, j, x );    // End of Basic Block 4

// Basic Block 5 begins here

This code snippet contains four basic blocks. Basic block 1 starts with the beginning of the source code. A basic block ends at the point where there is a jump into or out of the sequence of instructions. Basic block 1 ends at the beginning of the if statement because the if can transfer control to either of two locations. The else clause terminates basic block 2. It also marks the beginning of basic block 3 because there is a jump (from the if’s then clause) to the first statement following the else clause. Basic block 3 ends, not because the code transfers control somewhere else, but because there is a jump from basic block 2 to the first statement that begins basic block 4 (from the if’s then section). Basic block 4 ends with a call to the C printf function.

The easiest way to determine where the basic blocks begin and end is to consider the assembly code that the compiler will generate for that code.

Wherever there is a conditional branch/jump, unconditional jump, or call instruction, a basic block will end. Note, however, that the basic block includes the instruction that transfers control to a new location. A new basic block begins immediately after the instruction that transfers control to a new location. Also, note that the target label of any conditional branch, unconditional jump, or call instruction begins a basic block.
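The following sketch shows one way this boundary-finding rule can be expressed in code. The instruction encoding is an invented toy representation, not any real compiler's intermediate form; the program simply marks as a block leader the first instruction, the target of every transfer, and the instruction immediately following every transfer.

#include <stdio.h>

// A toy "instruction" invented for this sketch.
typedef struct
{
    const char *text;   // display text only
    int transfers;      // 1 if this is a conditional branch, jump, or call
    int target;         // index of the transfer target, or -1 if none
} Insn;

int main( void )
{
    Insn code[] =
    {
        { "t = j < 10",        0, -1 },
        { "branch if !t to 5", 1,  5 },   // ends a block; its target starts one
        { "j = 0",             0, -1 },
        { "i = i + 10",        0, -1 },
        { "jump to 7",         1,  7 },   // ends a block
        { "temp = i",          0, -1 },
        { "i = j",             0, -1 },
        { "x = x * 2",         0, -1 },   // a transfer target, so a leader
        { "call printf",       1, -1 },   // calls end basic blocks too
    };
    int n = (int)( sizeof code / sizeof code[0] );
    int leader[16] = { 0 };

    leader[0] = 1;                              // the first instruction is a leader
    for( int k = 0; k < n; k++ )
    {
        if( code[k].transfers )
        {
            if( code[k].target >= 0 )
                leader[ code[k].target ] = 1;   // the target begins a basic block
            if( k + 1 < n )
                leader[ k + 1 ] = 1;            // so does the next instruction
        }
    }

    for( int k = 0; k < n; k++ )
        printf( "%s %s\n", leader[k] ? "BLOCK->" : "       ", code[k].text );
    return 0;
}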

The nice thing about basic blocks is that it is easy for the compiler to track what is happening to variables and other program objects in a basic block. As the compiler processes each statement, it can (symbolically) track the values that variables will hold, based upon their initial values and the computations on them within the basic block.
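A minimal sketch of that bookkeeping appears below, using the earlier Pascal example as its data. The structure and names are invented for the illustration; a real optimizer keeps far richer information than a single known/unknown constant per variable.

#include <stdio.h>

// What the optimizer "knows" about one variable at one point in a block.
typedef struct
{
    int known;   // 1 if the value is a known constant here
    int value;   // the constant, meaningful only when known == 1
} Tracked;

int main( void )
{
    Tracked path = { 0, 0 }, i = { 0, 0 };

    // path := 5;  (after this statement, path is a known constant)
    path.known = 1;  path.value = 5;

    // i := path + 1;  (computable because path is known)
    if( path.known )
    {
        i.known = 1;  i.value = path.value + 1;        // i = 6
    }

    // if( i < 20 ) ...  (the test can be decided during optimization)
    if( i.known && i.value < 20 )
        printf( "branch always taken: i = %d, path = %d\n", i.value, path.value );

    return 0;
}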

A problem occurs when the paths from two basic blocks join into a single code stream. For example, at the end of basic block 2 in the current example, the compiler could easily determine that the variable j contains zero, because code in the basic block assigns the value zero to j and then makes no other assignments to j. Similarly, at the end of basic block 3, the compiler knows that j contains the value j0+x (assuming j0 represents the initial value of j upon entry into the basic block). But when the paths merge at the beginning of basic block 4, the compiler probably can’t determine whether j will contain zero or the value j0+x. So, the compiler has to note that j’s value could be either of two different values at this point. While keeping track of two possible values that a variable might contain at a given point is easy for a decent optimizer, it’s not hard to imagine a situation where the compiler would have to keep track of many different possible values. In fact, if you have several if statements that the code executes in a sequential fashion, and each of the paths through these if statements modifies a given variable, then the number of possible values for each variable doubles with each if statement. In other words, the number of possibilities increases exponentially with the number of if statements in a code sequence. At some point, the compiler cannot keep track of all the possible values a variable might contain, so it has to stop keeping track of that information for the given variable. When this happens, there are fewer optimization possibilities that the compiler can consider.
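The following fragment (with placeholder variables a, b, and c) shows the doubling concretely; after three independent if statements the optimizer must already consider eight possible values for k:

int possibilities( int a, int b, int c )
{
    int k = 0;
    if( a > 0 ) k += 1;   // k is now 0 or 1:        2 possibilities
    if( b > 0 ) k += 2;   // k is now 0, 1, 2, or 3: 4 possibilities
    if( c > 0 ) k += 4;   // k is now one of 0..7:   8 possibilities
    return k;             // three ifs, 2^3 values for the optimizer to track
}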

Fortunately, although loops, conditional statements, switch/case statements, and procedure/function calls can increase the number of possible paths through the code exponentially, in practice compilers have few problems with typical well-written programs. This is because as paths from basic blocks converge, programs often make new assignments to their variables (thereby eliminating the old information the compiler was tracking). Compilers generally assume that programs rarely assign a different value to a variable along every distinct path in the program, and their internal data structures reflect this. So keep in mind that if you violate this assumption, the compiler may lose track of variable values and generate inferior code as a result.
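For example, adding a single assignment after the paths in the previous fragment converge discards the accumulated possibilities, and the optimizer once again knows the variable's exact value:

int converges( int a, int b, int c )
{
    int k = 0;
    if( a > 0 ) k += 1;
    if( b > 0 ) k += 2;
    if( c > 0 ) k += 4;   // up to eight possible values of k here
    k = 1;                // new assignment: k is known to be exactly 1 again
    return k + c;
}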

Compiler optimizers are generally written to handle well-written programs in the source language. Poorly structured programs, however, can create control flow paths that confuse the compiler, reducing the opportunities for optimization.

Good programs produce reducible flow graphs. A flow graph is a pictorial depiction of the control flow through the program. Figure 5-5 is a flow graph for the previous code fragment.

As you can see, arrows connect the end of each basic block with the beginning of the basic block into which they transfer control. In this particular example, all of the arrows flow downward, but this isn’t always the case.

Loops, for example, transfer control backward in the flow graph. As another example, consider the following Pascal code:

write( 'Input a value for i:' );
readln( i );
j := 0;
while( (j < i) and (i > 0) ) do begin
    a[j] := i;
    b[i] := 0;
    j := j + 1;
    i := i - 1;
end; (* while *)
k := i + j;
writeln( 'i = ', i, 'j = ', j, 'k = ', k );

Figure 5-5: An example flow graph

Figure 5-6 shows the flow graph for this simple code fragment.[4]
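Because the figure cannot be reproduced here, the following is a rough textual sketch of the control flow, reconstructed from the code above (ignoring, for simplicity, that the write and readln calls would themselves end basic blocks); Figure 5-6 may group the statements slightly differently:

Block 1: write( ... ); readln( i ); j := 0;
Block 2: the while test, (j < i) and (i > 0)              (entered from Block 1 and from Block 3)
Block 3: a[j] := i; b[i] := 0; j := j + 1; i := i - 1;    (jumps back to Block 2)
Block 4: k := i + j; writeln( ... );                      (entered from Block 2 when the test fails)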

Well-structured programs have flow graphs that are reducible. Although a complete description of what a reducible flow graph consists of is beyond the scope of this book, any program that consists only of structured control statements (if, while, repeat..until, etc.) and avoids gotos will be reducible (actually, the presence of a goto statement won’t necessarily produce a program that is not reducible, but programs that are not reducible generally have goto statements in them). This is an important issue because compiler optimizers can generally do a much better job when working on reducible programs. In contrast, programs that are not reducible tend to confuse optimizers.
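One classic way to lose reducibility is a goto that jumps into the middle of a loop; the loop then has two entry points, so the optimizer cannot collapse it in the usual outline fashion. The function below is a contrived sketch whose only purpose is to show that control flow shape:

#include <stdio.h>

int confusing_sum( int enter_mid, int n )
{
    int i = 0, total = 0;

    if( enter_mid )
        goto middle;            // a second entry point into the loop body

    while( i < n )
    {
        total += i;
middle:
        i++;
    }
    return total;
}

int main( void )
{
    printf( "%d\n", confusing_sum( 0, 5 ) );    // prints 10 (0+1+2+3+4)
    return 0;
}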

What makes reducible programs easier for optimizers to deal with is that the basic blocks in such a program can be collapsed in an outline fashion, with enclosing blocks inheriting properties (for example, which variables the block modifies) from the enclosed blocks. By processing the source file in an outline fashion, the optimizer can deal with a small number of basic blocks, rather than a large number of statements. This hierarchical approach to optimization is more efficient and allows the optimizer to maintain more information about the state of a program.

[4] This flow graph has been somewhat simplified for purposes of clarity. This simplification does not affect the discussion of basic blocks.


Figure 5-6: Flow graph for a while loop

Furthermore, the exponential time complexity of the optimization problem works for us in this case. By reducing the number of blocks the optimizer has to deal with (using reduction), you dramatically reduce the amount of work the optimizer must do. Again, the exact details of how the compiler achieves this are not important here.

The important thing to note is that if you avoid goto statements and other bizarre control transfer algorithms in your programs, your programs will usually be reducible, and the optimizer will be able to do a better job of optimizing your code.

Attempts to “optimize” your code by sticking in lots of goto statements to avoid code duplication and to avoid the execution of unnecessary tests may actually work against you. While you may save a few bytes or a few cycles in the immediate area you’re working on, the end result might also sufficiently confuse the compiler so that it cannot do a good job of global optimization, producing an overall loss of efficiency.
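As a small illustration, the first version below uses a goto to share one assignment between two paths and to skip a test; the second duplicates that single line but leaves a simple, structured flow graph. Both functions are hypothetical and behave identically for a nonnegative limit.

// The "clever" version: the goto shares one statement and skips one test.
int clamp_goto( int v, int limit )
{
    if( v > limit )
        goto use_limit;
    if( v < -limit )
    {
        limit = -limit;
use_limit:
        v = limit;
    }
    return v;
}

// The structured version: one duplicated assignment, straightforward flow.
int clamp( int v, int limit )
{
    if( v > limit )
        v = limit;
    else if( v < -limit )
        v = -limit;
    return v;
}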
