Using a Debugger to Analyze Compiler Output

In order to write great code, you’ve got to recognize the difference between

6.5 Using a Debugger to Analyze Compiler Output

Another option you can use to analyze compiler output is a debugger program, which usually incorporates a disassembler that you can use to view machine instructions. Using a debugger to view the code your compiler emits can either be an arduous task or a breeze—depending on the debugger you use. Typically, if you use a stand-alone debugger, you’ll find that it takes considerably more effort to analyze your compiler’s output than if you use a debugger built into a compiler’s integrated development environ-ment. This section looks at both approaches.

6.5.1 Using an IDE’s Debugger

The Microsoft Visual C++ environment provides excellent tools for observing the code produced by a compilation (of course, the compiler also produces assembly output, but that can be ignored here). To view the output using the Visual Studio debugger, first compile your C/C++ program to an executable file and then select Build Start Debug Step Into from the Visual Studio Build menu. When the debug palette comes up, click the disassembly view button (see Figure 6-4).

Figure 6-4: Visual Studio disassembly view button

Pressing the disassembly view button brings up a window with text similar to the following:

19: {

00401010 push ebp 00401011 mov ebp,esp 00401013 sub esp,48h 00401016 push ebx 00401017 push esi 00401018 push edi

00401019 lea edi,[ebp-48h]

0040101C mov ecx,12h 00401021 mov eax,0CCCCCCCCh 00401026 rep stos dword ptr [edi]

20: int i,j;

21:

22: i = argc & j;

00401028 mov eax,dword ptr [ebp+8]

0040102B and eax,dword ptr [ebp-8]

0040102E mov dword ptr [ebp-4],eax 23: printf( "%d", i );

00401031 mov ecx,dword ptr [ebp-4]

00401034 push ecx

00401035 push offset string "%d" (0042001c) 0040103A call printf (00401070)

0040103F add esp,8 24: return 0;

00401042 xor eax,eax 25: }

00401044 pop edi 00401045 pop esi 00401046 pop ebx 00401047 add esp,48h 0040104A cmp ebp,esp

0040104C call __chkesp (004010f0) 00401051 mov esp,ebp

00401053 pop ebp 00401054 ret

Of course, because Microsoft’s Visual C++ package is already capable of outputting an assembly language file during compilation, using the Visual Studio integrated debugger in this manner isn’t necessary. However, some compilers do not provide assembly output, and debugger output may be the easiest way to view the machine code the compiler produces. For example,

This button selects disassembly output

Borland’s Delphi compiler does not provide an option to produce assembly language output. Given the massive amount of class library code that Delphi links into an application, attempting to view the code for a small section of your program by using a disassembler would be like trying to find a needle in a haystack. A better solution is to use the debugger built into the Delphi environment.

To view the machine code that Delphi (or Kylix under Linux) generates, set a breakpoint in your code where you wish to view the machine code, and then compile and run your application. When the breakpoint triggers, select View Debug Windows CPU. This brings up a CPU window like the one appearing in Figure 6-5. From this window you can examine the compiler’s output and adjust your HLL code to produce better output.

Figure 6-5: Delphi CPU window

6.5.2 Using a Stand-Alone Debugger

If your compiler doesn’t provide its own debugger as part of an integrated development system, another alternative is to use a separate debugger such as OllyDbg, DDD, or GDB to disassemble your compiler’s output. Simply load the executable file into the debugger for normal debugging operations.

Most debuggers that are not associated with a particular programming language are machine-level debuggers that disassemble the binary machine code into machine instructions for viewing during the debugging operation.

One problem with using machine-level debuggers is that locating a particular section of code to disassemble can be difficult. Remember, when you load the entire executable file into a debugger, you also load all the statically linked library routines and other runtime support code that don’t normally appear in the application’s source file. Searching through all this extraneous code to find out how the compiler translates a particular sequence of statements to

machine code can be time-consuming. Some serious code sleuthing may be necessary. Fortunately, most linkers collect all the library routines together and place them either at the beginning or end of the executable file. There-fore, the code associated with your application will generally appear near the beginning or end of the executable file. Nevertheless, if the application is large, finding a particular function or code sequence among all the code in that application can be difficult.

Debuggers generally come in one of three different flavors: pure machine-level debuggers, symbolic debuggers, and source-level debuggers.

Symbolic debuggers and source-level debuggers require executable files to contain special debugging information² and, therefore, the compiler must specifically include this extra information.

Pure machine-level debuggers have no access to the original source code or symbols in the application. A pure machine-level debugger simply disassembles the machine code found in the application and displays the listing using literal numeric constants and machine addresses. Reading through such code is difficult, but if you understand how compilers generate code for the HLL statements (as this book will teach you), then locating the machine code is easier. Nevertheless, without any symbolic information to provide a “root point” in the code, analysis can be difficult.

Symbolic debuggers use special symbol table information found in the executable file (or a separate debugging file, in some instances) to associate labels with functions and, possibly, variable names in your executable file.

This feature makes locating sections of code within the disassembly listing much easier. When symbolic labels identify calls to functions, it’s much easier to see the correspondence between the disassembled code and your original HLL source code. One thing to keep in mind, however, is that symbolic information is only available if the application was compiled with debugging mode enabled. Check your compiler’s documentation to deter-mine how to activate this feature for use with your debugger.

Source-level debuggers will actually display the original source code associated with the file the debugger is processing. In order to see the actual machine code the compiler produced, you often have to activate a special machine-level view of the program. As with symbolic debuggers, your compiler must produce special executable files (or auxiliary files) containing debug information that a source-level debugger can use. Clearly, source-level debug-gers are much easier to work with because they show the correspondence between the original HLL source code and the disassembled machine code.

Dans le document CODE WRITE GREAT (Page 171-174)