
Types of Computer Language Processors

From the document CODE WRITE GREAT (pages 85-88)


5.3 Types of Computer Language Processors

We can generally place computer language systems into one of four categories: pure interpreters, interpreters, incremental compilers, and compilers. These systems differ in how they process the source program and execute the result, which affects the efficiency of the execution process.

5.3.1 Pure Interpreters

Pure interpreters operate directly on a text source file and tend to be very inefficient. A pure interpreter continuously scans the source file (usually an ASCII text file), processing it as string data. Recognizing lexemes (language components such as reserved words, literal constants, and the like) consumes time. Indeed, many pure interpreters spend more time processing the lexemes (that is, performing lexical analysis) than they do actually executing the program. Pure interpreters tend to be the smallest of the computer language processing programs. This is because every language translator has to do lexical analysis, and the actual on-the-fly execution of the lexeme takes only a little additional effort. For this reason, pure interpreters are popular where a very compact language processor is desirable. Pure interpreters are also popular for scripting languages and very high-level languages that let you manipulate the language’s source code as string data during program execution.
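As a rough illustration of the rescanning cost described above, here is a minimal sketch of a pure interpreter in Python. The four-statement toy language (`set`, `add`, `print`, `jumplt`), the function name `run_pure`, and the sample program are all hypothetical and chosen only for this example. Note that every source line is split into lexemes again each time control reaches it.

```python
def run_pure(source: str) -> list[int]:
    """Execute a toy language directly from its source text.

    Hypothetical statements: 'set NAME INT', 'add NAME INT',
    'print NAME', and 'jumplt NAME INT LINE' (LINE is a 0-based line index).
    """
    lines = source.splitlines()
    variables: dict[str, int] = {}
    output: list[int] = []
    pc = 0  # index of the current source line
    while pc < len(lines):
        # Lexical analysis happens here, every time the line is reached,
        # even if the same line already ran in an earlier loop iteration.
        words = lines[pc].split()
        pc += 1
        if not words:
            continue  # skip blank lines
        op = words[0]
        if op == "set":
            variables[words[1]] = int(words[2])
        elif op == "add":
            variables[words[1]] += int(words[2])
        elif op == "print":
            output.append(variables[words[1]])
        elif op == "jumplt":
            # Jump back (or forward) while NAME is less than INT.
            if variables[words[1]] < int(words[2]):
                pc = int(words[3])
        else:
            raise ValueError(f"unknown statement: {op!r}")
    return output


PROGRAM = """set i 0
print i
add i 1
jumplt i 3 1"""

print(run_pure(PROGRAM))  # the loop body is re-split on every iteration
```

The interpreter is short because execution is bolted directly onto the scanning step, which matches the point that pure interpreters tend to be small but slow.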

5.3.2 Interpreters

An interpreter executes some representation of a program’s source file at runtime. This representation isn’t necessarily a text file in human-readable form. As noted in the previous section, many interpreters operate on tokenized source files in order to avoid lexical analysis during execution. Some interpreters read a text source file as input and translate the input file to a tokenized form prior to execution. This allows programmers to work with text files in their favorite editor while enjoying fast execution using a tokenized format. The only costs are an initial delay to tokenize the source file (which is unnoticeable on most modern machines) and the fact that it may not be possible to execute strings as program statements.
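The tokenize-then-execute arrangement can be sketched as follows. This is a minimal example under assumptions: the toy statements (`set`, `add`, `print`, `jumplt`), the `Op` enumeration, and the functions `tokenize` and `run_tokens` are hypothetical names invented for illustration. Lexical analysis runs once, up front; the execution loop then works only with already-decoded tuples.

```python
from enum import IntEnum


class Op(IntEnum):
    """Hypothetical token codes for a four-statement toy language."""
    SET = 0
    ADD = 1
    PRINT = 2
    JUMPLT = 3


KEYWORDS = {"set": Op.SET, "add": Op.ADD, "print": Op.PRINT, "jumplt": Op.JUMPLT}


def tokenize(source: str) -> list[tuple]:
    """Translate source text into (Op, name, numbers...) tuples, once."""
    tokens = []
    for line in source.splitlines():
        words = line.split()
        if not words:
            continue  # blank lines produce no token
        op = KEYWORDS[words[0]]               # keyword lookup happens only here
        numbers = [int(w) for w in words[2:]]  # numeric text is converted only here
        tokens.append((op, words[1], *numbers))
    return tokens


def run_tokens(tokens: list[tuple]) -> list[int]:
    """Execute the tokenized program; no string scanning in this loop."""
    variables: dict[str, int] = {}
    output: list[int] = []
    pc = 0  # index of the current token tuple
    while pc < len(tokens):
        op, name, *args = tokens[pc]
        pc += 1
        if op is Op.SET:
            variables[name] = args[0]
        elif op is Op.ADD:
            variables[name] += args[0]
        elif op is Op.PRINT:
            output.append(variables[name])
        elif op is Op.JUMPLT:
            # args are (limit, target statement index)
            if variables[name] < args[0]:
                pc = args[1]
    return output


PROGRAM = """set i 0
print i
add i 1
jumplt i 3 1"""

print(run_tokens(tokenize(PROGRAM)))
```

Because the token list keeps the statement structure and variable names, it would be straightforward to print a close approximation of the original source from it, which is the property the text attributes to tokenized files.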

5.3.3 Compilers

A compiler translates a source program in text form into executable machine code. This is a complex process, particularly in optimizing compilers. There are a couple of things to note about the code a compiler produces. First, a compiler produces machine instructions that the underlying CPU can directly execute. Therefore, the CPU doesn’t waste any cycles decoding the source file while executing the program; all of the CPU’s resources are dedicated to executing the machine code. As such, the resulting program generally runs many times faster than an interpreted version does. Of course, some compilers do a better job of translating HLL source code into machine code than other compilers, but even low-quality compilers do a better job than most interpreters.

A compiler’s translation from source code to machine code is a one-way function. It is very difficult, if not impossible, to reconstruct the original source file if you’re given only the machine code output from a program. (By contrast, interpreters either operate directly on source files or work with tokenized files from which it’s easy to reconstruct some semblance of the source file.)

5.3.4 Incremental Compilers

An incremental compiler is a cross between a compiler and an interpreter.[1] There is no single definition of an incremental compiler because there are many different types of incremental compilers. In general, though, like an interpreter, an incremental compiler does not compile the source file into machine code. Instead, it translates the source code into an intermediate form. Unlike interpreters, however, this intermediate form does not usually exhibit a strong relationship to the original source file. This intermediate form is generally the machine code for a virtual (hypothetical) machine language. That is, there is no real CPU that can execute this code. However, it is easy to write an interpreter for such a virtual machine, and that interpreter does the actual execution. Because interpreters for virtual machines are usually much more efficient than interpreters for tokenized code, the execution of this virtual machine code is usually much faster than the execution of a list of tokens in an interpreter. Languages like Java use this compilation technique, along with a Java byte-code engine (an interpreter program, see Figure 5-1) that interpretively executes the Java “machine code.” The big advantage to virtual machine execution is that the virtual machine code is portable; that is, programs running on the virtual machine can execute anywhere there is an interpreter available. True machine code, by contrast, only executes on the CPU (family) for which it was written. Generally, interpreted virtual machine code runs about two to ten times faster than interpreted code, and pure machine code generally runs anywhere from two to ten times faster than interpreted virtual machine code.
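To make the idea of an interpreter for a virtual machine concrete, here is a minimal sketch of a stack-based virtual machine in Python. The instruction set (`PUSH`, `ADD`, `MUL`, `EMIT`, `HALT`), its numeric encoding, and the function `run_vm` are hypothetical and are not Java byte code; they only illustrate the fetch, decode, and execute loop over codes stored in sequential locations.

```python
# Hypothetical opcodes for a tiny stack machine (not real Java byte codes).
PUSH, ADD, MUL, EMIT, HALT = range(5)


def run_vm(code: list[int]) -> list[int]:
    """Interpret a flat list of opcodes and inline operands."""
    stack: list[int] = []
    output: list[int] = []
    pc = 0  # index of the next code word to fetch
    while True:
        opcode = code[pc]  # fetch
        pc += 1
        if opcode == PUSH:  # decode and execute
            stack.append(code[pc])  # the operand follows the opcode
            pc += 1
        elif opcode == ADD:
            right = stack.pop()
            stack.append(stack.pop() + right)
        elif opcode == MUL:
            right = stack.pop()
            stack.append(stack.pop() * right)
        elif opcode == EMIT:
            output.append(stack.pop())
        elif opcode == HALT:
            return output
        else:
            raise ValueError(f"unknown opcode {opcode} at {pc - 1}")


# Virtual machine code for (2 + 3) * 4, as a compiler front end might emit it.
CODE = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, EMIT, HALT]

print(run_vm(CODE))
```

The list `CODE` carries no variable names or statement structure from any source program, which reflects the weak relationship between the intermediate form and the original source. The same list would run unchanged wherever `run_vm` is available, which is the portability argument in miniature.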

Figure 5-1: The Java byte-code interpreter

[Figure labels: computer system memory; Java byte codes in sequential memory locations; Java byte-code interpreter (typically written in C); actions specified by the execution of the Java byte-code instructions]

[1] Actually, in recent years the term incremental compiler has taken on another meaning as well: the ability to compile pieces of the program and recompile them as necessary (given changes in the source file). We will not consider such systems here.

In an attempt to improve the performance of programs compiled via an incremental compiler, many vendors (particularly Java systems vendors) have resorted to a technique known as just-in-time compilation. The concept is based on the fact that the time spent in interpretation is largely consumed by fetching and deciphering the virtual machine code at runtime. This interpretation occurs repeatedly as the program executes. Just-in-time compilation translates the virtual machine code to actual machine code whenever it encounters a virtual machine instruction for the first time. By doing so, the interpreter is spared the interpretation process the next time it encounters the same statement in the program (e.g., in a loop). Although just-in-time compilation is nowhere near as good as a true compiler, it can typically improve the performance of a program by a factor of two to five.
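The bookkeeping behind "translate on first encounter, reuse afterward" can be sketched in Python. This is only an analogy under stated assumptions: a real just-in-time compiler emits native machine code, whereas this sketch turns each instruction into a Python closure, so it makes no claim about actual speedups. The instruction names and the functions `translate` and `run_jit` are hypothetical.

```python
def translate(instr: tuple):
    """Decode one instruction once; return a closure that performs it.

    Each closure takes (stack, variables, pc) and returns the next pc.
    """
    op, *args = instr
    if op == "push":
        (value,) = args
        def run(stack, variables, pc):
            stack.append(value)
            return pc + 1
    elif op == "load":
        (name,) = args
        def run(stack, variables, pc):
            stack.append(variables[name])
            return pc + 1
    elif op == "store":
        (name,) = args
        def run(stack, variables, pc):
            variables[name] = stack.pop()
            return pc + 1
    elif op == "add":
        def run(stack, variables, pc):
            right = stack.pop()
            stack.append(stack.pop() + right)
            return pc + 1
    elif op == "jump_if_less":
        name, limit, target = args
        def run(stack, variables, pc):
            return target if variables[name] < limit else pc + 1
    else:
        raise ValueError(f"unknown instruction: {op!r}")
    return run


def run_jit(code: list[tuple]):
    """Run the program, translating each instruction only the first time."""
    cache = {}        # pc -> already-translated closure
    translations = 0  # how many times an instruction had to be decoded
    executed = 0      # how many instructions actually ran
    stack: list[int] = []
    variables: dict[str, int] = {}
    pc = 0
    while pc < len(code):
        step = cache.get(pc)
        if step is None:  # first encounter: decode and remember
            step = cache[pc] = translate(code[pc])
            translations += 1
        pc = step(stack, variables, pc)
        executed += 1
    return variables, translations, executed


# i = 0; repeat i = i + 1 while i < 5
LOOP = [
    ("push", 0), ("store", "i"),
    ("load", "i"), ("push", 1), ("add",), ("store", "i"),
    ("jump_if_less", "i", 5, 2),
]

variables, translations, executed = run_jit(LOOP)
print(variables, translations, executed)
```

For this program the seven instructions are each decoded once, while the loop body executes repeatedly from the cache. The gap between the two counters is the repeated interpretation work that just-in-time compilation aims to remove.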

An interesting note about older compilers and some freely available compilers is that they compile the source code to assembly language, and a separate program, known as an assembler, then assembles this output into the desired machine code. Most modern, highly efficient compilers skip this step altogether. See Section 5.5, “Compiler Output,” for more on this subject.

This chapter describes how compilers generate machine code. By understanding how a compiler generates machine code, you can choose appropriate HLL statements to generate better, more efficient machine code. If you want to improve the performance of programs written with an interpreter or incremental compiler, the best advice you can follow is to use an optimizing compiler to process your application. For example, GNU provides a compiler for Java that produces optimized machine code rather than interpreted Java byte code; the resulting executable files run much faster than interpreted Java byte code.
