
9.2 The Importance of Hidden Constants

We have already argued that for software, bit complexity is rarely a valid concept; virtually everything of practical importance is based on word complexity. This holds for both time and space complexity.

Assume now that we have obtained f(n) as the complexity of an algorithm, where n is some measure of the input. If this is the space complexity of the algorithm, then the memory requirements of a program implementing that algorithm are essentially f(n) + Csp, where the constant Csp accounts for the space required for the program, for the symbol tables, and for other information associated with the program. This constant Csp is independent of the measure n of the input to the program. Thus, the space complexity of the program is closely related to that of the underlying algorithm, provided space is measured in words.2 We reiterate that space requirements should always be based on worst-case analyses (see Section 6.2); average space complexity has a limited usefulness for software.
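To make this concrete, here is a minimal sketch in C; the complexity function f(n) = 2n and the overhead value C_SP are assumptions chosen purely for illustration, not values taken from any real program:

#include <stdio.h>

/* Hypothetical space complexity of the algorithm, in words:
   f(n) = 2n, e.g., the input array plus one auxiliary array. */
static long f(long n) { return 2 * n; }

/* C_sp: assumed fixed overhead in words for the program text,
   symbol tables, and other per-program information. */
static const long C_SP = 50000L;

int main(void) {
    long n = 1000000L;            /* measure of the input */
    long words = f(n) + C_SP;     /* space requirement of the program */
    printf("estimated space: %ld words\n", words);
    return 0;
}

Note that C_SP shifts the total by a fixed amount regardless of n, which is why the program's space complexity remains in the same class as the algorithm's.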

1 As pointed out, there are infinitely many complexity classes between (and beyond) these eight.

For example, n²√n is strictly between ϕ5(n) and ϕ6(n); that is, n²√n = O(ϕ6(n)), ϕ6(n) ≠ O(n²√n), ϕ5(n) = O(n²√n), and n²√n ≠ O(ϕ5(n)). However, for most practical purposes, these eight are generally considered sufficient to categorize complexity functions.
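Assuming that ϕ5 and ϕ6 denote the quadratic and cubic classes, ϕ5(n) = n² and ϕ6(n) = n³, the strict separation can be checked with two limits:

\[
\lim_{n\to\infty} \frac{n^2\sqrt{n}}{n^3} = \lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0
\qquad\text{and}\qquad
\lim_{n\to\infty} \frac{n^2}{n^2\sqrt{n}} = \lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0,
\]

so n²√n grows strictly faster than ϕ5(n) and strictly slower than ϕ6(n).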

2 If bit complexity is used for the algorithm, the actual space requirement of the program depends on the way these bits are represented. There is a good deal of variability, from using an entire word for each bit to using packed structures.



The relationship between the complexity of the algorithm and that of the corresponding program is not quite as clean when it comes to time. Recall that the time complexity of an algorithm is the statement count for the algorithm; in essence, each statement accounts for one unit of time. A program's time requirements are not quite that easily captured. By and large, we end up with c1·f(n) + c2, where the constant c1 measures the duration3 of an average statement and the constant c2 reflects the amount of time required to load the program and initialize the processes associated with it. Each of the two constants hides a good deal of work.
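As a numerical illustration (a sketch only; the complexity f(n) = n·log₂n and the values of c1 and c2 are assumed here, not measured):

#include <stdio.h>
#include <math.h>

/* Hypothetical time complexity of the algorithm, in statements:
   f(n) = n log2 n. */
static double f(double n) { return n * log2(n); }

int main(void) {
    double c1 = 5e-9;  /* assumed average statement duration, in seconds */
    double c2 = 0.08;  /* assumed fixed load/initialization cost, in seconds */
    double n  = 1e7;   /* measure of the input */
    printf("predicted running time: %.3f s\n", c1 * f(n) + c2);
    return 0;
}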

The difficulty with the constant c1 is the assumption that we know what an average statement is. We can make some educated guesses, or we can determine a range for this constant. The most systematic approach is to base the value of c1 on some limited test runs of the program at hand. In practice, c1 will also depend on the target platform (thus, it is related not just to the architecture, but also to the instruction set and the ability of the compiler to exploit that instruction set efficiently). Generally, a reasonably acceptable value for c1 is acquired experimentally. Nevertheless, the precise value of this constant depends on the program to be executed. Realistically, we can only hope for a reasonably small range.4
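A minimal sketch of this experimental approach: time the program for two input sizes and fit the model t = c1·f(n) + c2 to the two measurements. The workload run_program below is a hypothetical stand-in for the real program, and an in-process measurement like this cannot see program-load time, so it only illustrates the fitting step:

#include <stdio.h>
#include <math.h>
#include <time.h>

/* Known complexity function of the algorithm (n log2 n is assumed here). */
static double f(double n) { return n * log2(n); }

/* Stand-in for the program under test: any workload whose statement
   count grows as f(n). */
static void run_program(long n) {
    volatile double sink = 0.0;
    for (long i = 0; i < n; i++)
        for (long k = 1; k < n; k *= 2)   /* about log2(n) inner steps */
            sink += 1.0;
}

static double timed_run(long n) {
    clock_t start = clock();
    run_program(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

int main(void) {
    long n1 = 1L << 16, n2 = 1L << 20;
    double t1 = timed_run(n1), t2 = timed_run(n2);
    /* Fit t = c1*f(n) + c2 to the two measurements. */
    double c1 = (t2 - t1) / (f((double)n2) - f((double)n1));
    double c2 = t1 - c1 * f((double)n1);
    printf("c1 = %.3g s per statement, c2 = %.3g s\n", c1, c2);
    return 0;
}

Using more than two input sizes and a least-squares fit would of course give a more robust estimate; two points are the bare minimum needed to separate the two constants.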

The constant c2 is a measure of the fixed cost of program execution. In other words, even if virtually no statements are executed, the amount of time c2 must always be expended. A typical situation where this might occur is invalid user input that causes the program to abort. It is important to understand that c2 is definitely not 0. In fact, its value can be quite substantial.

However, since this time penalty is always incurred, it may be tempting to regard it as insignificant. While that is certainly not justified for most programs, the fixed cost does have the advantage that there are never any surprises: we must always spend at least c2 time, even if nothing happens.
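One way to approximate c2 is to time a complete run of the program on input that makes it abort immediately, so that essentially no statements are executed. A sketch assuming a POSIX system; the program name ./prog and its invalid argument are hypothetical placeholders:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Rough estimate of c2: time an entire run, including program load,
   of an invocation that aborts before doing any real work. */
int main(void) {
    struct timespec a, b;
    clock_gettime(CLOCK_MONOTONIC, &a);
    system("./prog --invalid-input > /dev/null 2>&1");
    clock_gettime(CLOCK_MONOTONIC, &b);
    double c2 = (double)(b.tv_sec - a.tv_sec)
              + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
    printf("approximate c2: %.3f s\n", c2);
    return 0;
}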

In the next section we will discuss crossover points. These are particularly important when comparing two algorithms, and then the corresponding programs, according to some complexity measure. Here we explore a slightly different issue. Assume we have two programs with the same asymptotic (time) complexity ϕi(n). The decision of which program to use will hinge first on the constant factors for each program. However, let us assume that both have comparable factors. We may then encounter the following situation. One algorithm assumes that n is a power of 2; the complexity analysis is based on that assumption, and if n happens not to be a power of 2, the algorithm simply assumes that we pad the input so that the assumption is again satisfied. The other algorithm works for any value of n (the cost of this padding is quantified in the sketch following the footnotes below). In this case, it may

3 We are deliberately vague about the unit. One approach might be to use actual time, for example in nanoseconds. Another approach would be to assume that the unit involved is a synthetic one that allows us to maintain the simple idea of a unit statement. In this case, we would still be looking at some type of statement count, except that we now take into consideration the actual duration of this average statement.

4 In practice, one hopes that the execution time of a program falls between one half and double the predicted quantity, that is, within a factor of two of the prediction.
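Returning to the padding assumption discussed above, a short sketch shows how severe padding can be: if n is just past a power of 2, padding almost doubles the input size, so a quadratic algorithm would take roughly four times as long on the padded input and a cubic one roughly eight times. The concrete n below is arbitrary:

#include <stdio.h>

/* Smallest power of 2 that is >= n. */
static long next_pow2(long n) {
    long p = 1;
    while (p < n) p *= 2;
    return p;
}

int main(void) {
    long n = (1L << 20) + 1;        /* just past a power of 2: worst case */
    long padded = next_pow2(n);     /* nearly doubles the input size */
    double growth = (double)padded / (double)n;
    /* For a quadratic algorithm, the padded run costs about growth^2
       times the unpadded one; for a cubic one, about growth^3. */
    printf("n = %ld, padded = %ld, factor %.3f\n", n, padded, growth);
    return 0;
}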
