6 Implications of Compiler and Systems Issues

6.2 Recursion and Space Complexity

Recursion is an elegant tool that allows us to express solutions to complex problems concisely.2 Recursion is, however, not always as efficient as one might expect. Apart from the problems, already touched upon, of outright abuse of recursion (e.g., computing the Fibonacci numbers recursively, with exponential time complexity, even though a halfway decent iterative implementation takes no more than linear time), recursion also has implications for space complexity that are easy to ignore. While any rigorous space complexity analysis will account for the memory requirements resulting from recursion, in particular the recursion stack, there are more subtle implications for programs.
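The Fibonacci contrast mentioned above can be made concrete with a short sketch (Python is used purely for illustration; the function names are our own, not from the text):

```python
def fib_recursive(n):
    # Exponential time: each call spawns two more, re-solving the same
    # subproblems over and over (roughly 2^n calls for input n).
    if n < 2:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    # Linear time, constant space: carry only the last two values.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Both compute the same values (e.g., the 20th Fibonacci number, 6765), but the recursive version becomes unusable already for inputs around 40, while the iterative one remains instantaneous.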

While it is convenient to start with a recursive formulation of an algorithm, and in some cases virtually unavoidable (see the Towers of Hanoi problem or AVL trees), when implementing an algorithm, programmers tend to avoid recursion for various reasons. Some programming languages simply do not support recursion. Many programmers are reluctant to use recursion, either because they do not understand it sufficiently well (for example, many find out the hard way that failure to specify the basis case correctly can lead to catastrophic failures) or because the performance of the resulting software is frequently worse than they expected.

If the recursion at hand is tail recursion (see Chapter 3), its elimination is highly desirable: recursion unavoidably incurs a space penalty for the allocation of the recursion stack, while the technique that allows us to replace tail recursion by an iterative process requires no recursion stack and no other data structure that would need additional space. Thus, replacing tail recursion by iteration materially improves the space complexity of the resulting program. Moreover, since the implementation of recursion in a programming environment is a fairly complicated process (even if the instance at hand is simple, compilers tend to treat all types of recursion identically, which means the handling required for the most complicated situation is applied even to simpler ones), the time complexity of the iterative version is usually also better.
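As a sketch of this transformation (the code and its names are illustrative, not taken from the text), consider a tail-recursive summation and its mechanical iterative rewrite:

```python
def sum_to_tail(n, acc=0):
    # Tail-recursive: the recursive call is the last action, so the
    # current activation record is never needed again -- yet a naive
    # implementation still pushes a new stack frame per call,
    # using O(n) stack space for input n.
    if n == 0:
        return acc
    return sum_to_tail(n - 1, acc + n)

def sum_to_iterative(n):
    # The mechanical rewrite: the accumulator parameter becomes a
    # local variable and the recursive call becomes a loop iteration.
    # O(1) space, no recursion stack at all.
    acc = 0
    while n > 0:
        acc += n
        n -= 1
    return acc
```

The rewrite is purely mechanical because a tail call carries all surviving state in its arguments; updating the loop variables plays exactly the role of passing new arguments.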

2 To wit, the solution of the Towers of Hanoi problem (Section 3.1) cannot be understood without recursion. Insertion and deletion in AVL trees (Section 3.2) are only slightly less unintelligible without recursion.

 

C6730_C006.fm  Page 142  Friday, August 11, 2006  9:21 AM

If the recursion at hand is not tail recursion, the situation is much more complicated. While recursion can always be eliminated, this process involves the explicit administration of the information that would automatically be administered by the run-time support system in the recursion stack.3 While one may avoid allocating space for some of the items that are part of the recursion stack, this saving tends to be small. Thus, if the recursion cannot be replaced by iteration without additional data structures, it may be advisable to stay with the recursive implementation. An added advantage is the increased readability and therefore maintainability of the software.
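One way to picture this explicit administration is the following sketch, which eliminates the (non-tail) recursion of the Towers of Hanoi solution by pushing hand-made "activation records", the parameters plus a reentry-point flag, onto an explicit stack; the code and its names are illustrative assumptions, not from the text:

```python
def hanoi_recursive(n, src, aux, dst, moves):
    # The standard recursive solution: move n-1 disks aside, move the
    # largest disk, then move the n-1 disks on top of it.
    if n == 0:
        return
    hanoi_recursive(n - 1, src, dst, aux, moves)
    moves.append((src, dst))
    hanoi_recursive(n - 1, aux, src, dst, moves)

def hanoi_explicit_stack(n, src, aux, dst):
    # Each tuple is a hand-built activation record: the parameters
    # plus a phase flag that plays the role of the reentry point.
    moves = []
    stack = [(n, src, aux, dst, 0)]
    while stack:
        n, src, aux, dst, phase = stack.pop()
        if n == 0:
            continue
        if phase == 0:
            # Remember where to resume (phase 1) after the first
            # "recursive call", then perform that call.
            stack.append((n, src, aux, dst, 1))
            stack.append((n - 1, src, dst, aux, 0))
        else:
            moves.append((src, dst))
            stack.append((n - 1, aux, src, dst, 0))
    return moves
```

Note that the explicit version stores essentially the same information as the run-time system's recursion stack, which is why the space saving from such a rewrite tends to be small.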

One significant aspect of recursion is the variability of the size of the recursion stack. Typically, it is not possible to predict the exact size that the stack can attain; in most cases, we can only derive an upper bound. For example, in QuickSort, a fundamentally recursive algorithm (although iterative versions exist, their space complexity is essentially the same as that of the recursive version), the upper bound for the worst-case space complexity is O(n) when sorting n numbers. This is because in the worst situation for QuickSort, the number of activation records on the recursion stack can be O(n). One may point out that the worst case occurs exceedingly rarely; indeed, since the average case requires an amount of space proportional to the height of the tree structure that describes the recursive calls, and since this height is O(log2(n)) on average, one might be seduced into believing that O(log2(n)) space is sufficient on average.
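This gap between average and worst-case stack height can be observed directly. The following instrumented QuickSort (an illustrative sketch, not from the text) reports the peak recursion depth, i.e., the maximum number of activation records on the stack, instead of merely sorting:

```python
import random

def quicksort_depth(a, lo, hi, depth=1):
    # Textbook quicksort with the last element as pivot; returns the
    # deepest recursion level reached (peak recursion-stack height).
    if lo >= hi:
        return depth
    pivot, i = a[hi], lo
    for j in range(lo, hi):              # partition around the pivot
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return max(quicksort_depth(a, lo, i - 1, depth + 1),
               quicksort_depth(a, i + 1, hi, depth + 1))

n = 200
shuffled = quicksort_depth([random.random() for _ in range(n)], 0, n - 1)
presorted = quicksort_depth(list(range(n)), 0, n - 1)
# shuffled is typically close to log2(n); presorted reaches n:
# an already sorted input forces one activation record per element.
```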

The problem with this argument is the following: Suppose we allocate a certain amount of memory for the use of QuickSort and assume this amount is well above the requirements for the average case of QuickSort. What will happen if one of those exceedingly rare, worse than average cases occurs?

If the memory is fixed (that is, the size of the allocated memory cannot be exceeded), the program simply crashes. This is not what one expects when looking primarily at the time complexity. One might perhaps expect that the program will take much longer than predicted but that it will nevertheless terminate correctly. If the program crashes for lack of memory, this expectation is sadly wrong. However, if exceeding the allocated space results in the invocation of virtual memory management (VMM), the run time of the program, already very bad since we are considering a case with bad time complexity, may become dramatically worse because of the additional time required to do page transfers from and to disk.
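The crash scenario is easy to reproduce. In the sketch below (illustrative; Python's recursion limit stands in for a fixed-size stack allocation), a naive QuickSort that comfortably handles shuffled input overflows the stack on already sorted input of the same size:

```python
import random
import sys

def quicksort(a, lo, hi):
    # Naive quicksort, last element as pivot: O(n) stack depth on
    # already sorted input.
    if lo >= hi:
        return
    pivot, i = a[hi], lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    quicksort(a, lo, i - 1)
    quicksort(a, i + 1, hi)

limit = sys.getrecursionlimit()
sys.setrecursionlimit(500)            # a fixed "average case" stack budget
try:
    data = list(range(1000))
    random.shuffle(data)
    quicksort(data, 0, 999)           # shuffled: depth ~ log2(n), fits easily
    quicksort(list(range(1000)), 0, 999)   # sorted: depth ~ n, overflows
    crashed = False
except RecursionError:
    crashed = True                    # out of stack space: the program "crashes"
finally:
    sys.setrecursionlimit(limit)
```

The time complexity of both runs is finite; it is the space requirement, not the running time, that kills the second one.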

It should now be clear that recursion can have major implications for the overall behavior of a program. To emphasize this point, let us reexamine the conventional wisdom about QuickSort. It is generally accepted that QuickSort is faster than any other sorting program on average. Thus, the notion persists that QuickSort should be used for applications that are not time critical. However, even if we concentrate on average case behavior, it is

3 At a minimum, copies of all parameters, copies of all local variables, and the reentry point (place in the program where execution is to resume when the recursive call returns) are stored in a separate activation record for each recursive call. This is true even if some or most of these data items are not necessary in a specific instance.

 



probably unacceptable that the program crashes occasionally.4 Therefore, it is highly advisable to allocate an amount of memory for the recursion stack that corresponds to the worst-case situation. We must prudently conclude, then, that QuickSort has a space complexity of O(n). This is no better than that of MergeSort, which has a time complexity comparable to QuickSort's on average and a much better worst-case time complexity. It is also much worse than the space complexity of HeapSort. We summarize the various complexities of these three sorting algorithms in Table 6.1.

The upshot is that from a practical point of view, it is very difficult to see why anyone would want to use QuickSort to sort arbitrary sets of numbers.

QuickSort requires as much space as MergeSort (proportional to n, and in absolute terms actually more, since the recursion stack requires more space per recursive call than one array element) and is slower in both the average case (slightly) and the worst case (significantly), while HeapSort requires significantly less space than QuickSort, is significantly faster than QuickSort in the worst case, and is only marginally slower on average.
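To make the HeapSort entry of Table 6.1 concrete, here is a sketch of an in-place heapsort (illustrative code, not from the text): apart from a few scalar variables it needs no auxiliary storage, which is the O(1) space bound, and every phase is O(n.log2(n)) in the worst case:

```python
def heapsort(a):
    # Sorts the list a in place: O(1) auxiliary space,
    # O(n.log2(n)) worst-case time.
    def sift_down(root, end):
        # Restore the max-heap property for the subtree at `root`,
        # considering only indices up to `end` (inclusive).
        while 2 * root + 1 <= end:
            child = 2 * root + 1
            if child + 1 <= end and a[child] < a[child + 1]:
                child += 1                      # pick the larger child
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    n = len(a)
    for start in range(n // 2 - 1, -1, -1):     # build the max-heap
        sift_down(start, n - 1)
    for end in range(n - 1, 0, -1):             # repeatedly extract the max
        a[0], a[end] = a[end], a[0]
        sift_down(0, end - 1)
    return a
```

Note also that the sift-down loop is iterative, so heapsort incurs no recursion stack at all, in contrast to QuickSort's O(n) prudent allocation.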

In general, it is probably a bad idea to look at average space complexity when it comes to recursion, since failure to comply with the average may result in the program's crash.5 In situations where average complexities are of interest (that is, no real-time, time-critical applications), it is far more sensible to assume the worst-case space complexity. If we are wrong with our average-case assumption for time, the program may take significantly longer than expected, but at least it will still terminate correctly. If we assume average space complexity, the program may instead crash, which is an unacceptable outcome, even if one were willing to accept that the program occasionally takes much longer than was predicted by the average time complexity of the underlying algorithm.

TABLE 6.1

Complexities of QuickSort, MergeSort, and HeapSort

Method       Space Complexity   Time Complexity   Time Complexity
             (Prudent)          (Average)         (Worst Case)
QuickSort    O(n)               O(n.log2(n))      O(n²)
MergeSort    O(n)               O(n.log2(n))      O(n.log2(n))
HeapSort     O(1)               O(n.log2(n))      O(n.log2(n))

4 Program crashes may have consequences that go beyond the termination of the program; they may affect the proper functioning of other processes as well. Thus, the argument that a very rare crash of a program can be handled by a programmer manually is not a good one. It is extremely bad programming practice to accept occasional crashes that could easily be avoided.

5 This assertion holds for space complexity in general, not just in conjunction with recursion. No program should be allocated less space than it needs in the worst case, since the alternative is that it may crash.

 
