

3 Examples of Complexity Analysis


3.2.6 HeapSort

This is another in-place sorting method, meaning that no additional array is needed. HeapSort is often considered inferior to QuickSort; we think unfairly so. While its average time complexity is somewhat worse than that of QuickSort, its worst-case complexity is dramatically better. Moreover, it is inherently nonrecursive and has a much better space complexity. It is, however, a more complicated algorithm, which may be off-putting to some.

C6730_C003.fm  Page 62  Friday, August 11, 2006  8:12 AM

The key concept in HeapSort is the heap. This is an array^23 A[lo:hi] of numbers with the property that, for all i with lo ≤ i ≤ hi/2, A[i] ≥ A[2·i] and A[i] ≥ A[2·i + 1], as long as those array elements exist. Heaps have a number of interesting properties; one of them is that the first element, A[lo], of any heap A[lo:hi] is the maximum of all elements in the heap. Another is that heaps are surprisingly efficient to construct.

Here is how one can construct a heap A[lo:hi]. We first observe that the second half of the array, A[(lo + hi)/2:hi], always complies with the heap condition, trivially, since none of its elements has a child within the array. We then proceed to “insert” the remaining elements of the array one by one, starting with the one in position (lo + hi)/2 − 1, then that in position (lo + hi)/2 − 2, and so on, until we reach the element in position lo. Our insertion process is such that after the insertion of the element in position i, the resulting array A[i:hi] satisfies the heap condition, for i = (lo + hi)/2 − 1, …, lo + 1, lo. Here is the first algorithm for the insertion of A[i] into A[i:hi], assuming A[i + 1:hi] satisfies the heap condition:

HeapInsert A[i:hi]
{ compare A[i] with A[2·i] and A[2·i+1], insofar as they exist;
  if A[i] ≥ max{A[2·i], A[2·i+1]}, do nothing – insertion successfully completed;
  otherwise,
    if A[2·i] < A[2·i+1] then
      { interchange A[i] and A[2·i+1]; repeat the insertion process with A[2·i+1] }
    else
      { interchange A[i] and A[2·i]; repeat the insertion process with A[2·i] }
}
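This insertion (often called a sift-down) can be sketched in Python as follows. We use 0-based indexing, so the children of position i are 2·i+1 and 2·i+2 rather than the 1-based 2·i and 2·i+1 of the text; the name heap_insert is ours:

```python
def heap_insert(a, i, hi):
    """Sift a[i] down into a[i:hi+1], assuming a[i+1:hi+1] already
    satisfies the heap condition (0-based; children of i are 2i+1, 2i+2)."""
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left <= hi and a[left] > a[largest]:
            largest = left
        if right <= hi and a[right] > a[largest]:
            largest = right
        if largest == i:       # a[i] >= both existing children: done
            return
        # interchange A[i] with its larger child and repeat one level down
        a[i], a[largest] = a[largest], a[i]
        i = largest
```

The while loop plays the role of the text's "repeat the insertion process"; since each iteration descends one level of the heap, the loop runs at most as many times as the heap is high, which anticipates the logarithmic bound derived below.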

Then we use this to construct an entire heap, given the array A[lo:hi]:

For i := (lo+hi)/2-1, (lo+hi)/2-2,…,lo do HeapInsert A[i:hi]

We apply this to the array A[1:n]. Once we have the heap A[1:n], we know that its first element, A[1], contains the maximal element of the heap, so we exchange it with the element in position n and insert the element that now sits in position 1 into the array A[1:n − 1] so that it becomes a heap. In general, we grow the sorted array from the back, making sure that the displaced element in position j is properly inserted in the shrinking heap A[1:j − 1], for j = n,n − 1,n − 2,…,2,1. Obviously, when j = 1, the remnant heap is of size 1 and the remainder of the array, A[2:n], contains all elements larger than A[1] in sorted order. Now we can formulate the HeapSort algorithm:

23 Heaps can be formulated using binary trees as well, in which case the two children of the node corresponding to A[i] correspond to A[2·i] and A[2·i + 1]. It follows from this that the binary tree corresponding to a heap is always a complete binary tree, in which only the last level of nodes may be incomplete (more specifically, the nodes of the last level are contiguous at the left; only nodes at its right end may be missing). It follows therefore that a tree representing a heap with n nodes is of height exactly ⌈log2(n + 1)⌉ − 1, which is bounded from above by log2(n).


HeapSort (A[1:n])
{ construct a heap: for i := (1+n)/2-1, …, 1 do HeapInsert A[i:n];
  for j := n, n-1, …, 2, 1 do
  { interchange A[j] and A[1];
    insert the element in position 1 into the heap A[1:j-1]
  }
}
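Putting the two phases together, a self-contained Python sketch of the whole algorithm might read as follows (our own naming; 0-based indices, unlike the 1-based pseudocode above):

```python
def heapsort(a):
    """In-place HeapSort sketch (0-based; children of i are 2i+1 and 2i+2)."""
    n = len(a)

    def heap_insert(i, hi):
        # Sift a[i] down until the heap condition holds for a[i:hi+1].
        while True:
            left, right, largest = 2 * i + 1, 2 * i + 2, i
            if left <= hi and a[left] > a[largest]:
                largest = left
            if right <= hi and a[right] > a[largest]:
                largest = right
            if largest == i:
                return
            a[i], a[largest] = a[largest], a[i]
            i = largest

    # Phase 1: construct the heap a[0:n] (positions n//2 .. n-1 are leaves).
    for i in range(n // 2 - 1, -1, -1):
        heap_insert(i, n - 1)

    # Phase 2: repeatedly move the maximum to the back and re-insert
    # the displaced element into the shrinking heap a[0:j].
    for j in range(n - 1, 0, -1):
        a[0], a[j] = a[j], a[0]
        heap_insert(0, j - 1)
    return a
```

Note that the sort works entirely within the input array; the only extra storage is a handful of loop variables, in line with the constant space complexity discussed at the end of this section.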

We now come to the complexity analysis of HeapSort. We first look at HeapInsert. Given a heap A[j:hi] with m = hi − j + 1 elements, there are at most log2(m) iterated insertions (insertions that are the consequence of extracting the larger of A[2·j] and A[2·j + 1] and interchanging it with A[j]).

Thus, the complexity of HeapInsert A[j:hi] is O(log2(hi − j + 1)). When constructing the heap A[1:n], this complexity is bounded from above by O(log2(n)), and since there are only about n/2 elements that must be processed using HeapInsert, the complexity of constructing the heap A[1:n] is O(n·log2(n)).^24 Then we must examine the process of taking the element in position 1 (which is always the maximum of the remnant heap A[1:j]) and exchanging it with that in position j; each of these operations entails one HeapInsert into a heap of size j − 1. Again, we bound the complexity of this operation by O(log2(n)), and since there are n elements altogether, this part of HeapSort requires O(n·log2(n)) time. Thus, the total time (word) complexity of HeapSort is O(n·log2(n)).

This analysis is for the worst case that can occur. However, it should be clear that any gap between the worst-case and best-case analyses is minimal. The first part of HeapSort (the construction of the heap) can be done in linear time (see footnote 24), and this is the same for the worst and best cases; the second part consists of inserting elements that necessarily will be small (loosely speaking, in any heap, the last positions contain the small elements and the first positions the large elements). This implies that there will be numerous interchanges, indicating that the upper bound of O(log2(n)) for the insertion operations is close to optimal. Since the average complexity always falls between the best-case and the worst-case complexity (regardless of how we define average), HeapSort's complexity is O(n·log2(n)) in all cases: average, worst, and best.

The space complexity of HeapSort is much easier to determine. There is no recursion, and the iterations that occur require only simple loop variables.

24 It turns out that we are overly generous in this analysis; a finer analysis shows that the complexity of constructing a heap with n elements is only O(n). This is because about n/2 elements need not be inserted at all (the half with the higher position numbers), n/4 elements need at most one interchange, n/8 need at most two, and in general, n/2^r elements need at most r − 1 interchanges (r = 1, …, log2(n) − 1) when doing the repeated insertions in the heap. Summing this up yields O(n) (see the first footnote on page 12 of Chapter 1). However, since the second part of HeapSort requires time greater than linear (otherwise it would be a contradiction to the lower bound we derived in Section 1.9), it is not particularly urgent to carry out the most exacting analysis for the construction of a heap.
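The summation the footnote alludes to can be carried out explicitly; with at most r − 1 interchanges for each of the n/2^r inserted elements, the total number of interchanges is bounded by

```latex
\sum_{r=1}^{\log_2 n - 1} (r-1)\,\frac{n}{2^{r}}
  \;\le\; n \sum_{r \ge 1} \frac{r-1}{2^{r}}
  \;=\; n \left( \sum_{r \ge 1} \frac{r}{2^{r}} - \sum_{r \ge 1} \frac{1}{2^{r}} \right)
  \;=\; n\,(2 - 1) \;=\; n \;=\; O(n),
```

using the standard identities Σ r/2^r = 2 and Σ 1/2^r = 1.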


Also, no other additional space is needed to carry out the algorithm, except for interchanging elements, which can be done in constant space. Therefore, the space complexity of HeapSort is constant. This should be contrasted with MergeSort and QuickSort, both of which have significantly greater space complexity.