
3.2.7 RadixSort

This is the odd man out in our list of sorting methods. It does not use any comparisons at all, and it seems to contradict the lower bound we derived in Section 1.9. Furthermore, its performance is usually measured in word complexity, but the algorithm itself is distinctly bit- or digit-oriented. Specifically, it uses the positional representation of integers and relies on the fact that there is a fixed number D of distinct digits.25 In the case of binary numbers, we have D = 2; in the case of decimal numbers, D = 10. Conceptually, RadixSort maintains one “bucket” for each digit (it is also known as BucketSort for this reason). If the numbers involved have m digits, with position m the least significant and position 1 the most significant, RadixSort proceeds as follows:

Initially in stage m, we have the given n numbers N1, …, Nn, each with m digits.

for j:=m,m-1,…,2,1 do

{ for each number N in the order as listed in Stage j, examine its jth digit q and place N into the bucket with number q;

Create Stage j-1 by sequencing the numbers as follows: first all the numbers in Bucket 0, then all the numbers in Bucket 1, etc., until last come all the numbers in Bucket D-1.

}

Stage 0 contains the given n numbers sorted in ascending order.
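
The following is a minimal Python sketch of this procedure (not from the text; the function and variable names are illustrative). It assumes non-negative integers with at most m digits in base D = 10 and extracts the jth digit arithmetically rather than keeping explicit stages:

import math

def radix_sort(nums, m, D=10):
    # Sketch of RadixSort: nums are non-negative integers with at most m digits in base D.
    stage = list(nums)                          # Stage m: the numbers in their given order
    for j in range(m):                          # process digits from least to most significant
        buckets = [[] for _ in range(D)]        # one bucket per digit value 0, 1, ..., D-1
        for x in stage:
            q = (x // D**j) % D                 # the digit of x with weight D**j
            buckets[q].append(x)                # stable: preserves the order of the current stage
        stage = [x for b in buckets for x in b] # next stage: Bucket 0 first, ..., Bucket D-1 last
    return stage                                # Stage 0: the numbers in ascending order

For example, radix_sort([329, 457, 657, 839, 436, 720, 355], 3) returns the list in ascending order after three bucket passes.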

To determine the time complexity, we observe that in each stage we examine each number and place the number into the bucket indicated by a digit of the number; this operation requires constant time for each number, provided we manipulate pointers to the numbers rather than the entire numbers themselves. Therefore, the work to be carried out in one stage is O(n). There are m stages; consequently, the overall time (word) complexity is O(m·n).

There is no difference between worst-case, best-case, or average time complexity. The space complexity is O(D), provided we do not count the pointers to the numbers (if we do consider the space for the pointers, the space complexity is O(max{D,n})).

25 RadixSort can also be applied to words over an alphabet A, in which case the letters in the underlying alphabet A take the place of the digits. However, while words are sorted similarly to numbers, in that the first letter or digit is most significant for the order and the last letter or digit is least significant, words tend to have differing lengths, whereas we assume here that all numbers have the same number of digits. This would be appropriate for numbers represented by a fixed number of bytes, which is the paradigm of word complexity.


Let us examine the time complexity more closely. Since m is fixed, O(m·n) is faster than O(n·log2(n)); therefore, RadixSort appears to contradict the lower bound on sorting we derived in Section 1.9. There are two responses to this. The first one is of a technical nature: The lower bound in Section 1.9 was on the number of comparisons when sorting by comparing two numbers.

This, however, is not how RadixSort sorts. There is not a single comparison in the entire algorithm and therefore the lower bound cannot be applied at all. However, in a way this answer misses the point. Implicitly, we took the lower bound in Section 1.9 to be much more universal; we really assumed that the O(n·log2(n)) time complexity implied by the comparison argument was applicable to all sorting methods. While this is technically incorrect (the argument applied only to comparisons), RadixSort nevertheless does not provide a counterexample. This can be seen as follows.

Recall that m is the number of digits. In Section 1.4, when we discussed bit and word complexity, we made the argument that it is inappropriate to assume that the number of bits (or digits) is independent of n, the number of elements to be sorted. Specifically, we indicated that in order to represent n different elements, we need at least log2(n) bits; this generalizes for numbers in base D to logD(n) digits. Thus, m and n are not independent; m must always be at least logD(n). This brings us back to the lower bound: given the relationship between m and n, it now follows that RadixSort has a complexity that is very much like O(n·log2(n)), since logD(n) = log2(n)/log2(D) and D is a constant, so O(m·n) with m ≥ logD(n) is at least proportional to n·log2(n).

The next group of problems we examine centers around searching. There are two different aspects of this notion. We may search for a given element in a collection of elements or we may have some kind of index and want to retrieve the element with that index. Typical examples are looking up a word in a dictionary (first aspect) and finding the smallest number of a set of numbers (second aspect).

Searching for a given element x in a set of elements is interesting from the point of view of complexity, because the problem displays very different behaviors depending on how the set is represented. Let us assume that the elements are contained in an array A[1:n]. In Section 1.3 we analyzed one facet of this problem, namely the case where the array is not sorted.26 As we pointed out there, in this case a significant gap arises between average and worst-case complexity (using most realistic definitions of average). Still, searching for a given element in an unordered array containing n elements requires O(n) time. This linear or sequential search is very unattractive, especially if several such searches are to be performed. The situation changes dramatically if the array is sorted.
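
As a point of reference (a sketch, not from the text; it uses 0-based Python indexing), the sequential search just described is simply a scan:

def linear_search(A, x):
    # Sequential search in an unsorted list: O(n) comparisons in the worst case.
    for i, a in enumerate(A):
        if a == x:
            return i       # x found at index i
    return None            # x does not occur in A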

26 It is true that our analysis in Section 1.3 assumed a linear list instead of an array, but for the approach we examined, the random access property of the array does not imply any advantage over the sequential access associated with a linear list. Therefore, we use an array to preserve the uniformity of our presentation. Furthermore, when we turn to binary search, the random access property of the set representation is indispensable.


3.2.8 Binary Search

Assume that the array A[1:n] containing n elements is sorted; our task is to determine whether a given element x is in A, and if so, what its index is.

The key idea of binary search is the following: Given a search space A[lo:hi], we first determine the element in the middle, namely A[m] with m = (lo + hi)/2 (if lo + hi is odd, round to either of the two adjacent integers); if A[m] = x, then we have found x in location m; otherwise we repeat our search in the smaller search space A[lo:m − 1] if x < A[m] or A[m + 1:hi] if x > A[m]. Termination then is achieved either if x is found in a specific position (successful search) or if the search space is so small that x cannot be in it, that is, if lo > hi (unsuccessful search).
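
A minimal iterative Python sketch of this idea (not from the text; it uses 0-based indexing, so lo, hi, and m play the roles described above):

def binary_search(A, x):
    # A must be sorted in ascending order; returns an index of x, or None if x is not in A.
    lo, hi = 0, len(A) - 1
    while lo <= hi:              # the search space A[lo:hi] is nonempty
        m = (lo + hi) // 2       # middle element of the current search space
        if A[m] == x:
            return m             # successful search
        elif A[m] < x:
            lo = m + 1           # x can only lie in the upper half
        else:
            hi = m - 1           # x can only lie in the lower half
    return None                  # lo > hi: unsuccessful search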

The complexity of this algorithm is directly related to the number of search space reductions that can occur when starting with the search space A[1:n].

Each such reduction cuts the size of the search space in half; thus, no more than log2(n) such reductions (halvings) can occur before the search space size is less than 1 (which means that x cannot be found in it). Since the amount of work required to carry out one reduction is constant, it follows that binary search requires O(log2(n)) time. While this is the worst-case complexity, the average time complexity is about the same. It is especially interesting to note that binary search retains its efficiency if the search is unsuccessful. The space complexity of binary search is O(1) since we only need space to keep track of the upper and lower bounds of the current search space. Note that no recursion is required; the search is entirely iterative.

It is important to realize the enormous improvement in the performance of searching for an element that is caused by the assumption of order. If A[1:n] is sorted, the time complexity is O(log2(n)), but if it is not, it is exponentially slower, namely O(n). Thus, if one is to do several searches on the same data set, very often it pays to invest in sorting the array A[1:n] first, an unproductive activity with a high cost of O(n·log2(n)), and then do s searches at a cost of O(log2(n)) each. Contrast this with the cost of s searches based on an unsorted array. Clearly, sorting is more economical if

O(n·log2(n)) + s·O(log2(n)) < s·O(n).

This is precisely the case for all s > O(log2(n)).
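
To make the trade-off concrete, here is a small illustrative computation (a sketch with all constant factors set to 1, so the crossover point is only indicative):

import math

def cost_with_sorting(n, s):
    # Sort once (about n*log2(n) steps), then s binary searches (about log2(n) steps each).
    return n * math.log2(n) + s * math.log2(n)

def cost_without_sorting(n, s):
    # s linear searches on the unsorted array (about n steps each).
    return s * n

n = 1_000_000                     # log2(n) is roughly 20
for s in (5, 20, 100):
    print(s, cost_with_sorting(n, s) < cost_without_sorting(n, s))
# Prints False, False, True: with these crude cost models, sorting pays off
# once the number of searches s exceeds roughly log2(n).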

Let us now turn to the other aspect of searching, namely finding an element with a given index in a list of numbers.27 Specifically, assume we are given an array A[1:n] and an integer K with 1 ≤ K ≤ n. Our task consists of determining the Kth-largest of these n elements.

27 Strictly speaking, it can be applied to any type of element that has a total ordering. In other words, given two such elements a and b, with a ≠ b, either a precedes b or b precedes a. Numbers have this property, as do words using the lexicographical ordering. Subsets of a set S do not; if S = {x,y,z} and a = {x,y} and b = {x,z}, then a does not contain b and b does not contain a. Note that linear search requires only testing for equality, while binary search requires a total order as well, because otherwise the array could not be sorted. Since finding the Kth largest element does not rely explicitly on sorting, it is prudent to point out that a total order is nevertheless required.


There are certain specific values of K for which this problem is particularly important. If K = 1, we want to find the maximum (which we have already discussed). For K = 2, we might first find the maximum and then the maximum of the remaining elements. If K is large, this process is not particularly attractive. For example, if K = O(n),28 it would lead to an algorithm with a time complexity of O(n²),29 a truly awful performance, since we could just sort the array (in time O(n·log2(n))) and then access the Kth-largest element in constant time (since the array permits direct access to the element with index n − K + 1, and accessing the element with that index, by virtue of the array’s random access property, requires O(1) time). It is therefore interesting to see that solving our problem can be done much more efficiently than resorting to sorting.
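
For reference, the sorting-based baseline just mentioned is a one-liner in Python (a sketch; with 0-based indexing, index n − K corresponds to the book’s index n − K + 1):

def kth_largest_by_sorting(A, K):
    # Sort in O(n*log2(n)) time, then access the Kth largest element in O(1) time.
    return sorted(A)[len(A) - K]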

3.2.9 Finding the Kth Largest Element

Consider the following recursive approach; note that A is not assumed to be sorted:

Select(A[1:n],K)

1. Randomly choose a pivot element m from A[1:n].

2. Use m to construct the sets L, E, and G of those elements that are strictly smaller, equal, and strictly greater than m, respectively:

For i:=1,…,n do

if A[i]=m then add A[i] to E

else if A[i]<m then add A[i] to L else add A[i] to G

During the construction of L, E, and G, also count their elements cL, cE, and cG, respectively.

3. If cG ≥ K, then return Select(G,K); else if cG+cE ≥ K, then return m; else return Select(L,K-(cG+cE)).

This algorithm splits the search space A[1:n] around the pivot element m and then determines the three sets L, E, and G. If there are at least K elements in G, then we call Select recursively to determine the Kth largest element in G. Otherwise we see whether G∪E contains at least K elements; if so, m is the desired Kth largest element. Finally, if none of these cases applies, we call Select recursively again, but now with the set L, and instead of finding its Kth largest element, we find L’s element with the number K − (cG + cE), reflecting that we removed G and E from the search space and therefore K has to be reduced accordingly by the number of the removed elements in G and E.

28 A very common value is K = n/2, in which case the problem is finding the median of the set. Informally, the median of a set is the element with the property that half of the elements of the set are larger and half of the elements are smaller than the median.

29 We would spend O(t) time to find the maximum of an array of size t; first t = n (find the maximum of the entire set), then t = n − 1 (find the maximum of the remaining n − 1 elements), and so on, until t = n − K + 1. Summing this work up yields a quadratic time complexity.


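
One way to express Select in executable form is the following Python sketch (not from the text; the three list comprehensions correspond to step 2, although they scan A three times where the pseudocode scans it once):

import random

def select(A, K):
    # Returns the Kth largest element of the nonempty list A, for 1 <= K <= len(A).
    m = random.choice(A)                    # step 1: random pivot taken from the search space
    L = [x for x in A if x < m]             # strictly smaller than m
    E = [x for x in A if x == m]            # equal to m
    G = [x for x in A if x > m]             # strictly greater than m
    cL, cE, cG = len(L), len(E), len(G)     # step 2: the counts of the three sets
    if cG >= K:                             # step 3: the Kth largest lies in G
        return select(G, K)
    elif cG + cE >= K:
        return m                            # the pivot itself is the Kth largest
    else:
        return select(L, K - (cG + cE))     # adjust K for the removed elements of G and E

For example, select([7, 2, 9, 4, 5], 2) returns 7, the second-largest element.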

The most important factor in the determination of the complexity is the recursive calls to Select, and more specifically, the size of the sets involved in these calls. Initially, our search space size is n; in the single recursive call in step 3, the search space size is either cG or cL (note that there is no more than one recursive call, as at most one of the two cases with recursion can apply). It is not difficult to see that the worst case occurs if max{cG,cL} = n − 1. If this occurs in every recursive call, we have the situation of Scenario 1 of Section 3.1, with the proviso that the additional work (namely the construction of the sets L, E, and G) takes O(n) time; consequently, the worst-case time complexity is a pathetic O(n²).30

What would be a desirable situation? Recall that in binary search, the search space was split into two equal halves. Can we achieve something similar here? Suppose m were the median of the search space; then we would mirror binary search, except for step 3, which is concerned with keeping track of the index K. Of course, we do not know how to get m to be the median,31 but we can get very close. Here is how.

Replace step 1 in Select with the following steps:

1.1 Split the search space A[1:n] into groups of five elements each and sort each of these sets.

1.2 Determine M to be the set of all the medians of these five-element sets.

1.3 m := Select(M, ⌈n/10⌉).

We will refer to steps 1.1, 1.2, 1.3, 2, and 3 as the modified Select algorithm.
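
A Python sketch of the modified Select algorithm (not from the text; the termination threshold of 50 anticipates footnote 33 below, and the group size of 5 is the one used here):

def modified_select(A, K):
    # Returns the Kth largest element of A with worst-case time O(n).
    n = len(A)
    if n <= 50:                                             # small search space: sort directly
        return sorted(A, reverse=True)[K - 1]
    groups = [sorted(A[i:i + 5]) for i in range(0, n, 5)]   # step 1.1: five-element groups, each sorted
    M = [g[len(g) // 2] for g in groups]                    # step 1.2: the median of each group
    m = modified_select(M, (len(M) + 1) // 2)               # step 1.3: the median of the medians
    L = [x for x in A if x < m]                             # step 2: split the search space around m
    E = [x for x in A if x == m]
    G = [x for x in A if x > m]
    if len(G) >= K:                                         # step 3: recurse into the relevant part
        return modified_select(G, K)
    elif len(G) + len(E) >= K:
        return m
    else:
        return modified_select(L, K - len(G) - len(E))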

While it is clear that any choice of m will work, and hence the m determined in steps 1.1, 1.2, and 1.3 will also work, it is not so clear what this convoluted construction buys us. Step 1.1 groups the search space into five-element sets and sorts each of them. This requires O(n) time, since the grouping operation implies one scan of the search space and sorting five elements can be done in a constant number of comparisons (seven comparisons are sufficient to sort five numbers). Also, we can incorporate step 1.2 into this process — just take the middle element of each five-element (by now sorted) set and add it to M. How large is M? Since we have about n/5 five-element sets and we take one element from each, M has about n/5 elements. Step 1.3 then consists of determining, again recursively, the median of the set M, that is, the median of the medians of the five-element sets.32

30 To see this directly, consider that in the first call, we need O(n) time. In the second call, we need O(n − 1) time. In general, in the ith call, we need O(n − i + 1) time, for i = 1, …, n. Summing this up yields the claim of O(n²).

31 The best way we know at this point of finding the median is to find the Kth largest element with K = n/2. Since we are still struggling with this problem, looking for the median is not exactly promising.



With this specific choice of m, let us revisit the question of how large are the sets L and G determined in step 2. Let us first determine which elements cannot possibly be in L. Clearly, m cannot be in L, and since m is the median of M, half of the elements in M are ≥ m, so they cannot be in L either.

Moreover, each of these elements is the median of its five-element set, so in each such set there are two more elements that are ≥ m, namely the elements greater than or equal to its set’s median. Summing all this up, we reach the conclusion that there are at least n/10 + 2·n/10, or 3·n/10, elements ≥ m; therefore, none of them can be in L, and hence cL cannot be larger than 7·n/10. By a similar argument, one sees that there are at least 3·n/10 elements in A[1:n] that are ≤ m, so none of them can be in G, and hence cG ≤ 7·n/10.

It follows therefore that in step 3, the search space for either of the two recursive calls is no larger than 7·n/10.

Let T(n) be the time the modified Select algorithm requires for a search space with n elements. Then the recursive call in step 1.3 requires time T(n/5), and the (single) recursive call in step 3 requires time at most T(7·n/10). Since 7·n/10 < 3·n/4, we can bound T(7·n/10) from above by T(3·n/4). (Clearly, T is monotonically increasing, that is, if s < t, then T(s) < T(t).) Our final expression for T(n) is therefore

T(n) ≤ T(n/5) + T(3·n/4) + C·n,

where C·n reflects the work to be done in steps 1.1, 1.2, and 2. It follows now that

T(n) = 20·C·n

satisfies this relation. The worst-case time complexity of the modified Select algorithm for finding the Kth largest element in the set A[1:n] of n elements is therefore O(n).33 Furthermore, the space complexity is O(log2(n)), since the recursive calls require space proportional to the depth of the recursion. Since in the worst case the modified Select algorithm reduces the search space size by a factor of 7/10, such a reduction can take place at most log10/7(n) times before the search space is reduced to nothing, and that is O(log2(n)) times. Hence, the worst-case space complexity of modified Select is O(log2(n)).34

32 In modified Select, we used five-element sets. There is nothing magic about the number 5; any odd number (>4) would do (it should be odd to keep the arguments for L and G symmetric). However, 5 turns out to be most effective for the analysis.

33 We assume a suitable termination condition for Select; typically something like: if n ≤ 50 then sort directly and return the desired element. (Here, 50 plays the role of n0, a value of n so small that how one deals with the cases n ≤ n0 has little significance for the asymptotic complexity.) Under this assumption, we write T(n) = T(n/5) + T(3·n/4) + C·n and show by direct substitution that T(n) = 20·C·n satisfies the equality. On the right-hand side we get 20·C·n/5 + 20·C·3·n/4 + C·n, which adds up to exactly 20·C·n.



In comparing the initial version and the modified version of Select, one is struck by the fact that the modification, even though far more complicated, has a guaranteed worst-case complexity far superior to that of the original.

However, what this statement hides is the enormous constant that afflicts its linear complexity. On average, the original Select tends to be much faster, even though one does run the risk of a truly horrible worst-case complexity.

It is tempting to improve QuickSort by employing modified Select to determine its pivot. Specifically, we might want to use modified Select to determine the pivot x as the median of the subarray to be sorted. While this would certainly guarantee that the worst-case complexity equals the best-case complexity of the resulting version of QuickSort, it would also convert QuickSort into something like ComatoseSort. The constant factor attached to such a sort would make it at least one, if not two, orders of magnitude slower than HeapSort.

So far, all the algorithms we have explored were essentially off-line. In other words, we expected the input to be completely specified before we started any work toward a solution. This may not be the most practicable approach to some problems, especially to problems where the underlying data sets are not static, but change dynamically. For example, an on-line telephone directory should be up to date: It should reflect at any time all current subscribers and should purge former customers or entries. This requires the continual ability to update the directory. Were we to use an unsorted linear list to represent the directory, adding to the list would be cheap, but looking up a number would be a prohibitive O(n) if there were n entries. Thus, we would like to be able to use a scheme that is at least as efficient as binary search — which requires the array to be sorted. However, it would be clearly impractical to sort the entire array whenever we want to
