

1.10 Conclusion

We described several performance aspects of an algorithm and illustrated most of them. We contrasted time and space complexity, the most important performance measures of algorithms. Entirely orthogonal29 to these concepts, we distinguished worst-case, best-case, and average complexities and indicated when and why each of these concepts may be useful in practice. Yet again orthogonal, we discussed bit and word complexities. The I/O complexity of an algorithm, although introduced in this chapter, will be revisited in Part 2 in more detail, as it provides the basis for a performance measure that takes into account nonuniformity of memory accesses. On-line and off-line algorithms were contrasted, although we will primarily cover off-line algorithms. Finally, we emphasized the significance of lower bounds; it is only through their use that we can obtain an objective indication of whether an algorithm is really good.

Bibliographical Notes

Most of the material covered in this chapter is standard algorithm analysis and as such, it is presented in virtually all good algorithm books. Historically, Knuth’s The Art of Computer Programming delineates the starting point for much of this; Aho, Hopcroft, and Ullman: The Design and Analysis of Computer Algorithms is another classic. A more recent book is Kleinberg and Tardos: Algorithm Design (very comprehensive, written at a fairly high level). Also useful are Kingston: Algorithms and Data Structures, Design, Correctness, Analysis (not as advanced or as comprehensive as Kleinberg and Tardos); Purdom and Brown: The Analysis of Algorithms; as well as Gonnet: Handbook of Algorithms and Data Structures; and Levitin: Introduction to the Design and Analysis of Algorithms. By and large, this selection is a matter of personal taste. Numerous textbooks, written at varying levels, convey the majority of the material in this chapter.

Readers should choose the one they feel most comfortable with.

Not covered in most textbooks is I/O complexity. The seminal paper here is McKellar and Coffman, 1969, “Organizing Matrices and Matrix Operations for Paged Memory Systems”. Chapter 7 of Leiss: Parallel and Vector Computing, A Practical Introduction, gives an overview of I/O complexity and I/O management. This book also contains some comments about parallel complexity.

29 We consider a concept orthogonal to another one if the two are independent of each other. We can talk about worst-case or average time complexity; we can talk about worst-case or average space complexity; we can throw bit and word complexity into the mix and have three independent dimensions to manipulate.



Exercises

Exercise 1

Statement counts of entire algorithms or programs are composed of statement counts of individual program statements. This exercise addresses the constructive aspects of such a process.

For each of the following program statements, determine the best case and the worst case complexity, assuming that each simple instruction si (condition, assignment) takes one unit of time. Give your answer as an interval [best, worst] in each case.

a. Straight-line code: si1;…;sin

b. Conditional: if cond then list1 else list2, where cond is a simple condition and listi is a list of ni simple instructions, i = 1,2

c. For-loop: for i:=k to l by m do list, where list is a list of n simple instructions

d. While-loop:

Qu:= [q1; … qk]; We assume that the elements qi are all taken from a universal set U, consisting of n elements.

while Qu not empty do

{ remove the front element q of Qu;

compute a new element p in U, based on q;

if p has not yet been considered, append p to Qu }

One must make assumptions about two aspects: the amount of work required to compute p when given q and the test of whether p has already been considered. The first is entirely arbitrary, say N simple instructions, but the second is not. Since the universal set has n elements, the most effective way (assuming n is of manageable size) is to allocate a boolean array AU[1:n] that records whether item qi has been considered by setting AU[i] to true (AU is initialized to false). This operation must be factored into the determination of the statement count for this code fragment.
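A minimal Python sketch of this bookkeeping follows; it is illustrative only and not part of the original text. The helper name compute_new stands in for the arbitrary N-instruction computation of p from q, and marking the initial queue entries as already considered is an added assumption.

from collections import deque

def process_queue(initial, n, compute_new):
    # initial: the starting elements q1, ..., qk, all drawn from U = {1, ..., n}
    # compute_new(q): returns a new element p of U derived from q
    AU = [False] * (n + 1)        # AU[i] records whether element i has been considered
    Qu = deque(initial)
    for q in initial:
        AU[q] = True
    while Qu:                     # while Qu not empty
        q = Qu.popleft()          # remove the front element q of Qu
        p = compute_new(q)        # compute a new element p in U, based on q
        if not AU[p]:             # constant-time test, thanks to the boolean array
            AU[p] = True
            Qu.append(p)          # append p to Qu

# Purely illustrative successor function: map q to (2*q mod n) + 1.
process_queue([1], 10, lambda q: (2 * q) % 10 + 1)

Since every element of U can enter the queue at most once, the while-loop executes O(n) iterations; this is the bound the statement count has to capture.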

e. Once these basis cases are established, we can combine them into more complicated statements. For example, consider:

si1;
if c1 then {si2; si3}
else if c2 then {si4; si5; si6}
     else si7;
for i := k to l by m do
   if c3 then {si8; si9} else {si10; si11; si12}
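One way to approach this constructively is to attach a [best, worst] pair to each building block from parts a through c and compose the pairs. The sketch below is an illustration only, not the book’s solution; the helper names seq, cond, and loop are invented, and loop-control overhead is deliberately ignored.

SIMPLE = (1, 1)   # one simple instruction: one unit in the best and the worst case

def seq(*parts):
    # straight-line code: best and worst cases add up
    return (sum(p[0] for p in parts), sum(p[1] for p in parts))

def cond(then_part, else_part=(0, 0)):
    # if-then-else: one unit for the condition, plus the cheaper branch in the
    # best case and the dearer branch in the worst case
    return (1 + min(then_part[0], else_part[0]),
            1 + max(then_part[1], else_part[1]))

def loop(iterations, body):
    # a for-loop executing its body a fixed number of times (loop overhead ignored)
    return (iterations * body[0], iterations * body[1])

# The fragment of part e, with t iterations of the for-loop:
t = 5   # placeholder value for (l - k) div m + 1
best, worst = seq(SIMPLE,
                  cond(seq(SIMPLE, SIMPLE), cond(seq(SIMPLE, SIMPLE, SIMPLE), SIMPLE)),
                  loop(t, cond(seq(SIMPLE, SIMPLE), seq(SIMPLE, SIMPLE, SIMPLE))))
print(best, worst)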


Exercise 2

Consider the instructions in Exercise 1, but now determine the average complexity. This requires making assumptions about the likelihood that certain conditions hold true. Note that for any assumption, your answer must lie within the interval [best, worst]. Assume that:

a. Each condition has a 50% chance of being true.

b. Each condition has a 25% chance of being true.

c. Each condition has a probability of 1/n of being true.

Exercise 3

Consider the following statement counts, expressed as functions of the positive integer parameter n:

f1(n) = n² + 5n + 10

f2(n) = [f1(n)/log2(n)] · [n·log2(n) + 3n - 2]

f3(n) = f2(n)/f1(n)

f4(n) = [n + log2(n)] · f3(n)

f5(n) = f4(n)/log2(n)

f6(n) = f4(n)/n

a. Determine for each of these six functions fi the most appropriate complexity class ϕj, j∈{1,…,8}. Also, determine whether fi ≡ ϕj for that complexity class.

b. Show that the following assertions are all false: f1(n) ≡ n; f2(n) ≡ n2; f4(n) ≡ n; f5(n) ≡ n.

Exercise 4

Formulate an algorithm and determine its best-case and worst-case complexities for the following problems:

a. Find the third largest of a set of n (≥ 3) numbers.

b. Find the first instance of an element that occurs at least three times in a sorted linear list with n elements.

c. Find the first instance of an element that occurs at least three times in an unsorted linear list with n elements.


d. Find the first instance of an element that occurs exactly three times in a sorted linear list with n elements.

e. Find the first instance of an element that occurs exactly three times in an unsorted linear list with n elements.

Exercise 5

Consider the questions in Exercise 4, but determine the average complexity under the following assumption: The elements in the linear list are all taken from a universal set with N elements, and the likelihood of an element being in any location in the list is 1/N. Note that your answers will now depend not only on the number n of list elements, but also on N.

Exercise 6

Assume each block is of size 256 words, the active memory set size is 64 (i.e., at any time no more than 64 blocks may be in main memory), and the replacement strategy is pure LRU. Also assume that each array is mapped contiguously into the memory space and that the first array element is the first element in its block. For each of the code fragments below, determine the number of blocks transferred between main memory and disk (a small simulation sketch follows the list of fragments):

a. for i:=1 to 65536 do A[i] := A[65537-i]*A[i]

(assuming the array A is of type [1:65536])

b. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := A[i,j] + B[i,j]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in column-major order)

c. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := A[i,j] + B[i,j]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in row-major order)

d. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := 0.0;
   for i:=1 to 1024 do
      for j:=1 to 1024 do
         for k:=1 to 1024 do
            C[i,j] := C[i,j] + A[i,k]*B[k,j]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in column-major order)

e. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := 0.0;
   for i:=1 to 1024 do
      for j:=1 to 1024 do
         for k:=1 to 1024 do
            C[i,j] := C[i,j] + A[i,k]*B[k,j]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in row-major order)

f. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := C[j,i] + A[j,i]*B[j,i]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in column-major order)

g. for i:=1 to 1024 do
      for j:=1 to 1024 do
         C[i,j] := C[j,i] + A[j,i]*B[j,i]

(assuming the [1:1024,1:1024] arrays A, B, and C are mapped in row-major order)
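The simulation sketch promised above follows. It is an illustration only, not part of the original text; the names LRUMemory and word_address are invented, the trace covers fragment b under the stated assumptions, and only blocks read in are counted as transfers (whether dirty blocks written back should be added is left to your reading of the exercise). The other fragments can be traced by changing the loop body and the mapping order.

from collections import OrderedDict

BLOCK_SIZE = 256     # words per block
ACTIVE_SET = 64      # blocks resident in main memory at any time
N = 1024             # the arrays are [1:1024, 1:1024]

def word_address(base, i, j, column_major=True):
    # word address of element [i, j] (1-based) of an N x N array starting at base
    return base + (j - 1) * N + (i - 1) if column_major else base + (i - 1) * N + (j - 1)

class LRUMemory:
    # pure LRU over fixed-size blocks; counts blocks read in and dirty blocks written back
    def __init__(self):
        self.resident = OrderedDict()   # block number -> dirty flag
        self.reads = 0
        self.writebacks = 0

    def touch(self, address, write=False):
        block = address // BLOCK_SIZE
        if block in self.resident:
            self.resident.move_to_end(block)
        else:
            self.reads += 1
            if len(self.resident) >= ACTIVE_SET:
                _, dirty = self.resident.popitem(last=False)   # evict least recently used
                if dirty:
                    self.writebacks += 1
            self.resident[block] = False
        if write:
            self.resident[block] = True

# Fragment b: C[i,j] := A[i,j] + B[i,j], arrays mapped in column-major order.
mem = LRUMemory()
A, B, C = 0, N * N, 2 * N * N          # contiguous, block-aligned base addresses
for i in range(1, N + 1):
    for j in range(1, N + 1):
        mem.touch(word_address(A, i, j))
        mem.touch(word_address(B, i, j))
        mem.touch(word_address(C, i, j), write=True)
print("blocks read:", mem.reads, "dirty blocks written back:", mem.writebacks)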

Exercise 7

Determine for the following code how many pages are transferred between disk and main memory, assuming each page has 1024 words, the active memory set size is 300 (i.e., at any time no more than 300 pages may be in main memory), and the replacement strategy is LRU (the least recently used page is always replaced). Also assume that the [1:1024,1:1024] arrays A, B, and C have elements occupying one word each and are mapped into the main memory space:

(a) in row-major order and (b) in column-major order:

for i := 1 to 1024 do
   for j := 1 to 1024 do
      { A[i,j] := A[i,j]*B[i,j]; B[i,j] := C[N-i+1,j]*A[i,j] }

Exercise 8

Reexamine the following algorithms that we analyzed using word complexity and determine their bit complexity, assuming that each element involved has m bits. Pay attention to the fact that operations such as comparing two elements and adding or multiplying two numbers no longer take O(1) time, but that the time now depends on m.

a. Determining the largest of n elements (Section 1.2)


b. The two scenarios of finding an element in a linear list, depending on probability assumptions (Section 1.3)

c. Matrix multiplication of two [1:n,1:n] matrices

Exercise 9

Determine a lower bound on sorting n m-bit numbers by comparisons using bit complexity.

Exercise 10

Formulate a comprehensive algorithm that implements the argument made at the beginning of Section 1.9 to improve the computation of the stencil discussed at length in Section 1.6. Specifically, outline how under the stated assumption about the amount of available memory, the blocks should be sized and how the strategy for retrieving and storing back blocks is to be implemented. Then carefully analyze the number of block transfers, keeping in mind that only dirty blocks (blocks that have been written to since they were fetched from disk) need to be written back before they are replaced by other blocks.

Exercise 11

In the stencil example in Section 1.6, we assumed that there was a new matrix M' that we had to compute. It is frequently not necessary to have a second matrix. It might be acceptable to compute the result of applying the stencil in place, that is, using the same matrix M to store the new values. This creates problems since we must ensure that the old values of M, not the new ones, are used in the computations of the stencil. Thus, some temporary space must be allocated for this purpose, even though we do not need an entire matrix M' for this.

Formulate an algorithm to incorporate this idea and determine its I/O complexity, along the lines of the argument advanced in Section 1.6.
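As a rough illustration of the temporary-space idea only, the sketch below shows that two row buffers suffice to guarantee that every new value is computed from old values. It is not the exercise’s intended solution: a generic five-point stencil is assumed as a stand-in for the stencil of Section 1.6, boundary rows and columns are left untouched, and the block-transfer analysis the exercise actually asks for is not addressed here.

def stencil(up, left, centre, right, down):
    # placeholder for the actual stencil of Section 1.6 (a five-point average is assumed)
    return (up + left + centre + right + down) / 5.0

def apply_in_place(M):
    n = len(M)
    prev_row = M[0][:]                      # old values of row i-1
    for i in range(1, n - 1):               # interior rows only
        curr_row = M[i][:]                  # old values of row i
        for j in range(1, n - 1):
            M[i][j] = stencil(prev_row[j],      # old M[i-1][j]
                              curr_row[j - 1],  # old M[i][j-1]
                              curr_row[j],      # old M[i][j]
                              curr_row[j + 1],  # old M[i][j+1]
                              M[i + 1][j])      # row i+1 still holds old values
        prev_row = curr_row                 # keep row i's old values for row i+1
    return M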



2 Fundamental Assumptions Underlying Algorithmic Complexity

About This Chapter

In this chapter we formulate explicitly the assumptions underlying the complexity analysis introduced in the previous chapter. We discuss their implications and show that their effect is a significant simplification of determining desired performance measures of an algorithm. Many of the assumptions relate to some form of uniformity, be it uniformity in the way operations are counted, uniformity in accessing memory, or uniformity in the validity of mathematical identities. We also reexamine the asymptotic nature of the functions that result from determining complexities. While most of these aspects appear fairly innocuous, their discussion sets up the exploration in Part 2 of whether these assumptions remain valid when designing software based on the analyzed algorithms.

2.1 Introduction

In the previous chapter we established a conceptual framework for analyzing the performance of algorithms. In doing so we sidestepped several important issues and assumptions that are vital for the relative ease with which we manage to carry out this process. It is now appropriate to examine these assumptions in greater detail.



2.2 Assumptions Inherent in the Determination of Statement Counts

The first leap of faith we had to make when developing the theory of operation or statement counts had to do with the assertion that all statements are comparable in complexity. This obscured a number of rather thorny issues, which we attempt to clarify here. First at issue is the question of what operations can be considered atomic. Closely related is the area of memory access, in particular, its random access property that is implicitly assumed whenever we deal with algorithms.

At the heart of the assumptions of this section is the equivalence of atomic operations and statements. Recall that our treatment in Chapter 1 suggested that a statement essentially consists of no more than a constant number of atomic operations. Since the asymptotic nature of our performance measures allows us to hide constant factors, the fact that one statement may consist of several atomic operations may be conveniently swept under the rug — provided we can ascertain that the number of operations involved in a statement is indeed a constant; that is, it must be independent of the data structure to which the operations are applied. This is neither obvious, nor is it always true. Therefore, we must delve a bit deeper into this question.

First we must clarify what we mean by atomic operation. We have already obfuscated a bit by introducing two notions of complexity: bit and word complexity. An atomic operation in bit complexity is simply an operation that involves a single bit of each of its operands. Note that we usually assume that operations are binary, so there would be two bits involved, one from each of the two operands. However, there are also unary operations (for instance, negation) as well as operations with more than two operands. At any rate, an atomic operation (in either bit or word complexity) can have only a fixed constant number of operands. An analogous definition applies to word complexity, but now the operation applies to a word rather than a single bit of each operand. As indicated, this is somewhat confusing because the word length is not necessarily fixed across different architectures. On the one hand, there are 16-bit words, 32-bit words, and 64-bit words in different architectures; on the other hand, by its very nature, word complexity will assume that the word is long enough to accommodate whatever space is needed for a given data item, say an integer or a real number.

As we pointed out in Section 1.4, we need at least log2(n) bits to represent n different numbers, but in word complexity, the space for such a number is simply considered one word, and an atomic operation on such words is assumed to take one unit of time. This is where the two complexity measures differ; if we have words of length m, an operation such as comparison of two words (numbers) takes one unit of time using word complexity, but m units of time using bit complexity. For other operations, such as multiplication of numbers, the difference is even greater. Thus, an operation that would be considered atomic within the context of word complexity might not be viewed as atomic under the rules of bit complexity. Therefore, it is very important to be aware of the context (bit or word complexity), as the atomicity of an operation depends on it.
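A small sketch (not from the book’s text) makes the difference tangible: comparing two m-bit numbers bit by bit costs up to m bit operations, whereas the same comparison counts as a single atomic step under word complexity.

def bit_compare(x, y, m):
    # compare two m-bit nonnegative integers bit by bit, most significant bit first;
    # returns -1, 0, or 1 together with the number of bit operations performed
    bit_ops = 0
    for k in range(m - 1, -1, -1):
        a = (x >> k) & 1
        b = (y >> k) & 1
        bit_ops += 1
        if a != b:
            return (1 if a > b else -1), bit_ops
    return 0, bit_ops

print(bit_compare(1000, 999, 10))   # unequal numbers may let the loop stop early
print(bit_compare(1000, 1000, 10))  # equal numbers force all m bit positions to be inspected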

It is instructive to contrast this with the mathematical operation of adding two-dimensional (2D) matrices. This operation is not atomic under the rules of either bit or word complexity. Ultimately, the reason is that the size of the matrices affects the amount of work required to carry out this computation.

Clearly, if the two matrices are [2,2], less work is required to add them than if they are of size [1000,1000]. In general, if the two matrices are of size [1:n,1:n], we need n² additions of two scalar numbers, so even under word complexity rules, the time complexity is O(n²). Under bit complexity rules, the length of the scalars must also be considered. Assuming it is m (and ordinarily m > log2(n), as there are 2·n² scalars1), the time complexity becomes O(m·n²). Note that m here is not a constant that can be hidden in the order-of notation, simply because it is not a constant; it generally depends on n (increasing n requires increasing m).
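The n² reasoning can be spelled out in a few lines. The following sketch is illustrative only (it is not taken from the text): it performs the matrix addition, counts the scalar additions, and charges each of them one unit under word complexity and roughly m units under bit complexity.

def add_matrices(A, B, m):
    # add two n x n matrices of m-bit entries and report both cost measures
    n = len(A)
    scalar_adds = 0
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            C[i][j] = A[i][j] + B[i][j]
            scalar_adds += 1
    word_cost = scalar_adds           # n*n units under word complexity
    bit_cost = m * scalar_adds        # roughly m*n*n units under bit complexity
    return C, word_cost, bit_cost

# Two 2 x 2 matrices of 8-bit entries: 4 scalar additions, about 32 bit operations.
print(add_matrices([[1, 2], [3, 4]], [[5, 6], [7, 8]], m=8)[1:])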

Now we are ready to tackle statements. As long as a statement contains only a fixed number of atomic operations, the equivalence (up to a constant factor) of operations and statements is valid. This applies to both bit and word complexity since it hinges on the atomicity of the operations (which depends on the context). Typical statements might be

X := 2*X + 1,

where X is a scalar (valid for word or bit complexity, as long as X is a scalar within the word or bit context) or

C[i,j] := C[i,j] + A[i,k]*B[k,j],

where A, B, and C are 2D matrices (of arbitrary size; for bit complexity with bit matrices and for word complexity with word matrices). The first statement involves one multiplication, one addition, and one assignment. As we pointed out, we consider atomic operations to be comparable as far as their time requirements are concerned. This is a reasonable assumption because in virtually all computer architectures, the time for a scalar multiplication is only a few times, perhaps five times, longer than the time of an addition.2 In general, one may assume that the basic arithmetic operations are comparable; that is, the effort required to do the slowest is only a small constant times the effort to do the fastest.

1 Boolean matrices are frequently represented as integer matrices. They would be an exception to this rule of m > log2(n). Boolean matrices are, for example, used to represent graphs.

2 This uses the fact that the word length is limited to 16, or 32, or 64 bits. This is of course true for all of today’s commercial architectures (as of 2005). However, this statement would no longer be valid if arbitrarily long words were supported by a specific architecture.


Assignment is much more complicated. The converse of assignment, retrieval, is equally thorny. The issue is one of access to memory, whether we want to retrieve data or store data. Ordinarily, this issue is avoided since we concentrate on operations, without worrying where the values come from or where the results are stored. Underlying this lack of concern is the fundamental assumption of algorithm analysis that memory accesses are simple, cheap, and fast. Consequently, one invariably assumes that retrieval of arguments and storing of results can be subsumed in the time required to carry out the operations at hand; in other words, retrieving and storing are considered equivalent to carrying out an atomic operation (under either bit or word rules). This is an assumption that bears careful examination.

We will distinguish between access to simple variables, such as the variable X in the first statement above, and access to elements of more complicated structures, such as the array elements in the second statement. To simplify the presentation, let us assume that we are considering word complexity only.

In virtually all commercial computer architectures, access to a unit of memory takes an amount of time that is comparable to (i.e., within a relatively small constant factor of) the time required to carry out an atomic arithmetic operation, provided the unit resides in main memory. This holds because main memory possesses the random access property (RAP). This means that any unit in main memory has a unique index, that specifying this index provides access to the content of the unit thus indexed, and that the time to carry out this access is independent of the value of that index.

Whether the value of the index is large or small makes no difference in the access time. Thus, the RAP is crucial for uniform memory access. It is the primary reason why it is justified to treat retrieval of a value from main
