
6.3 Dynamic Structures and Garbage Collection

Dynamic data structures are data structures whose size can vary during the execution of a program or from one run of the program to the next. Because their size is not known at compile time, processes must be set up that allow the allocation of dynamic data structures during run time. Numerous data structures are considered dynamic data structures, among them trees, linked lists, queues, and stacks. In some cases instances of these structures are represented by a fixed-size array, but then they are no longer true dynamic data structures. These simulated structures will fail to function properly once their maximum size (as hard-coded into the array representation) is exceeded. A key aspect of true dynamic data structures is the absence of the test of the type

if data structure full then reject insert operation.

In other words, a dynamic data structure can never be full. It is unbounded in size, at least when used in an algorithm.
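
To make the distinction concrete, consider the following small sketch in C (the types and names are ours, chosen purely for illustration): a stack simulated by a fixed-size array must perform exactly the rejection test above, whereas a linked-list stack simply requests a new node for every insertion and is never full; its insertions fail only if the underlying memory pool itself is exhausted, which is the software-level concern we turn to next.

    #include <stdlib.h>

    #define MAX 100                     /* hard-coded capacity of the simulated stack */

    /* Array-based "stack": only simulates a dynamic structure. */
    typedef struct { int data[MAX]; int top; } ArrayStack;

    int array_push(ArrayStack *s, int x) {
        if (s->top == MAX) return 0;    /* "data structure full": insert rejected */
        s->data[s->top++] = x;
        return 1;
    }

    /* Linked-list stack: a true dynamic structure, no fullness test. */
    typedef struct Node { int value; struct Node *next; } Node;

    int list_push(Node **top, int x) {
        Node *n = malloc(sizeof *n);    /* run-time allocation request */
        if (n == NULL) return 0;        /* fails only if the memory pool is exhausted */
        n->value = x;
        n->next = *top;
        *top = n;
        return 1;
    }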

The unboundedness of dynamic data structures is a reasonable assumption for algorithms, since in this worldview, memory is assumed to be unlimited.

It is, however, a very tenuous assumption for software, since memory is not unlimited in any practical computing system. The way around this problem is to allocate a pool or heap of memory (usually of fixed size) to be used to accommodate requests for space generated during run time. To make this paradigm work well, it is desirable that concomitant with requests for the allocation of space, instructions for freeing or deallocating space be issued by the program and processed by the run-time support system. This is one way to reuse dynamically allocated memory that is no longer needed by the program. The burden of determining which memory locations are no longer needed lies with the programmer in this approach.
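
In a language such as C this explicit style looks roughly as follows (a minimal sketch; the list type is purely illustrative): it is the programmer who decides that the entire list is no longer needed and who hands each node back to the run-time system.

    #include <stdlib.h>

    typedef struct Node { int value; struct Node *next; } Node;

    /* Explicit deallocation: the programmer decides the whole list is dead
       and returns every node to the run-time support system. */
    void destroy_list(Node *head) {
        while (head != NULL) {
            Node *next = head->next;    /* remember the successor before freeing */
            free(head);                 /* explicit deallocation instruction */
            head = next;
        }
    }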

In another way of freeing space the run-time support system determines autonomously that certain memory locations can no longer be referenced by the program. If a memory location can no longer be referenced by a program, it is useless and can be recycled. If memory is needed, space considered available in this process can then be collected and allocated to satisfy requests. In this case the programmer is not burdened with the task of issuing explicit deallocation instructions. This appears to be a very attractive approach until one examines its implications more carefully. (In this sense, it is similar to VMM.)
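
The heart of such a scheme is a reachability test rather than any explicit freeing instruction. The following fragment is only a conceptual sketch in C (real collectors are considerably more elaborate, and the node type is an assumption of ours): it marks every node reachable from a given root, and anything left unmarked afterwards can no longer be referenced by the program and may be recycled.

    #include <stddef.h>

    typedef struct GCNode {
        int marked;                       /* set during the mark phase */
        struct GCNode *left, *right;      /* outgoing references */
    } GCNode;

    /* Mark phase: follow every pointer reachable from the root.
       Unmarked nodes can no longer be referenced and may be reclaimed. */
    void mark(GCNode *node) {
        if (node == NULL || node->marked) return;
        node->marked = 1;
        mark(node->left);
        mark(node->right);
    }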

Let us first consider allocation requests. Whenever a unit of memory is needed in a program, the program issues a request for memory. This may involve the creation of a new node of a tree or a stack, or it may mean allocating an array of a size that has just been read in as input. These two cases are different, simply because the new node tends to be a very small unit of memory while the dynamic array is most likely much larger. It turns out that the size of an allocation request is important, since it also implies the size of the memory that is to be freed when a concomitant deallocation instruction is issued. It is not so much the size, but the variation in size that has important implications for the gathering of freed-up space. Specifically, if all allocation requests have the same size (for example, in Lisp programs), it is relatively simple to devise highly efficient strategies for collecting unused space.6 If, however, the allocation requests are very different in size, it becomes much more difficult to allocate memory efficiently.7

For our purposes, it suffices to keep in mind that space can only be allocated to a process (program) if it is available. For the space to be available, it is necessary that the run-time support system determine the availability of the space (of the appropriate kind; in particular, space allocated for single entities should be contiguous8) and then assign this memory space to the process. Note that for algorithms, this process is unproblematic since there is an unlimited amount of space available. This unlimitedness of space makes it unnecessary to reuse space. Thus, it should be obvious that allocation requests in algorithms are negligible in their effect on the algorithm’s time complexity — they are always O(1).9

For programs, one generally assumes that space allocation also takes a constant amount of time, since it is assumed that the size of the allocation request is known at the time of issuance and the process of assigning memory consists of marking a chunk of the appropriate size as in use. Assuming a (contiguous) chunk of memory is known by its starting and end addresses, it is easy to test whether the size of a particular chunk is sufficient for satisfying the request. Strictly speaking, the time complexity may be greater than O(1) because of the question of finding an appropriate chunk (depending on the specific algorithm used for this purpose; numerous techniques are employed for choosing the most appropriate chunk when satisfying allocation requests of varying sizes, from best fit, to worst fit, to various buddy systems), and that would ordinarily depend on the number of chunks available (except in the case of uniform-sized requests, called cells, à la Lisp, where the run-time support system maintains a linked list of free cells and

6 Simply maintain a linked list of cells (allocation units of identical size) that are available. Initially, the list contains the entire space (all cells) available. Any request for a cell is satisfied by supplying the next available cell and removing it from the list. Any deallocation request simply places the freed cell at the end of the linked list of available cells. In this way, allocation and deallocation can be done in time O(1).

7 This is related to the fact that allocation requests are usually processed so that the entire space request is allocated contiguously in memory. If the pool of available memory is fragmented, it is possible that no contiguous chunk of memory of a required size is available, even though the sum of all free memory chunks far exceeds the request. In this case, it will be necessary to carry out a memory processing step in which memory fragments are collected and compacted so that a large contiguous chunk of memory is created. The complexity of this compaction process is of great concern for us.

8 In particular, for dynamic arrays, it must be contiguous, since otherwise none of the standard memory-mapping functions (which preserve the random access property) are applicable.

9 Note, however, that the initialization of an element need not be O(1); only the actual allocation of the memory takes O(1) time. More specifically, if a node in a linear list consists of a pointer and a 2D matrix of size n², the allocation of this space of size n² + O(1) can be done in time O(1), but any initialization of the matrix would take at least an additional O(n²) time.

 

allocates cells from this linked list upon request by a program instruction in constant time). Nevertheless, the time required to carry out an allocation is usually fairly negligible — provided there is a readily available chunk of memory to satisfy that request. This is precisely the problem.
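
For uniform-sized requests, the scheme of footnote 6 can be sketched in C as follows (the cell size and pool size are arbitrary assumptions, chosen only for illustration); both allocation and deallocation amount to a couple of pointer operations and therefore take time O(1).

    #include <stddef.h>

    #define CELLS 1024

    typedef union Cell { union Cell *next; char payload[32]; } Cell;

    static Cell pool[CELLS];
    static Cell *free_list = NULL;

    /* Initially the free list contains every cell of the pool. */
    void init_cells(void) {
        for (int i = 0; i < CELLS; i++) {
            pool[i].next = free_list;
            free_list = &pool[i];
        }
    }

    /* Allocation: hand out the first free cell, O(1). */
    Cell *alloc_cell(void) {
        if (free_list == NULL) return NULL;   /* pool exhausted */
        Cell *c = free_list;
        free_list = c->next;
        return c;
    }

    /* Deallocation: put the cell back on the free list, O(1). */
    void free_cell(Cell *c) {
        c->next = free_list;
        free_list = c;
    }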

Note that as described, the process of allocating a chunk of memory does not include any compaction operation; compaction would be required if there were no single chunk of memory large enough to accommodate the request, even though there are numerous smaller free chunks whose combined size exceeds the size of the request. Memory compaction techniques may take a significant amount of time. Worse, this amount of time may be incurred at completely unpredictable times. It is even possible that running the same program twice with substantially identical data sets may result in very different running times, strictly because of the time taken by memory compaction.

Memory compaction has to be done when memory fragmentation impedes the satisfaction of a request whose size does not exceed the available memory.

A memory compaction algorithm typically must examine the available chunks of memory, because a sequence of allocations and deallocations of memory requests of varying sizes will often result in memory fragmentation.

This means that even though we started out with one large chunk of (contiguous) memory, after allocating and subsequently freeing portions of memory, we may end up rather quickly with relatively small chunks of free memory. To illustrate this, assume we have memory M[1:1000] available and consider the following sequence of memory requests Ri(si) and deallocation requests Di, where Ri is request number i, which is of size si, and Di frees the memory allocated when processing request Ri:

R1(200), R2(400), R3(200), R4(200), D2, D4, R5(500).

The first four requests present no problem; we may allocate [1:200], [201:600], [601:800], and [801:1000], respectively, to these requests. After executing the two deallocation operations, only the locations [1:200] and [601:800] remain occupied; all other locations are free. However, it is impossible to satisfy the fifth request, because there is no contiguous chunk of size 500, even though there are altogether 600 free locations.
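
This behavior can be reproduced with a very small simulation (again only a sketch; the bitmap representation and the first-fit scan are our own assumptions, not a description of any particular run-time system). After R1 through R4 and the two deallocations, no scan of M[1:1000] finds 500 contiguous free locations, even though 600 locations are free in total.

    #include <stdio.h>
    #include <string.h>

    #define POOL 1000
    static char used[POOL + 1];           /* used[1..1000]; 1 = occupied */

    /* First fit: find the leftmost run of 'size' free locations and mark it used.
       Returns the starting index, or 0 if no sufficiently large contiguous run exists. */
    int alloc(int size) {
        int run = 0;
        for (int i = 1; i <= POOL; i++) {
            run = used[i] ? 0 : run + 1;
            if (run == size) {
                int start = i - size + 1;
                memset(&used[start], 1, (size_t)size);
                return start;
            }
        }
        return 0;
    }

    void release(int start, int size) { memset(&used[start], 0, (size_t)size); }

    int main(void) {
        int r1 = alloc(200), r2 = alloc(400), r3 = alloc(200), r4 = alloc(200);
        release(r2, 400);                 /* D2 frees [201:600] */
        release(r4, 200);                 /* D4 frees [801:1000] */
        int r5 = alloc(500);              /* R5(500) */
        printf("R5 start: %d\n", r5);     /* prints 0: infeasible without compaction */
        (void)r1; (void)r3;
        return 0;
    }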

Up until now, it should be evident that both allocation and freeing operations can be carried out efficiently.10 However, realizing that the request R5(500) cannot be satisfied, we have to compact the remaining chunks to

10 This requires the use of a suitable data structure that allows access to a specific, previously allocated chunk. Also, an algorithm must be employed for the determination of the most appropriate chunk from which to satisfy a request. For example, if the fifth request were not of size 500, but 100, we would have two possibilities: allocate out of the chunk [201:600] or allocate out of the chunk [801:1000]. If the best-fit strategy is employed, the smallest possible chunk would be selected, that is, the chunk [801:1000] in this case. If the worst-fit strategy is employed, the request would be satisfied out of the largest chunk, [201:600]. While worst fit seems to be counterintuitive (why use a larger chunk than necessary?), it turns out that best fit results in worse fragmentation since what is left after satisfying a request from the smallest possible chunk is a much smaller chunk (which is likely to be useless for all subsequent requests) than if one satisfies a request from the largest chunk (where the remnant is more likely to satisfy a subsequent request).

 

consolidate them into a large chunk. This requires shifting one or more chunks of memory to different locations. In our (very simple) example, we can either shift the memory for R1 to [801:1000] or we can shift the memory for R3 to [201:400]. Only after this operation is carried out can the fifth request R5(500) be satisfied. The problem is that this shifting operation takes an amount of time that is linear in the sum of the sizes of the chunks to be shifted; this can be proportional to the size of the memory pool. Moreover, it is not at all transparent to the programmer when such a compaction operation is invoked by the run-time support system. Thus, its occurrence appears to be entirely unpredictable, resulting in situations where a feasible request for an allocation of a certain size is carried out instantaneously and a later, equally feasible request of the same size seemingly halts execution of the program for an inexplicably long time. (We say a request is feasible if the amount of free memory exceeds the size of the request.) Understanding the role of memory compaction will at least help in understanding why this may happen.
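
A compaction step can be sketched as sliding every live chunk toward the low end of the pool and recording its new start address; the copying is what makes the cost linear in the total size of the chunks moved. The chunk table below is an assumed bookkeeping structure of ours, and a real system would additionally have to update every pointer into the moved data.

    #include <string.h>

    typedef struct { size_t start, size; int live; } Chunk;

    /* Slide all live chunks to the left end of the pool, leaving one
       contiguous free region on the right. Cost: O(total bytes moved). */
    size_t compact(char *pool, Chunk *chunks, int n) {
        size_t next = 0;                          /* next free position */
        for (int i = 0; i < n; i++) {
            if (!chunks[i].live) continue;
            if (chunks[i].start != next)
                memmove(pool + next, pool + chunks[i].start, chunks[i].size);
            chunks[i].start = next;               /* the chunk has a new address */
            next += chunks[i].size;
        }
        return next;                              /* everything from 'next' on is free */
    }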

We now indicate how substantially identical runs may result in different compaction behavior. Programs on modern computing systems do not execute in isolation; numerous other processes operate at the same time. Variations in the behavior of these processes may have subtle impacts on the availability of memory to our program, which in turn can cause major repercussions for its compaction algorithm. To illustrate this, assume in our example that the amount of memory was M[1:1100] instead of M[1:1000]. In this case the fifth request would not necessitate a compaction operation. However, if an external process causes the amount of available memory to shrink to M[1:1000], compaction must be carried out.11 Most programmers do not know what the size of the pool of memory is. In some systems, this is not a fixed quantity. Even if the pool size is fixed and cannot be affected by other processes, a permutation of requests for allocation and deallocation may cause significant differences in run time. Assume that the sequence of allocation and deallocation requests in our example was reordered as follows:

R1(200), R3(200), R2(400), R4(200), D2, D4, R5(500).

It follows that after the two deallocation operations, only the locations [1:400] are occupied; therefore, the request R5(500) can be satisfied without compaction. Even though both sequences contain the same allocation and deallocation requests, their permutation may cause differing compaction operations.
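
Using the hypothetical first-fit simulator sketched above, the reordered sequence can be checked in the same way; the fifth request now succeeds immediately:

    int r1 = alloc(200), r3 = alloc(200), r2 = alloc(400), r4 = alloc(200);
    release(r2, 400);                 /* D2 frees [401:800] */
    release(r4, 200);                 /* D4 frees [801:1000] */
    int r5 = alloc(500);              /* succeeds at 401; no compaction is needed */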

Note that allocation requests may occur explicitly or implicitly. For example, the expansion of the recursion stack during execution of a recursive function is always implicit, but the allocation of a node in a linear list or a binary tree is usually explicit. Deallocation requests can also occur explicitly or implicitly. In the case of the recursion stack, deallocation is always

11 Note that the program may at no time need more than 1000 units of memory. Therefore, the reduction in the memory pool from 1100 to 1000 units may appear entirely reasonable, but this ignores the interplay between allocation and deallocation requests.

 

implicit. However, for the node of a list or a tree, we can free the memory explicitly using an instruction provided for this purpose by the programming language, or the run-time support system may implicitly determine that a particular node can no longer be accessed in the program. For example, when deleting a node in a linear list, we may merely change pointers so that there is no longer any way to access that node. Run-time support systems have methods of determining that such a situation has occurred and that there is no other way of accessing that node (for example, through a pointer we explicitly kept pointing to it); in this case the memory assigned to that node is available and can therefore be freed.
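
In C, for instance, deleting the successor of a list node by relinking might look as follows (an illustrative sketch; the node type is ours). After the pointer update, nothing in the program refers to the removed node any longer, and it is precisely this unreachability that an implicit scheme detects; in C itself, of course, the node would have to be freed explicitly or it leaks.

    #include <stddef.h>

    typedef struct Node { int value; struct Node *next; } Node;

    /* Unlink the node following 'p'. Afterwards the removed node is
       unreachable from the list; a garbage collector could reclaim it. */
    void delete_after(Node *p) {
        if (p == NULL || p->next == NULL) return;
        Node *dead = p->next;
        p->next = dead->next;     /* the pointer change makes 'dead' unreachable */
        /* free(dead); would be the explicit counterpart in C */
    }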

Explicit deallocation tends to be carried out when the corresponding instruction is executed. Implicit deallocation usually occurs only when necessary, that is, when a request can no longer be satisfied using the currently available free space. In implicit deallocation, memory is free only after the deallocation has actually been carried out. Just because it is possible to determine that a certain space could be freed (and therefore reused) does not mean it is free. As long as no deallocation is done, that space is still considered occupied. In implicit deallocation, operations for freeing up memory are effectively batched together, and these several deallocations are then carried out as one step. Implicit deallocation tends to be more complicated and time-consuming, so it typically is carried out only when necessary. Since it is more complicated, its impact on execution time can be fairly dramatic when it occurs. Again, this event tends to be completely unpredictable from the programmer's point of view.

Some run-time support systems cleverly manage to combine the worst of both worlds: Although the programmer issues explicit deallocation instructions, the run-time support system only collects memory when needed.12 While this may be convenient since in this way only one type of deallocation algorithm must be carried out (note that in many programming languages, both implicit [think recursion stack] and explicit [think dynamic structures] deallocation is required), it does mislead the programmer into thinking that deallocation of space occurs whenever an explicit deallocation instruction is issued.
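
Such a scheme might look roughly like the following sketch (entirely hypothetical; the names and the fixed-size table of pending blocks are our own simplifications): an explicit release merely records the block, and the recorded blocks are reclaimed in a batch only when a later allocation request cannot otherwise be satisfied.

    #include <stdlib.h>

    #define PENDING_MAX 4096

    static void *pending[PENDING_MAX];     /* blocks the program has "freed" */
    static int n_pending = 0;

    /* Explicit deallocation instruction: the block is only recorded. */
    void lazy_free(void *p) {
        if (p != NULL && n_pending < PENDING_MAX)
            pending[n_pending++] = p;
    }

    /* Allocation: actual reclamation happens only when malloc fails. */
    void *lazy_alloc(size_t size) {
        void *p = malloc(size);
        if (p == NULL) {
            while (n_pending > 0)          /* batched deallocation, done as one step */
                free(pending[--n_pending]);
            p = malloc(size);              /* retry after the batch is reclaimed */
        }
        return p;
    }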

The upshot of this section is that the programmer should know what type of memory deallocation is done in a specific compiling and run-time support system; this may include various types of memory compaction. While this knowledge does not guarantee that no disappointments happen, at least these disappointments will no longer be inexplicable. In many cases, knowing the enemy makes it easier to defeat it, although in the end the programmer is still at the mercy of systems not under her control.

12 There have even been compilers that ignore deallocation instructions altogether; in effect, they acted as if the memory model were that of algorithms — no limits on available memory. Thus, even though the programming language provides instructions for freeing up memory and a program may execute such instructions, the compiler acts as if no such instructions exist. This approach can work, either because enough memory exists for a specific program (especially if the programs targeted by this compiler are “toy” programs, with unrealistically low memory requirements) or when coupled with VMM. However, for many applications the run-time behavior may become even more inexplicable than when explicit deallocations are carried out.