Data Organization - Multi-Dimensional Arrays

Sets and Maps

3.3 Multi-Dimensional Arrays

3.3.2 Data Organization

Most computer architectures provide a mechanism at the hardware level for creating and using one-dimensional arrays. Programming languages need only provide appropriate syntax to make use of a 1-D array. Multi-dimensional arrays are not handled at the hardware level. Instead, the programming language typically provides its own mechanism for creating and managing multi-dimensional arrays.

As we saw earlier, a one-dimensional array is composed of a group of sequential elements stored in successive memory locations. The index used to reference a particular element is simply the offset from the first element in the array. In most programming languages, a multi-dimensional array is actually created and stored in memory as a one-dimensional array. With this organization, a multi-dimensional array is simply an abstract view of a physical one-dimensional data structure.

Array Storage

A one-dimensional array is commonly used to physically store arrays of higher dimensions. Consider a two-dimensional array divided into a table of rows and columns as illustrated in Figure 3.6. How can the individual elements of the table be stored in the one-dimensional structure while maintaining direct access to the individual table elements? There are two common approaches. The elements can be stored in row-major order or column-major order. Most high-level programming languages use row-major order, with FORTRAN being one of the few languages that uses column-major ordering to store and manage 2-D arrays.

         0    1    2    3    4
    0    2   15   45   13   78
    1   40   12   52   91   86
    2   59   25   33   41    6

Figure 3.6: The abstract view of a sample 3 × 5 two-dimensional array.

In row-major order, the individual rows are stored sequentially, one at a time, as illustrated in Figure 3.7. The first row of 5 elements is stored in the first 5 sequential elements of the 1-D array, the second row of 5 elements is stored in the next 5 sequential elements, and so forth.

In column-major order, the 2-D array is stored sequentially, one entire column at a time, as illustrated in Figure 3.8. The first column of 3 elements is stored in the first 3 sequential elements of the 1-D array, followed by the 3 elements of the second column, and so on.
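The two orderings can be compared with a short Python sketch. The table values are taken from the sample 3 × 5 array and should be treated as illustrative; the function names `to_row_major` and `to_column_major` are hypothetical helpers, not part of any library.

```python
# Sample 3 x 5 table; the values are assumed from the sample array in Figure 3.6.
table = [
    [2, 15, 45, 13, 78],
    [40, 12, 52, 91, 86],
    [59, 25, 33, 41, 6],
]


def to_row_major(grid):
    """Copy a 2-D list into a 1-D list one complete row at a time."""
    return [value for row in grid for value in row]


def to_column_major(grid):
    """Copy a 2-D list into a 1-D list one complete column at a time."""
    num_rows, num_cols = len(grid), len(grid[0])
    return [grid[r][c] for c in range(num_cols) for r in range(num_rows)]


row_major = to_row_major(table)
column_major = to_column_major(table)

print(row_major[:5])      # the 5 elements of row 0
print(column_major[:3])   # the 3 elements of column 0
```

Both lists hold the same 15 values; only the order in which the table is traversed differs.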

For larger dimensions, a similar approach can be used. With a three-dimensional array, the individual tables can be stored contiguously using either row-major or column-major ordering. As the number of dimensions grows, all elements within a single instance of each dimension are stored contiguously before the next instance. For example, given a four-dimensional array, which can be thought of as an array of boxes, all elements of an individual box (3-D array) are stored before the next box.
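As a rough illustration of the same idea in three dimensions, the sketch below flattens a nested Python list in row-major fashion. The 2 × 2 × 3 `cube` and the helper `flatten_row_major` are made up for this example; the element values are chosen so that each value equals its position in the resulting 1-D list.

```python
def flatten_row_major(nested):
    """Recursively flatten nested lists so each sub-array is stored contiguously."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten_row_major(item))
    return flat


# A hypothetical 2 x 2 x 3 array: two tables, each with two rows and three columns.
cube = [
    [[0, 1, 2], [3, 4, 5]],      # table 0
    [[6, 7, 8], [9, 10, 11]],    # table 1
]

flat_cube = flatten_row_major(cube)
print(flat_cube[:6])   # every element of table 0 comes before any element of table 1
print(flat_cube[6:])   # the elements of table 1
```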


Figure 3.7: Physical storage of a sample 2-D array (top) in a 1-D array using row-major order (bottom).

Index Computation

Since multi-dimensional arrays are created and managed by instructions in the programming language, accessing an individual element must also be handled by the language. When an individual element of a 2-D array is accessed, the compiler must include additional instructions to calculate the offset of the specific element within the 1-D array. Given a 2-D array of size m × n and using row-major ordering, an equation can be derived to compute this offset.

To derive the formula, consider the 2-D array illustrated in Figure 3.7 and observe the physical storage location within the 1-D array for the first element in several of the rows. Element (0,0) maps to position 0 since it is the first element in both the abstract 2-D and physical 1-D arrays. The first entry of the second row (1,0) maps to position n since it follows the first n elements of the first row. Likewise, element (2,0) maps to position 2n since it follows the first 2n elements in the first two rows. We could continue in the same fashion through all of the rows, but you would soon notice the position for the first element of the ith row is n ∗ i. Since the subscripts start from zero, the ith subscript not only represents a specific row but also indicates the number of complete rows skipped to reach the ith row.

Figure 3.8: Physical storage of a sample 2-D array (top) in a 1-D array using column-major order (bottom).

Knowing the position of the first element of each row, the position for any element within a 2-D array can be determined. Given an element (i, j) of a 2-D array, the storage location of that element in the 1-D array is computed as

$$\mathrm{index}_2(i, j) = i * n + j \tag{3.1}$$

The column index, j, is not only the offset within the given row but also the number of elements that must be skipped in the ith row to reach the jth column.
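A minimal Python sketch of this computation follows. It assumes a row-major 1-D list named `storage` holding the sample 3 × 5 array; `index2` and `get_element` are hypothetical names chosen for this illustration.

```python
NUM_ROWS, NUM_COLS = 3, 5   # m and n for the sample array

# Row-major storage of the sample table (values assumed from the sample array).
storage = [2, 15, 45, 13, 78, 40, 12, 52, 91, 86, 59, 25, 33, 41, 6]


def index2(i, j, n):
    """Row-major offset of element (i, j) in a 2-D array with n columns."""
    return i * n + j


def get_element(i, j):
    """Return element (i, j) of the abstract 2-D array."""
    if not (0 <= i < NUM_ROWS and 0 <= j < NUM_COLS):
        raise IndexError("subscript out of range")
    return storage[index2(i, j, NUM_COLS)]


print(index2(1, 0, NUM_COLS))   # first element of row 1 sits at offset n
print(get_element(1, 2))        # row 1, column 2 of the sample table
```

Note that the number of rows, m, never appears in the offset computation; it is needed only for range checking.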

To see this formula in action, again consider the 2-D array from Figure 3.7 and assume we want to access element (2,3). Finding the target element within the 1-D array requires skipping over the first 2 complete rows of elements (skipped elements are shown in brackets):

    [2 15 45 13 78] [40 12 52 91 86] 59 25 33 41 6

and the first 3 elements within row 2:

    [59 25 33] 41 6

Plugging the indices into the equation from above, with n = 5, gives 2 ∗ 5 + 3 = 13, which corresponds to the position of element (2,3) within the 1-D array used to physically store the 2-D array.

Similar equations can be derived for arrays of higher dimensions. Given a 3-D array of size d₁ × d₂ × d₃, the 1-D array offset of element (i₁, i₂, i₃) stored using row-major order will be

$$\mathrm{index}_3(i_1, i_2, i_3) = i_1 * (d_2 * d_3) + i_2 * d_3 + i_3 \tag{3.2}$$

For each component (i) in the subscript, the equation computes the number of elements that must be skipped within the corresponding dimension. For example, the factor (d₂ ∗ d₃) indicates the number of elements in a single table of the cube. When it's multiplied by i₁ we get the number of complete tables to skip and in turn the number of elements to skip in order to arrive at the first element of table i₁.

The remaining part of the equation (i₂ ∗ d₃ + i₃) is equivalent to index₂(i₂, i₃) with n = d₃, which indicates the number of elements to skip within the i₁ table. As the number of dimensions increases, additional products are added to the equation, one for each new dimension. For example, the equation to compute the offset for a 4-D array is

$$\mathrm{index}_4(i_1, i_2, i_3, i_4) = i_1 * (d_2 * d_3 * d_4) + i_2 * (d_3 * d_4) + i_3 * d_4 + i_4 \tag{3.3}$$

You may notice a pattern developing as the number of dimensions increases.
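The 3-D and 4-D equations can be written side by side as a sketch in Python. The function names `index3` and `index4` simply mirror the equation names and are not part of any library; the dimension sizes used in the check are arbitrary. The size of the first dimension never appears in either formula.

```python
def index3(i1, i2, i3, d2, d3):
    """Row-major offset for a 3-D array (equation 3.2)."""
    return i1 * (d2 * d3) + i2 * d3 + i3


def index4(i1, i2, i3, i4, d2, d3, d4):
    """Row-major offset for a 4-D array (equation 3.3)."""
    return i1 * (d2 * d3 * d4) + i2 * (d3 * d4) + i3 * d4 + i4


# Sanity check on an arbitrary 2 x 3 x 4 array: visiting the subscripts in
# row-major order should produce the offsets 0, 1, 2, ... with no gaps.
d1, d2, d3 = 2, 3, 4
offsets3 = [
    index3(a, b, c, d2, d3)
    for a in range(d1)
    for b in range(d2)
    for c in range(d3)
]
print(offsets3 == list(range(d1 * d2 * d3)))
```

Each added dimension contributes one more product term, which is the pattern the text goes on to generalize.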

This pattern leads to a general equation for computing the 1-D array offset for element (i₁, i₂, . . . , i_n) within an n-dimensional array:

$$\mathrm{index}(i_1, i_2, \ldots, i_n) = i_1 * f_1 + i_2 * f_2 + \cdots + i_{n-1} * f_{n-1} + i_n * 1 \tag{3.4}$$

where the f_j values are the factors representing the number of elements to be skipped within the corresponding dimension and are computed using

$$f_n = 1 \quad \text{and} \quad f_j = \prod_{k=j+1}^{n} d_k \quad \forall\, 0 < j < n \tag{3.5}$$
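One way to express the general equation in Python is sketched below. The names `compute_factors` and `compute_index` are hypothetical. Because Python lists are zero-based, `factors[j - 1]` holds the factor f_j. The loop relies on the observation that each factor is the next dimension size times the next factor, which follows directly from the product in equation 3.5.

```python
def compute_factors(dims):
    """Return [f_1, ..., f_n] for the dimension sizes [d_1, ..., d_n]."""
    factors = [1] * len(dims)            # f_n = 1
    # Work right to left: f_j = d_(j+1) * f_(j+1).
    for j in range(len(dims) - 2, -1, -1):
        factors[j] = dims[j + 1] * factors[j + 1]
    return factors


def compute_index(indices, dims):
    """Row-major offset of the element with the given subscripts."""
    if len(indices) != len(dims):
        raise IndexError("wrong number of subscripts")
    factors = compute_factors(dims)
    return sum(i * f for i, f in zip(indices, factors))


print(compute_factors([3, 5]))         # factors for the 2-D sample array
print(compute_factors([2, 3, 4, 5]))   # factors for an arbitrary 4-D array
```

For the 2-D case the factors reduce to n and 1, so the general equation agrees with equation 3.1.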

The size of a multi-dimensional array is fixed at the time it’s created and cannot change during execution. Likewise, the several f_j products used in the equation above will not change once the size of the array is set. This can be used to our advantage to reduce the number of multiplications required to compute the element offsets. Instead of computing the products every time an element is accessed, we can compute and store the factor values and simply plug them into the equation when needed.
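The precomputation described above might look like the following sketch. `FlatArray` is a hypothetical class written for this illustration, not the implementation used by any particular language or library. The factors are computed once in the constructor and reused on every access.

```python
class FlatArray:
    """Hypothetical n-D array view over a 1-D Python list (illustration only)."""

    def __init__(self, *dims):
        if not dims or any(d <= 0 for d in dims):
            raise ValueError("every dimension must be a positive integer")
        self._dims = dims
        # Compute the factors once; the size never changes after creation.
        factors = [1] * len(dims)
        for j in range(len(dims) - 2, -1, -1):
            factors[j] = dims[j + 1] * factors[j + 1]
        self._factors = tuple(factors)
        # Total element count is d_1 * f_1.
        self._elements = [None] * (dims[0] * factors[0])

    def _offset(self, indices):
        if not isinstance(indices, tuple):
            indices = (indices,)         # allow arr[i] for a 1-D array
        if len(indices) != len(self._dims):
            raise IndexError("wrong number of subscripts")
        offset = 0
        for i, d, f in zip(indices, self._dims, self._factors):
            if not 0 <= i < d:
                raise IndexError("subscript out of range")
            offset += i * f              # one multiplication per dimension
        return offset

    def __getitem__(self, indices):
        return self._elements[self._offset(indices)]

    def __setitem__(self, indices, value):
        self._elements[self._offset(indices)] = value


grid = FlatArray(3, 5)
grid[2, 3] = 41
print(grid[2, 3])
```

With the factors stored, each access costs one multiplication and one addition per dimension instead of recomputing the products of dimension sizes every time.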
