With C in TPF - Two Dimensional Discontiguous Arrays
by Dan Evans
Two dimensional arrays should have a special place in the hearts of TPF programmers working with C, because they do not have to be contiguous blocks of storage. The expression object[i][j] in a C program may refer to a conventional block of storage organized into a two-dimensional array of rows and columns, or it may refer to a less conventional array of pointers to one-dimensional arrays which are in discontiguous storage. In practical terms, this means that a TPF C programmer may work with a 1000 by 1000 two-dimensional array of integers directly without concern that the rows of the two-dimensional array are stored in arbitrary discontiguous memory locations. A one million integer object is large enough for most computations, but make it a two-dimensional array of pointers and the capabilities are unlimited. TPF's 4K limit on memory is still felt in the maximum size of each row, which is why I limited the size of a dimension to 1000 integers or 4000 bytes.
In previous columns, I have discussed the equivalence of array references and pointer arithmetic in C. This is the basis for understanding two-dimensional array references, so let's quickly review. There are three rules:
Let's apply these rules to the analysis of a two-dimensional array reference, using the declaration int xyz[50][95];. This declares xyz as an array of 50 arrays of 95 integers. In other words, a two-dimensional array is an array of one-dimensional arrays. In this case, the one-dimensional arrays are each an array of 95 integers. The two-dimensional array puts together 50 of these one-dimensional arrays. In the array reference xyz[i][j], the first subscript, i, selects one of the 50 one-dimensional arrays, and the second subscript, j, selects an integer from that array.
A C compiler decomposes xyz[i][j] in several steps:
The two-dimensional array reference is reduced by the compiler to a series of pointer additions and dereferences. There is still a little more to this than meets the eye, so let's analyze the resulting expression keeping in mind rules 1 and 2. In order to emphasize the generality of this process, I'll use the word AnyType to refer to arbitrary types. Working from inside out, the expression is:
If you think that a 50 by 95 two-dimensional array of integers would take 19000 contiguous bytes of storage and would therefore not be possible in TPF, you would be wrong. Now that we thoroughly understand two-dimensional references, let's look at the same expression xyz[i][j] with the declaration int **xyz;. The compiler again decomposes the expression to *((*(xyz + i)) + j), but the change in the declaration produces differences in the evaluation.
Here is the critical difference. When xyz is declared as a two-dimensional array, the name xyz refers to the location where the compiler has placed the array, whether it is allocated at compile time or it is allocated on the stack as a local variable at run time. When xyz is declared as an indirect pointer, it must be set to the address of an array of pointers, and each element of this array must be set to the address of an array of the indirect type, in our case, an array of integers. Here's how you create a two-dimensional array of the second form:
#define ROWS 50
#define COLUMNS 95
int i, j;
int **xyz;
xyz = (int **)malloc((sizeof *xyz) * ROWS);
for (i = 0; i < ROWS; ++i)
{
xyz[i] = (int *)malloc((sizeof **xyz) * COLUMNS);
for (j = 0; j < COLUMNS; ++j)
xyz[i][j] = 0;
}
In the above code, I have left out the obvious efficiencies in order to emphasize the one and two dimensional array references. If you count the number of separate request for storage, calls to malloc(), you will find there are 51 of them, each no larger than 380 bytes. In other words, this two-dimensional array object is built from 51 small discontiguous pieces suitable for allocation in TPF. The creation code makes it clear that references to the integer elements of the two-dimensional array hide the discontiguous nature of the storage and simplify the programmer's view. This is the perfect marriage of C and TPF.