Computational Model - Managing Scheduled Routing With A High-Level Communications Language

With the language presented in outline, this section takes a brief look at the computational model presumed by COP and describes how the COP compiler interacts with an HLL compiler.

As illustrated in Figure 2-2, the front end invokes the COP compiler and passes in the communication structure in the form of COP code. The compiler generates a set of functions for each operator that can be used on each node of the mesh to perform communications. These functions are processor code, suitable for integration into the HLL compiler’s internal representation. The actual router schedule code for scheduled routers is embedded in the generated processor code, since it is the processor that downloads router code at the appropriate moments.

The HLL compiler then generates object code suitable to run on the processors of the mesh (either as a single program or as multiple node programs, depending on the HLL compiler).

[Figure 2-2 diagram. Recoverable labels: HLL, HLL compiler, COP, COP compiler, per-operator load/communicate code, application object code.]

Figure 2-2: A high-level view of code generation

The current COP compiler assumes that C is the internal representation, and the returned processor code contains suitable C functions. Accordingly, the C compiler is used to perform the integration step between the COP output and the application-specific code.

Each operator’s first argument, as previously mentioned, is a label. These labels define a flat namespace that the COP compiler uses to name the returned functions: thus, the (broadcast 'b 0) operator would cause functions named b_write() and b_read() to be returned by the compiler (among others). These returned functions are integrated into the HLL compiler’s internal representation at the points where communication is performed.

2.7.1 The I/O Functions

For operators with a directional communications flow, such as broadcast or stream, the compiler generates write and read I/O functions. The writer node (e.g., the broadcast source) calls the write function with the value to broadcast, while the other nodes in the subset call the read function. The other class of operators is called functional: here every node both provides and receives a value. Examples include reduce, prefix, and cshift (circular shift).

For these operator types, the COP compiler generates a single func function, which on all nodes takes an argument and returns a value.
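The difference between the two operator classes can be sketched with a small single-process simulation in C. This is only an illustration: the names sim_broadcast, sim_reduce_sum, sim_cshift, and SIM_NODES are hypothetical, and the choices of addition as the reduction, delivery of the reduced value to every node, and a shift by one position toward higher indices are assumptions made for the example, not behaviour stated by COP.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_NODES 4  /* hypothetical subset size for this simulation */

/* Directional operator: the source node writes, every other node reads.
 * out[n] models the value node n would get back from its read function. */
void sim_broadcast(int src, int32_t value, int32_t out[SIM_NODES]) {
    assert(src >= 0 && src < SIM_NODES);
    for (int n = 0; n < SIM_NODES; n++) {
        if (n != src) {
            out[n] = value;  /* only readers receive a value */
        }
    }
}

/* Functional operator: every node passes in[n] and receives out[n].
 * Assumption: the reduction is a sum and all nodes see the result. */
void sim_reduce_sum(const int32_t in[SIM_NODES], int32_t out[SIM_NODES]) {
    int32_t total = 0;
    for (int n = 0; n < SIM_NODES; n++) {
        total += in[n];
    }
    for (int n = 0; n < SIM_NODES; n++) {
        out[n] = total;
    }
}

/* Functional operator: circular shift.
 * Assumption: each value moves to the next-higher node index. */
void sim_cshift(const int32_t in[SIM_NODES], int32_t out[SIM_NODES]) {
    for (int n = 0; n < SIM_NODES; n++) {
        out[(n + 1) % SIM_NODES] = in[n];
    }
}
```

In the directional case the source's slot is left untouched, which mirrors the split into a write function and a read function; in the functional cases every slot is both consumed and produced, which is why a single function per node suffices.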

Since COP’s output is a typed language (namely C), operators are currently required to specify the type of the input and output. By default, operators are assumed to pass and return one-word integer arrays. The :type optional argument for operators can specify different types; the basic types are int or float, with a suffix of ‘32’ or ‘64’ to determine word size, and an optional suffix of ‘*’ to indicate an array type. For array types, the :m argument is used to specify the message length, if more than a single value is to be passed at once.
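As a rough illustration of the type naming scheme described above, the following C sketch parses a type name into its base type, word size, and array flag. It assumes the name is written as a single token such as int32* or float64 and that the size suffix is always present; neither detail is confirmed by the text. The names cop_type, cop_parse_type, and cop_c_element_type are hypothetical, as is the mapping onto C element types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoded form of a :type value. */
typedef struct {
    bool is_float;  /* false for int, true for float */
    int  bits;      /* word size: 32 or 64 */
    bool is_array;  /* true when the name ends in '*' */
} cop_type;

/* Parse names such as "int32", "float64" or "int32*".
 * Returns true and fills *out on success, false on a malformed name. */
bool cop_parse_type(const char *spec, cop_type *out) {
    assert(spec != NULL && out != NULL);
    const char *p = spec;

    if (strncmp(p, "int", 3) == 0) {
        out->is_float = false;
        p += 3;
    } else if (strncmp(p, "float", 5) == 0) {
        out->is_float = true;
        p += 5;
    } else {
        return false;
    }

    if (strncmp(p, "32", 2) == 0) {
        out->bits = 32;
    } else if (strncmp(p, "64", 2) == 0) {
        out->bits = 64;
    } else {
        return false;
    }
    p += 2;

    out->is_array = (*p == '*');
    if (out->is_array) {
        p++;
    }
    return *p == '\0';  /* reject trailing characters */
}

/* Assumed mapping from a parsed type to a C element type name. */
const char *cop_c_element_type(const cop_type *t) {
    if (t->is_float) {
        return t->bits == 32 ? "float" : "double";
    }
    return t->bits == 32 ? "int32_t" : "int64_t";
}
```

Under these assumptions, int32* would describe the default case of an integer array, with the message length supplied separately through :m.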

2.7.2 The Load Function

An additional function is associated with the compiled output for all operators. This is the load function, which coordinates the use of the router for that operator. It serves a number of purposes:

It loads any necessary router information before beginning to use the operator.

Its arguments are the values for any (runtime) operands, and it is responsible for updating the router when those values change.

For the first operator after a phase change, it coordinates any necessary overhead in changing from the previous phase (such as barriers or required delays).

The I/O functions could be overloaded to handle all of this, by adding additional arguments for run-time operands, then testing when the values change, or when the operator is used for the first time, and updating the router as necessary. However, this seemed inelegant and inefficient, since the HLL compiler knows where to insert the code to perform these tasks before calling any I/O functions.

For a sequential operator, the load function must be placed before the operator’s first use of its I/O function(s), and after the last use of the previous operator’s I/O function(s). Where sequential operators are placed into the same phase, only the first sequential operator will load the router code for that phase. For parallel operators, the load functions for the operators must all be called before the HLL compiler calls any of the I/O functions for the operators.
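The placement rules above can be illustrated with a small C simulation. It is a sketch under several assumptions: the router is modelled as a record of which phase and operand it currently holds, the decision to skip a download is made at run time (a real COP compiler might instead resolve it at compile time), and all names (sim_ensure_loaded, a_load, b_load, c_load, sim_download_count) are hypothetical.

```c
#include <assert.h>

/* Simulated router: remembers which phase and operand it was last given. */
typedef struct {
    int phase;
    int operand;
} sim_router_state;

static sim_router_state sim_router = { -1, 0 };  /* -1: nothing loaded yet */
static int sim_downloads = 0;                    /* router downloads so far */

/* Download router code only if the router does not already hold it. */
static void sim_ensure_loaded(int phase, int operand) {
    assert(phase >= 0);
    if (sim_router.phase != phase || sim_router.operand != operand) {
        sim_router.phase = phase;
        sim_router.operand = operand;
        sim_downloads++;
    }
}

/* a and b: sequential operators placed in the same phase (phase 0),
 * with no run-time operands. Only the first load triggers a download. */
void a_load(void) { sim_ensure_loaded(0, 0); }
void b_load(void) { sim_ensure_loaded(0, 0); }

/* c: an operator in phase 1 with one run-time operand, for example a
 * source node chosen at run time. A changed operand updates the router. */
void c_load(int src) { sim_ensure_loaded(1, src); }

int sim_download_count(void) { return sim_downloads; }
```

Calling a_load() and then b_load() costs one download because both operators share a phase, while c_load(2) followed by c_load(3) costs two because the operand changed between the calls.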

As mentioned above, loop constructs also have a label. This label is used by the compiler to generate a load function for the loop; such a load function may contain router reprogramming commands that can be hoisted out of the loop, and thus executed only once per entry into the loop. No I/O functions are associated with loop labels.
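A minimal C sketch of this hoisting follows, assuming a loop labelled l that contains a single broadcast labelled b. The router is simulated by a counter, and the names l_load, run_loop, and sim_reprograms are hypothetical; the sketch also assumes that everything b's own load function needed could be hoisted, which need not hold in general.

```c
#include <assert.h>
#include <stdint.h>

static int sim_reprogram_count = 0;  /* simulated reprogramming commands */
static int32_t sim_last_sent = 0;    /* last value handed to the router */

/* Load function generated for the loop label: holds the hoisted
 * router reprogramming, so it runs once per entry into the loop. */
void l_load(void) { sim_reprogram_count++; }

/* Load function for the operator inside the loop. In this sketch all
 * of its work was hoisted into l_load, so nothing remains to do. */
void b_load(void) { }

/* Simulated write function for the broadcast. */
void b_write(int32_t val) { sim_last_sent = val; }

/* What an HLL compiler might emit for a loop that sends n values. */
void run_loop(int32_t n) {
    assert(n >= 0);
    l_load();                        /* before the loop, not inside it */
    for (int32_t i = 0; i < n; i++) {
        b_load();
        b_write(i);
    }
}

int sim_reprograms(void) { return sim_reprogram_count; }
```

However many iterations the loop runs, the reprogramming counter advances by one per call to run_loop, which is the saving the loop label makes possible.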

2.7.3 COP/HLL Integration Example

Let us consider the simple broadcast example first mentioned in 2.2, (broadcast 'b 0).

Figure 2-3 shows a tiny but complete application; Lisp is used for the examples to simplify the presentation. The first part of the figure contains the one-line COP program for the application.

The second part shows the application code for the master broadcasting node (node zero), and the third part shows the code for the slave (‘worker’) nodes in the mesh. Node zero is passed a list of ‘work’ of some sort and broadcasts it to the workers, which perform appropriate work based on, e.g., their coordinates in the mesh. The worker-function is the function that does the work, and it is not included in the example; it could be any interesting function (such as, say, computing a portion of a Mandelbrot set).

COP code:

(broadcast 'b 0)

Application code for the master node:

(define (main input)
  (b-load)
  (foreach i input
    (b-write i)))

Application code for the slave nodes:

(define (main)
  (b-load)
  (while t
    (worker-function (b-read))))

Figure 2-3: Simple application code

In Figure 2-3, the HLL compiler is assumed to have generated a COP program with an arbitrary label (‘b’), then used that label to generate the appropriate calls to the I/O and load functions (b-load, b-write, and b-read).

If a scheduled router is the target, the COP compiler will take the COP code and return functions similar to those in Figure 2-4. The load function downloads appropriate data (e.g., VFSM configuration and a schedule) to the router using the hypothetical cop-reprogram-router function, then starts the newly-loaded phase running; *new-phase* represents the schedule index of the phase whose information has just been downloaded. The b-write and b-read functions check that they are running on a legal node, then write to or read from the processor/router interface (interface address seven is used in the example). *node* is the subset index of each node in the set.

(define (b-load)
  (cop-reprogram-router (nth '((#<router code>) ...) *node*))
  (cop-start-phase *new-phase*))

(define (b-write val)
  (if (= *node* 0)
      (cop-write 7 val)))

(define (b-read)
  (if (!= *node* 0)
      (cop-read 7)))

Figure 2-4: COP output for simple application
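Since the current COP compiler emits C rather than Lisp, the functions of Figure 2-4 might look roughly like the following. This is a sketch, not the compiler's actual output: the processor/router interface is simulated by an array so that the code can run on a single machine, the C spellings cop_reprogram_router, cop_start_phase, cop_write, and cop_read simply mirror the hypothetical names in the figure, and sim_set_node, the zero returned by b_read on node zero, and the single-word int32_t argument are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define COP_IFACE_ADDR  7  /* interface address used in Figure 2-4 */
#define SIM_IFACE_COUNT 8  /* size of the simulated interface */

static int32_t sim_iface[SIM_IFACE_COUNT];  /* simulated interface words */
static int sim_phase_running = -1;          /* -1: no phase started */
static int cop_node = 0;                    /* *node* in the figure */
static const int cop_new_phase = 0;         /* *new-phase* in the figure */

/* Simulation helper: choose which node the code pretends to run on. */
void sim_set_node(int node) { cop_node = node; }

/* Stand-in: a real version would download the VFSM configuration
 * and schedule for this node to its router. */
static void cop_reprogram_router(int node) { (void)node; }

static void cop_start_phase(int phase) { sim_phase_running = phase; }

static void cop_write(int addr, int32_t val) {
    assert(addr >= 0 && addr < SIM_IFACE_COUNT);
    sim_iface[addr] = val;
}

static int32_t cop_read(int addr) {
    assert(addr >= 0 && addr < SIM_IFACE_COUNT);
    return sim_iface[addr];
}

void b_load(void) {
    cop_reprogram_router(cop_node);
    cop_start_phase(cop_new_phase);
}

/* Only the broadcast source (node zero) may write. */
void b_write(int32_t val) {
    if (cop_node == 0) {
        cop_write(COP_IFACE_ADDR, val);
    }
}

/* Only non-source nodes may read; node zero gets 0 in this sketch. */
int32_t b_read(void) {
    return (cop_node != 0) ? cop_read(COP_IFACE_ADDR) : 0;
}
```

In the simulation a value written by node zero is read back from the same array after sim_set_node switches to another node; on a real mesh the scheduled routers would carry the word between distinct interfaces.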

The HLL compiler takes the output code returned in Figure 2-4 and integrates it with the application code that it has generated for carrying out the computation. Figure 2-5 shows the final, integrated code for the master node (node zero). The code has been specialized for node zero, so references to *node* have disappeared. The HLL compiler will convert this into object code (including the router code); the object code will then be downloaded to node zero at run time, while the matching code for each slave node (not shown) is similarly downloaded.

Final application code for the master node:

(define (main input)
  (cop-reprogram-router '(#<node 0 router code>))
  (cop-start-phase *new-phase*)
  (foreach i input
    (cop-write 7 i)))

Figure 2-5: Final HLL code for node zero
