
6.2 Searching the Implementation Space

6.2.2 Structuring the Phase Tree

Nodes are added to the phase tree as the compiler traverses through the COP program’s operator tree. These represent both chosen phase breakpoints (for children of sequential phase groups) and MIMD operations (for children of disjoint phase groups).

Sequential and Parallel Groups

As the compiler walks the tree, if it traverses from one child of a sequential group to the next child, it may choose to try closing the current active phase. It does this by placing both the closing and non-closing alternatives into the priority queue as separate search nodes. Closing a phase is itself discussed further below.
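The fork at a sequential boundary can be sketched as follows. This is only an illustration under stated assumptions: the `SearchNode` fields, the `fork_at_sequential_boundary` helper, and the flat `close_cost` penalty are hypothetical names and values introduced here, and the compiler's actual cost function is not described in this section.

```python
import heapq
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchNode:
    """Hypothetical search state: phases closed so far plus the open (active) phase."""
    closed_phases: tuple = ()
    active_ops: tuple = ()
    cost: float = 0.0  # placeholder priority; the real cost model is an assumption


_tie = itertools.count()  # tie-breaker so the heap never compares SearchNode objects


def push(queue, node):
    heapq.heappush(queue, (node.cost, next(_tie), node))


def fork_at_sequential_boundary(queue, node, close_cost=1.0):
    """Enqueue both alternatives when moving to the next sequential child."""
    # Alternative 1: leave the active phase open and keep adding operators to it.
    push(queue, node)
    # Alternative 2: close the active phase and start an empty one.
    closed = SearchNode(
        closed_phases=node.closed_phases + (node.active_ops,),
        active_ops=(),
        cost=node.cost + close_cost,
    )
    push(queue, closed)


queue = []
fork_at_sequential_boundary(queue, SearchNode(active_ops=("A",)))
first = heapq.heappop(queue)[2]   # the non-closing alternative (lower placeholder cost)
second = heapq.heappop(queue)[2]  # the closing alternative
```

Both alternatives remain in the queue as independent search nodes, so the search can return to the more expensive one later if the cheaper branch turns out badly.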

For a parallel group, the current COP definition requires that only the last child may have sequential descendants (as discussed in Section 6.1.4). When the compiler comes to the last child, it pushes the current active phase’s operator list to the top of the parallel context stack in the search node. Since there are guaranteed to be no phase closures in the other children, there is only one unclosed phase in the tree, and it is currently active. The current parallel context is then used in the group’s descendants to initialize each newly opened active phase. When the compiler finishes traversing into the last child, it pops the parallel context item off the search node’s context stack and continues.
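A minimal sketch of the context-stack handling, assuming a hypothetical program `(parallel P Q (sequential X Y Z))` and a search node that happens to close after every sequential child but the last. The class and method names are illustrative, not the compiler's own.

```python
from dataclasses import dataclass, field


@dataclass
class SearchNode:
    """Hypothetical search node holding the active phase and the parallel context stack."""
    active_ops: list = field(default_factory=list)
    context_stack: list = field(default_factory=list)
    closed_phases: list = field(default_factory=list)

    def open_phase(self):
        # A newly opened phase is initialized from the current parallel context.
        seed = self.context_stack[-1] if self.context_stack else []
        self.active_ops = list(seed)

    def close_phase(self):
        self.closed_phases.append(list(self.active_ops))
        self.open_phase()


def visit_parallel(node, leading_ops, last_child_ops):
    """leading_ops: operators of every child but the last (no closures occur there).
    last_child_ops: sequential operators of the last child."""
    node.active_ops.extend(leading_ops)
    # Entering the last child: push the active phase's operator list.
    node.context_stack.append(list(node.active_ops))
    for i, op in enumerate(last_child_ops):
        node.active_ops.append(op)
        if i < len(last_child_ops) - 1:
            node.close_phase()  # one arbitrary choice among the close/noclose forks
    # Leaving the last child: pop the context item and continue.
    node.context_stack.pop()


node = SearchNode()
visit_parallel(node, ["P", "Q"], ["X", "Y", "Z"])
```

In this sketch every phase opened inside the last child starts out holding `P` and `Q`, which is the effect the context stack is meant to achieve.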

Disjoint Groups

Disjoint groups are more difficult to handle. When the compiler first encounters a disjoint group in the operator tree, it immediately creates a phase-tree disjoint group following the current active phase, which it leaves unclosed. As it processes each child of the disjoint group, the compiler creates a child of the phase tree disjoint group and sets the active phase to point to it. As the compiler recurses down the operator tree, that active phase may in turn be split into further sequential groups, and indeed further nested disjoint groups.

The complexity comes when the children of a disjoint group are finished. The compiler removes the first sequential phase from each child of the disjoint and bundles all their operators together into a single set. This set is then placed in a phase immediately before the disjoint phase group—either the existing previous phase (if it wasn’t closed), or else a newly created phase. Either way, the result is a single phase holding all the operators active at the start of the disjoint phase. The compiler then closes the combined phase (as discussed below).

Combining the first phase of each disjoint section allows the compiler to handle terminating the combined phase with a single barrier if necessary, which may be required if operators spanning the disjoint group are ending in this phase.

In the second part of the disjoint-group close, the compiler takes the last sequential phase from each child of the disjoint group. These phases are removed from the disjoint, and their operators are collected into a new phase, following the disjoint phase group. This phase now becomes the active phase. If the disjoint group is a child of a sequential group, the compiler handles forking the search node into close/noclose options using the aggregated active phase in the usual manner. As mentioned at the end of Section 5.2, if a barrier is necessary it will be configured to operate independently over each disjoint subset.

The complexity of the disjoint handling not only allows correct behavior under certain circumstances (such as terminating operators that cover multiple disjoint phases, as mentioned in Section 5.2.3), it also simplifies other parts of the compiler. For example, even in the presence of disjoint operators, stream routing can be done on single phases rather than needing to collect several disjoint phases at once. Similarly, the ‘active phase’ remains a single phase at all times, rather than expanding to become a disjoint subtree of active phases, as would be the case if disjoint children were not pulled out into a single phase at the end of a disjoint construct. Finally, this algorithm results in the first phase listed for an operator always temporally preceding all other phases listed for that operator.

(sequential A
  (disjoint
    (sequential B C D E)
    (sequential F G H))
  I)

Figure 6-3: COP disjoint example

(psequential
  A,B,F
  (pdisjoint
    (psequential C D)
    G)
  E,H
  I)

Figure 6-4: One possible way to handle disjoint COP

As a simple example, consider the COP program in Figure 6-3. Let us consider a search node where the compiler closed after all the sequential children except A. After converting the operators into phases and applying the disjoint closure algorithm above, the result is the set of phases shown in Figure 6-4; ‘psequential’ and ‘pdisjoint’ refer to phase-tree grouping constructs.
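The two-part disjoint closure can be traced on this example with a short sketch. The `close_disjoint` helper and the list-of-lists representation are assumptions made for illustration. The sketch also assumes every disjoint child has at least two phases, and it does not model nested disjoint groups or the barrier handling.

```python
def close_disjoint(open_before, children):
    """Apply the two-part disjoint closure to flat phase lists.

    open_before: operators in the unclosed phase preceding the disjoint group.
    children: one entry per disjoint child, each a list of phases (lists of operators).
    Returns (leading_phase, remaining_children, trailing_phase).
    """
    leading = list(open_before)
    trailing = []
    remaining = []
    for phases in children:
        # Assumption: at least two phases per child; unpacking raises ValueError otherwise.
        first, *rest, last = phases
        leading += first      # part 1: first phases merge into the preceding phase
        trailing += last      # part 2: last phases merge into the new active phase
        remaining.append(rest)
    return leading, remaining, trailing


# Figure 6-3: (sequential A (disjoint (sequential B C D E) (sequential F G H)) I)
# The search node closed at every boundary inside the two inner sequential groups,
# so each operator there starts out as its own phase; A was left unclosed.
child_1 = [["B"], ["C"], ["D"], ["E"]]
child_2 = [["F"], ["G"], ["H"]]

leading, remaining, trailing = close_disjoint(["A"], [child_1, child_2])

# Assemble a nested-list rendering of the phase tree in Figure 6-4.
phase_tree = ["psequential", leading, ["pdisjoint", *remaining], trailing, ["I"]]
```

Here `leading` corresponds to the phase A,B,F, `remaining` to the contents of the pdisjoint group (C and D under one child, G under the other), and `trailing` to E,H, which is closed before I because the disjoint group is itself a child of a sequential group.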

As another example, disjoint groups in the operator graph may be lost altogether after conversion to phases—for example, if the application contains two intersecting disjoint sets such as rows and columns of a 2D array. In that case, the user might input

(parallel rowop1 rowop2 ... rowopn colop1 colop2 ... colopn)

The initial disjoint generation (as discussed in Section 6.1) would result in an internal form with two disjoint groups under a parallel operator. As the compiler evaluated this form, the first disjoint group would resolve into a single phase holding all the rowops. When the compiler moved into the last child of the parallel, the rowops would be moved onto the parallel context stack, and pushed onto each child phase of the disjoint group as it was evaluated.

Popping out of the second disjoint would then combine all the children (taking care not to duplicate the n rowops, one from each child) into a single phase, thus eliminating all traces of the intermediate disjoint group.
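The merge at the end of the second disjoint group can be illustrated with a small sketch. The source does not say how duplicates are avoided; the order-preserving set used here, the `combine_children` helper, and the choice of n = 3 are all assumptions for illustration.

```python
def combine_children(child_phases):
    """Merge the children's phases into one, keeping each operator only once."""
    merged = []
    seen = set()
    for phase in child_phases:
        for op in phase:
            if op not in seen:  # every child carries its own copy of the rowops
                seen.add(op)
                merged.append(op)
    return merged


n = 3  # illustrative size only
rowops = [f"rowop{i}" for i in range(1, n + 1)]
colops = [f"colop{i}" for i in range(1, n + 1)]

# The rowops sit on the parallel context stack, so each child phase of the
# second disjoint group is initialized with them before its colop is added.
parallel_context = list(rowops)
children = [parallel_context + [colop] for colop in colops]

single_phase = combine_children(children)
```

The resulting phase holds each rowop and each colop exactly once, with no trace of the intermediate disjoint group.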

Source: Managing Scheduled Routing With A High-Level Communications Language (pages 92-95)