Parameters in Network Construction

Engineering Reliable, Low-Latency Networks

3. Network Organization

3.5 Network Design

3.5.1 Parameters in Network Construction

Table 3.2 summarizes several parameters which will be used in this section when characterizing a network. Radix and dilation were introduced in Sections 3.1.5 and 3.3.5.

$n_i$ and $n_o$ quantify the number of connections between each node and the network. $i$ and $o$ are the number of connections in and out of each router. Generally, $i = o = r \cdot d$. Since the number of inputs and the number of outputs on the routing components are the same, we say the routers are square. When we use square routers, the aggregate bandwidth between stages in flat, multistage networks remains constant.

3.5.2 Endpoints

The network endpoints are the weakest link in the network. If we are designing a network with a yield model in mind, in the worst case, we can sustain only $\min(n_i, n_o)$ faults. If we are designing a network with a harvest model in mind, in the worst case each $\min(n_i, n_o)$ faults will remove an additional node from the operational set.

Once $n_i$ and $n_o$ are chosen, we must also ensure that these connections are utilized effectively. In particular, to maximize robustness, each link must connect to a distinct routing component in the network. Note, for instance, that the network shown in Figure 3.11 uses dilation-1 routers in the final stage. These dilation-1 routers achieve maximal fault tolerance by ensuring that the maximum number, $n_o = 2$, of distinct routers provide output connections from the network to each node. Figure 3.16 shows an alternative for using dilation-1 routers in the final stage. Rather than using $d$ times as many routers with dilation-1 and the base radix unchanged, the network in Figure 3.16 uses routers which increase the radix by a factor equal to the dilation (i.e. $r_{\text{final stage}} = o = r \cdot d$).

Figure 3.16: 16 × 16 Multibutterfly Network with Radix-4 Routers in Final Stage

3.5.3 Internal Wiring

Inside a multipath network, we have considerable freedom as to how we wire the multiple paths between stages. As described in Section 3.1.5, multistage networks operate by successively subdividing the set of potential destinations at each stage. All inputs to routing components in the same equivalence class at some intermediate network stage are logically equivalent since the same set of destinations can be reached by routing through those components. If we exercise this freedom judiciously, we can maximize the fault-tolerance and minimize the congestion within the network, and hence minimize the effects of congestion latency.

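Looking forward from a source node, the paths may fan out by no more than a factor of $d$, the dilation of the routers, at each stage, so the number of paths into stage $s$ is at most:

$$n_i \cdot d^{s-1} \qquad (3.2)$$

After some stage in the network, the paths will have to diminish in order to connect to the proper destination. Looking backward from the destination node, we see that the paths must grow as the network radix $r$. This constraint is expressed as follows:

$$p_{out}(s) = n_o \cdot r^{(S+1)-s}$$

The two bounds meet at the stage $s'$ where $n_i \cdot d^{s'-1} = n_o \cdot r^{(S+1)-s'}$. Solving for $s'$:

$$s' = \frac{(S+1)\ln(r) + \ln(n_o) + \ln(d) - \ln(n_i)}{\ln(d) + \ln(r)}$$

$$s' = \frac{(S+1)\ln(r) + \ln\left(\frac{n_o \cdot d}{n_i}\right)}{\ln(d \cdot r)} \qquad (3.4)$$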
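Equation 3.4 can be checked numerically. The following is a minimal Python sketch; the function name `crossover_stage` and its argument names are chosen here for illustration and are not from the original text. The sample values are the parameters the text quotes for the network in Figure 3.11.

```python
import math


def crossover_stage(S, r, d, n_i, n_o):
    """Stage s' at which n_i * d**(s'-1) equals n_o * r**((S+1)-s').

    S: number of stages, r: radix, d: dilation,
    n_i / n_o: connections from each node into / out of the network.
    """
    numerator = (S + 1) * math.log(r) + math.log(n_o * d / n_i)
    return numerator / math.log(d * r)


# Parameters quoted in the text for Figure 3.11.
s_prime = crossover_stage(S=4, r=2, d=2, n_i=2, n_o=2)
print(round(s_prime, 6))
```

In general $s'$ need not be an integer, so the function returns a float and leaves any rounding to the caller.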

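Once Equation 3.4 is solved for $s'$, we can quantify the number of connections into each stage of the network by Equation 3.5.

$$p(s) = \begin{cases} n_i \cdot d^{s-1} & s < s' \\ \min\left(n_i \cdot d^{s-1},\; n_o \cdot r^{(S+1)-s}\right) & s \geq s' \end{cases} \qquad (3.5)$$

The total number of paths between a source and a destination is then:

$$n_i \cdot d^{S-1} \qquad (3.6)$$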

For example, consider the network in Figure 3.11 ($n_i = n_o = r = d = 2$, $S = 4$). Solving Equation 3.4 for $s'$, we find $s' = 3$. The number of connections into each stage can then be calculated as shown in Table 3.3. The total number of paths is simply $2 \cdot 2^3 = 16$. Referring to Figure 3.11, we see that it does achieve this maximum path expansion for the highlighted path; the paths between all other source and destination pairs in Figure 3.11 also achieve this path expansion.
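The per-stage arithmetic in this example can be reproduced with a short Python sketch. It assumes the number of connections into stage $s$ is the smaller of the forward bound $n_i \cdot d^{s-1}$ and the backward bound $n_o \cdot r^{(S+1)-s}$. The function names are illustrative only. Table 3.3 is not reproduced here, so the printed values are a check on the formulas rather than a copy of the table.

```python
def connections_into_stage(s, S, r, d, n_i, n_o):
    """Connections into stage s: forward fan-out capped by the backward bound."""
    forward = n_i * d ** (s - 1)          # growth by the dilation from the source
    backward = n_o * r ** ((S + 1) - s)   # growth by the radix from the destination
    return min(forward, backward)


def total_paths(S, d, n_i):
    """Distinct source-to-destination paths with a dilation-1 final stage."""
    return n_i * d ** (S - 1)


# Parameters quoted in the text for Figure 3.11.
S, r, d, n_i, n_o = 4, 2, 2, 2, 2
per_stage = [connections_into_stage(s, S, r, d, n_i, n_o) for s in range(1, S + 1)]
print(per_stage)               # [2, 4, 8, 4]
print(total_paths(S, d, n_i))  # 16
```

The count peaks at stage 3, which agrees with $s' = 3$ from Equation 3.4.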

$(\alpha, \beta)$ Expansion

Unfortunately, path expansion can be a naive metric when optimizing the aggregate fault-tolerance and performance of a network. Path expansion looks at a single source-destination pair and tries to maximize the number of paths between them. If we only considered path expansion in selecting a network design, many nodes could share the same sets of routers and connections in their paths through the network. This sharing would lead to a higher degree of contention. Additionally, when faults accumulate in the network, a larger number of nodes are generally isolated from the rest of the network at once.

Figure 3.17: Left: Non-expansive Wiring of Processors to First Stage Routing Elements

Figure 3.18: Right: Expansive Wiring of Processors to First Stage Routing Elements

Consider, for instance, the two first stage network wirings shown in Figures 3.17 and 3.18. Both wirings are arranged such that each processor connects to two distinct routers in the first stage of routing. However, the wiring shown in Figure 3.17 has four processors which share a pair of routers, whereas any group of four processors in the wiring shown in Figure 3.18 is connected to at least five routers in the first stage. As a result, there will generally be less contention for connections through the first stage of routers in the latter wiring than in the former.

Leighton and Maggs introduced $(\alpha, \beta)$ expansion to formalize the desirable expansion properties as they pertain to groups of nodes which may wish to communicate simultaneously [LM89]. Informally, $(\alpha, \beta)$ expansion is a metric of the degree to which any subset of components in one stage will fan out into the next stage. More formally, we say a stage has $(\alpha, \beta)$ expansion if any subset of $\alpha$ components from one stage must connect to at least $\beta\alpha$ components in the next stage. $\beta$ is thus an expansion factor which is guaranteed for any set of size $\alpha$. Networks with favorable $(\alpha, \beta)$ expansion are networks for which the expansion property holds with higher $\beta$ for each value of $\alpha$. The more favorable the expansion, the more messages can be simultaneously routed between any sets of communicating processors, and hence the lower the contention latency.

Networks Optimized for Yield

If we cannot tolerate node loss, and hence wish to optimize the fault-tolerance of the network as a yield problem, then it makes sense to focus on achieving the maximal path expansion first, then achieving as large a degree of $(\alpha, \beta)$ expansion as possible. Unfortunately, there is presently no known algorithm for achieving a maximum amount of $(\alpha, \beta)$ expansion, so the techniques presented here are heuristic in nature.

To achieve maximum path expansion, we connect the network with the algorithm listed in Figure 3.19 [CED92]. The paths from any input to any output may fan out by no more than a factor of $d$, the dilation of the routers, at each stage. This fan-out may also become no larger than the size of the routing equivalence classes at that stage. The routine groupsz returns the maximum fan-out size allowed by both of these factors. Each stage is partitioned into fan-out classes of this size, which are then used to calculate the network wiring. The maximum path fan-out described in Equation 3.5 is achieved by this algorithm for all pairs of components.

As introduced above, the last stage is composed of dilation-1 routers to increase fault tolerance.

Figure 3.20 shows a deterministically-interwired network composed of radix-2 routers.

Networks Optimized for Harvest

To achieve a high harvest rate and maximize performance, we want to wire networks with a high degree of $(\alpha, \beta)$ expansion. As introduced above, there are no known deterministic algorithms for achieving an optimal expansion. In practice, randomized wiring schemes produce higher expansion than any known deterministic methods. [Kah91] presents some of the most recent work on the deterministic construction of expansion graphs. [Upf89] and [LM89] show that randomly wired multibutterflies have good expansion properties. The high expansion generally means there will be less congestion in the network. Additionally, Leighton and Maggs show that after $k$ faults have occurred on an $N$ node machine, it is always possible to harvest $N - O(k)$ nodes [LM89].

As introduced in Section 3.1.5, multistage networks operate by successively subdividing the set of potential destinations at each stage. All the inputs to routing components in the same equivalence class at some intermediate stage in the network are logically equivalent. After the routing structure determines which set of outputs in one stage must be connected to which set of inputs in the following stage, we randomly assign individual input-output pairs within the corresponding sets.
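The random assignment step can be sketched as a random permutation between two equal-sized port sets. This minimal Python sketch is not the algorithm of Figure 3.21. The function name `randomly_wire` and the port labels are hypothetical, and it does not enforce any further constraint, such as sending the dilated outputs of one router to distinct routers.

```python
import random


def randomly_wire(outputs, inputs, rng=None):
    """Pair each output port with a distinct, randomly chosen input port.

    outputs / inputs: equal-length lists of port labels for one set of
    outputs and the corresponding set of logically equivalent inputs.
    """
    if len(outputs) != len(inputs):
        raise ValueError("port sets must be the same size")
    rng = rng or random.Random()
    shuffled = list(inputs)
    rng.shuffle(shuffled)  # uniform random permutation of the input ports
    return dict(zip(outputs, shuffled))


# Hypothetical port labels: two routers with two ports each on either side.
outs = [f"stage1.r{n}.out{p}" for n in range(2) for p in range(2)]
ins = [f"stage2.r{n}.in{p}" for n in range(2) for p in range(2)]

wiring = randomly_wire(outs, ins, random.Random(0))
for out_port, in_port in wiring.items():
    print(out_port, "->", in_port)
```

Passing a seeded `random.Random` makes a wiring reproducible, which helps when the same network has to be regenerated for simulation.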

Figure 3.21 shows the core of an algorithm for randomly wiring a multibutterfly. The algorithm was first introduced in [CED92] and is based on the wiring scheme described in [LM89]. In practice,

wire to port(n, dp, s)
    ▷ Returns the next-stage router to which to wire for maximum path expansion
    ▷ n = router number, dp = dilated port number, s = router stage
    outgrpsz ← groupsz(s)

From the document Robust, High-Speed Network Design for Large-Scale Multiprocessing (pages 49-54)