
Minimization of circuit registers: retiming revisited


Academic year: 2021


HAL Id: inria-00072480

https://hal.inria.fr/inria-00072480v2

Submitted on 24 Jul 2007


Bruno Gaujal, Jean Mairesse

To cite this version:

Bruno Gaujal, Jean Mairesse. Minimization of circuit registers: retiming revisited. Discrete Applied Mathematics, Elsevier, 2007. ⟨inria-00072480v2⟩


Bruno Gaujal∗ Jean Mairesse† December 2, 2005

Abstract

In this paper, we address the following problem: given a synchronous digital circuit, is it possible to construct a new circuit computing the same function as the original one but using a minimal number of registers? The construction of such a circuit can be done in polynomial time and is based on a result of Orlin for one-periodic bi-infinite graphs showing that the maximum cardinality of a flow is equal to the minimum cardinality of a cut. The idea is to view such a graph as the unfolding of the dependences in a digital circuit.

1 Introduction

Digital circuits can be seen as a rather accurate model of computer hardware. Their design and optimization is a long-standing challenge from several viewpoints. For instance, the problems of layout compaction [4], verification [5], placement and partitioning [4] have been intensively studied in the VLSI literature, see for instance [8].

From an abstract point of view, synchronous digital circuits are often seen as finite graphs constituted by functional gates, wires and registers. At each clock tick, functional elements transform input data into output data which are transmitted on the wires to the next nodes. A register is a storage facility, or a memory cell, of finite size.

Several optimizations are possible at that level. One may want to accelerate the clock frequency ([7]) or reduce the number of nodes in the circuit. In this paper we show how to minimize the number of registers. This problem has already been considered by Leiserson and Saxe in a seminal paper [7], where they show how to retime (this will be defined later) the circuit in order to reduce the number of registers. They also provide an algorithm computing the best possible retiming. One question remains: is it possible to do better? In other words, can one modify a circuit so that the number of registers is smaller than what the optimal retiming does, while keeping the original functional behavior?

The answer is yes and no. For many circuits retiming is indeed optimal. In those where it is not, the gain in the number of registers comes at the expense of additional functional nodes.

∗ INRIA, Laboratoire ID-IMAG, CNRS, UJF, INPG, 51 Av. J. Kuntzmann, 38330 Montbonnot, France. Email: bruno.gaujal@imag.fr

† LIAFA, CNRS-Université Paris 7, Case 7014, 2 place Jussieu, 75251 Paris Cedex 05, France. Email: mairesse@liafa.jussieu.fr


More precisely, the contribution of this paper is as follows: (i) we show that retiming is not always optimal; (ii) we give an algorithm that constructs a circuit with the minimal number of registers; (iii) we characterize those circuits where retiming is indeed optimal.

A preliminary paper on that subject [3] considered a particular class of circuits, namely recycled ones. Here, general circuits are considered and the proof is different and based on a result of Orlin [9] on one-periodic bi-infinite graphs, which is a “max-flow min-cut” theorem. Such graphs have mainly been used in scheduling applications [10], Uniform Recurrence Equations or PRAM program loops [1, 6], where the weights on the arcs represent time and where the flow is a crucial quantity. Here instead, we view weights as memory resources and the cut becomes the central notion.

The paper is organized as follows. Section 2 contains basic definitions and notation and formulates the above-mentioned “max-flow min-cut” theorem. Section 3 discusses how the obtained results can be used in the design of digital circuits to minimize the number of registers, and Section 4 illustrates the complete construction on an example.

2 One-periodic bi-infinite graphs and Orlin’s Theorem

The sets of nonnegative and non-positive integers are denoted by Z+ and Z−, respectively.

By a bi-infinite graph we mean an infinite directed graph D = (V × Z, A) with node set V × Z and arc set A ⊂ (V × Z) × (V × Z), assuming throughout that V is a finite set and that D is locally finite, i.e., each node is incident with a finite number of arcs.

For u ∈ V and i ∈ Z, the u-th line in D is formed by the nodes (u, j) for all j ∈ Z, and the i-th column by the nodes (w, i) for all w ∈ V . For brevity, for a node v = (u, i) and an integer k, the node (u, i + k) is denoted by v + k, and similarly for a set of nodes. A set S of nodes is called consecutive (in each line) if (u, i) ∈ S and (u, i + k) ∈ S with k > 0 imply (u, i + h) ∈ S for each h= 0, . . . , k.

A path in D is an alternating sequence P of (not necessarily distinct) nodes $v_i$ (i ∈ I) and arcs $(v_i, v_{i+1})$ (when i + 1 ∈ I). Here I is either {0, . . . , n} for n ∈ Z+ (yielding a finite path) or I = Z+ or I = Z− or I = Z (a bi-infinite path). We also use the notation $\cdots \to v_i \to v_{i+1} \to \cdots$ for P and, depending on the context, may consider a path as a subgraph of D. For two nodes u and v, we write $u \to^{*} v$ if there exists a (finite) path from u to v.

Two additional conditions on the bi-infinite graphs we deal with are imposed. A bi-infinite graph D is said to be one-periodic if

(OP) for any two nodes v and v′ of D, (v, v′) ∈ A if and only if (v + 1, v′ + 1) ∈ A,

and causal if

(C) for any infinite path P indexed by Z+, the set P ∩ (V × Z−) is finite.

Note that properties (OP) and (C) imply that D is acyclic.

The result presented below is analogous to the classical Menger theorem for usual finite graphs. A flow F is a set of pairwise (node-)disjoint bi-infinite paths. It is one-periodic if, for each path P in F, the path P + 1 belongs to F as well. A cut is a set of nodes that intersects every bi-infinite path. Clearly the size of a flow cannot exceed the size of a cut. Also, if D is one-periodic and causal, then the set V × {0, . . . , k} forms a cut, where k := max{|i − j| | ((u, i), (w, j)) ∈ A} (k is finite since D is locally finite and one-periodic). Therefore, the maximum cardinality of a flow is finite.

Theorem 2.1 (Orlin [9]). Let D be a bi-infinite graph satisfying (OP) and (C). The maximum cardinality of a flow is equal to the minimum cardinality of a cut. Moreover, the maximum is attained by a one-periodic flow and the minimum is attained by a consecutive cut.

Theorem 2.1 can be seen as a special case of Theorem 4 in [9]. Getting the above statement requires one transformation. Each node in D is replaced by a triple (node-arc-node) on the same column. Now, Theorem 4 in [9] with upper and lower capacities on all the arcs set to 1 and 0 respectively, is exactly Theorem 2.1.

An example illustrating a maximum one-periodic flow and a minimum consecutive cut is drawn in Figure 1.

Figure 1: An illustration to Theorem 2.1.

A compact representation of D is obtained by “folding” D into a finite arc-weighted digraph G = (V, E, ∆) with possible multiple arcs, as follows. Let us say that arcs ((u, i), (w, j)) and ((u′, i′), (w′, j′)) of D are similar if u = u′, w = w′, and j − i = j′ − i′. Since D is locally finite and one-periodic, the number of classes under this similarity relation is finite, and each class, with a representative ((u, i), (w, j)), generates one arc e in G with tail t(e) := u, head h(e) := w and weight ∆(e) := j − i. We call G the folded graph associated with D (it is called a dynamic graph by Orlin).

In Section 3 we use the reverse construction. Given a finite directed (multi)graph R = (V, E, ∆) with $\Delta \in \mathbb{Z}^E$, its unfolding is the one-periodic bi-infinite graph D = (V × Z, A) in which ((u, i), (w, j)) ∈ A if and only if there is an arc e ∈ E with t(e) = u, h(e) = w and j − i = ∆(e). We say that R is causal if D is such, i.e., if ∆(C) > 0 for each cycle C of R.
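To make the unfolding concrete, the following Python sketch lists the arcs of D that fall inside a finite window of columns. The function names, the encoding of each arc of R as a triple `(tail, head, delay)`, and the two-gate example are illustrative assumptions; they are not taken from the paper.

```python
def unfold_window(arcs, lo, hi):
    """Arcs of the unfolding D restricted to columns lo..hi (inclusive).

    arcs: iterable of (tail, head, delay) triples describing the folded graph R
          (hypothetical encoding chosen for this sketch).
    Returns a set of pairs ((u, i), (w, j)) with j - i equal to the delay.
    """
    window = set()
    for tail, head, delay in arcs:
        for i in range(lo, hi + 1):
            j = i + delay
            if lo <= j <= hi:
                window.add(((tail, i), (head, j)))
    return window


def is_one_periodic_in_window(window, lo, hi):
    """Check property (OP) for all arcs whose shift by one stays in the window."""
    for (u, i), (w, j) in window:
        if i + 1 <= hi and j + 1 <= hi:
            if ((u, i + 1), (w, j + 1)) not in window:
                return False
    return True


# Toy circuit (assumed): gate a feeds b with no register, b feeds a through 2 registers.
example_arcs = [("a", "b", 0), ("b", "a", 2)]
example_window = unfold_window(example_arcs, 0, 3)
```

On this toy graph the window over columns 0 to 3 contains four copies of the zero-delay arc and two copies of the delay-2 arc, and the shifted copy of every arc that still fits in the window is present, as property (OP) requires.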

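Splinters. Let D be causal and one-periodic. For a subset S of the node set V × Z, define

$$\mathrm{succ}(S) = \{v \in (V \times \mathbb{Z}) \setminus S \mid \exists u \in S,\ (u, v) \in A\}, \qquad \mathrm{pred}(S) = \{v \in (V \times \mathbb{Z}) \setminus S \mid \exists u \in S,\ (v, u) \in A\},$$

$$\mathrm{succ}^{*}(S) = \{v \in (V \times \mathbb{Z}) \setminus S \mid \exists u \in S,\ u \to^{*} v\}, \qquad \mathrm{pred}^{*}(S) = \{v \in (V \times \mathbb{Z}) \setminus S \mid \exists u \in S,\ v \to^{*} u\}.$$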

We assume that each node of D is contained in an infinite path (indexed by Z+ or Z−). Then, given a consecutive cut C, one can partition the nodes into three sets in a natural way: the set P(C) of nodes “before” C, the set S(C) of nodes “after” C, and C itself. Note that the sets pred∗(C), C and succ∗(C) are pairwise disjoint because of causality but they need not cover the whole graph.

We extend pred∗(C) and succ∗(C) into the desired sets P(C) and S(C), respectively, by examining D line by line and acting as follows. Three cases are possible.

1. No bi-infinite path goes through line u but some infinite path indexed by Z+ does. Then there exists a path from u to C and succ∗(C) ∩ u is empty. On u, we assign the nodes of pred∗(C) to P(C), and the other nodes to S(C).

2. No bi-infinite path goes through line u but some infinite path indexed by Z− does. On u, we assign the nodes of succ∗(C) to S(C), and the other nodes to P(C).

3. There is a bi-infinite path intersecting u. Then succ∗(C), C, and pred∗(C) do partition line u. We make S(C) and P(C) coincide with succ∗(C) and with pred∗(C) on u, respectively.

It is easy to check that the sets P (C) and S(C) are consecutive, using that the graph D is one-periodic. Also one can see that pred(S(C)) = C and succ(P (C)) = C.

The sets P (C) and S(C) are called, respectively, the negative and positive splinters associated with C. These sets are used in the next section.

3 Application to Register Minimization in Digital Circuits

In this section, the name graph stands for a finite directed multigraph with integer arc-weights, called delays.

A digital circuit is made of gates computing data according to Boolean logical functions, wires connecting the gates, and memory registers on the wires which store the data between two computation cycles. With a digital circuit, we associate the graph R = (V, E, ∆), whose nodes, arcs, and delays correspond respectively to the gates, wires, and number of registers on the wires of the digital circuit. Since the number of registers has to be nonnegative, a specificity of the graph associated with a digital circuit is that it has only nonnegative delays. Also, for physical reasons, any cycle in the digital circuit should contain at least one register. The associated graph is therefore causal. So for our purpose, a digital circuit is simply a causal graph with nonnegative delays. The unfolding of the graph R can be viewed as the graph of the dependences between the computations performed by the digital circuit (the node (i, n) corresponds to the n-th computation at gate i). Our goal is to minimize the number of registers used in a digital circuit in a sense to be made precise below (problem Min-Register). A more thorough discussion of the relation between digital circuits and graphs is given in [7, 3].


3.1 Computations in digital circuits

Let R = (V, E, ∆) be a digital circuit. Let Q be a finite set (corresponding to all the different values that one register can store) and let F be the set of functions from $Q^k$ to Q for all k ∈ Z+.

A specialization σ of the digital circuit assigns a function of F to each gate of R:

$$\sigma : V \to F, \qquad u \mapsto F_u.$$

The function $F_u$ attached to gate u must have as many arguments as u has input arcs. In particular, if u is a node with no predecessor, then $F_u$ is a constant (since $F_u$ is a function from $Q^0$ to Q).

On a wire e with ∆(e) > 0, we denote the registers by (e, 1), . . . , (e, ∆(e)) where the ranking is performed according to the physical order of the registers on the wire. Let M = {(e, n), e ∈ E,1 ≤ n ≤ ∆(e)} be the set of all the registers. An initial condition I assigns an initial value to each register of the digital circuit:

$$I : M \to Q, \qquad m \mapsto I(m).$$

The computation of (R, σ, I) is the sequence $(x(u, n))_{u \in V,\, n \in \mathbb{Z}_+}$ where x(u, n) ∈ Q is the n-th value computed at gate u if the values stored initially in the registers have been set by I and if the functions computed at each gate are those given by σ. More formally, we have

$$x(u, n) = F_u\bigl[x(i(e_1), n - \Delta(e_1)), \ldots, x(i(e_k), n - \Delta(e_k))\bigr], \qquad n \in \mathbb{Z}_+,\ u \in V, \tag{1}$$

where $e_1, \ldots, e_k$ are the arcs with terminal node u (listed according to some total order on E), and where, if $n - \Delta(e_j) < 0$, $x(i(e_j), n - \Delta(e_j)) = I((e_j, \Delta(e_j) - n))$.
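Recurrence (1) can be evaluated directly. The Python sketch below is one possible reading of it, under assumptions made only for illustration: arcs are triples `(initial_node, terminal_node, delay)` ordered by their position in a list, the initial condition is a dictionary keyed by `(arc_index, k)` for register k on that arc, and the gate functions are ordinary Python callables. Causality (no cycle without a register) is what guarantees that the recursion terminates.

```python
from functools import lru_cache


def simulate(arcs, sigma, init, steps):
    """Evaluate recurrence (1) for n = 0, ..., steps - 1.

    arcs:  list of (initial_node, terminal_node, delay), delay >= 0 (assumed encoding)
    sigma: dict mapping each node to a function with one argument per input arc
    init:  dict mapping (arc_index, k), 1 <= k <= delay, to the initial register value
    """
    nodes = sorted({a[0] for a in arcs} | {a[1] for a in arcs})
    # Input arcs of each node, in the total order given by the list positions.
    inputs = {u: [j for j, (_, head, _) in enumerate(arcs) if head == u] for u in nodes}

    @lru_cache(maxsize=None)
    def x(u, n):
        args = []
        for j in inputs[u]:
            src, _, delay = arcs[j]
            if n - delay < 0:
                # Value still held in a register at time 0: I((e_j, delay - n)).
                args.append(init[(j, delay - n)])
            else:
                args.append(x(src, n - delay))
        return sigma[u](*args)

    return {(u, n): x(u, n) for n in range(steps) for u in nodes}


# Toy circuit (assumed): one gate incrementing modulo 4, fed back through one register.
counter = simulate([("c", "c", 1)], {"c": lambda v: (v + 1) % 4}, {(0, 1): 0}, 5)
```

With the register initialised to 0, the gate of this toy circuit outputs 1, 2, 3, 0, 1 at times 0 to 4.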

The computational power of a digital circuit R is defined as follows. For an arbitrary finite set Q, the sequence $(x(u, n))_{u \in V,\, n \in \mathbb{Z}_+}$ of elements of Q is computable by R if there exist σ and I such that the sequence is computed by (R, σ, I).

We say that a digital circuit $R_2$ (nodes $V_2$) has a larger computational power than a digital circuit $R_1$ (nodes $V_1$) if for an arbitrary finite set Q and for each specialization $\sigma_1$ of $R_1$, there exist a specialization $\sigma_2$ of $R_2$ and an injective mapping

$$\theta : V_1 \times \mathbb{Z}_+ \to V_2 \times \mathbb{Z}_+, \tag{2}$$

such that for any initial condition $I_1$ for $R_1$ there exists an initial condition $I_2$ for $R_2$ such that the computation $(x(u, n))_{u \in V_1,\, n \in \mathbb{Z}_+}$ of $(R_1, \sigma_1, I_1)$ and the computation $(y(u, n))_{u \in V_2,\, n \in \mathbb{Z}_+}$ of $(R_2, \sigma_2, I_2)$ satisfy $x(u, n) = y \circ \theta(u, n)$ for all $(u, n) \in V_1 \times \mathbb{Z}_+$. Roughly speaking, this means that everything that can be computed by $R_1$ can also be retrieved from a computation carried over $R_2$. Obviously, this retrieval is efficient only if θ is simple enough.

Remark. In relation to the above point, observe that the computational power of a digital circuit only depends on its topology, namely V , E, and ∆. It is independent of Q, σ, or I.


3.2 Forward splitting, duplication, and retiming

Given a causal graph R = (V, E, ∆), possibly with some negative delays, we define the total number of delays of R as follows:

$$\Delta_A(R) = \sum_{e \in E} \Delta(e). \tag{3}$$

Another quantity of interest is the following one:

$$\Delta_B(R) = \sum_{u \in V} \ \max_{e \in E,\ i(e) = u} \Delta(e). \tag{4}$$

By causality, we have $\Delta_A(R) \geq 0$ and $\Delta_B(R) \geq 0$. Furthermore, as soon as R contains at least one cycle, we have $\Delta_A(R) > 0$ and $\Delta_B(R) > 0$. Since $\Delta_A(R) = \sum_{u \in V} \sum_{e \in E,\ i(e) = u} \Delta(e)$, we have $\Delta_B(R) \leq \Delta_A(R)$. Introducing the quantity $\Delta_B$ is relevant because of the following result.
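The two quantities in (3) and (4) are straightforward to compute. The Python sketch below assumes arcs are given as triples `(initial_node, terminal_node, delay)` and, as a convention of this sketch only, lets nodes without outgoing arcs contribute nothing to the second sum. The fan-out example uses the delays 1, 3 and 2, which reproduce the totals 6 and 3 quoted for the graph R of Figure 3; its exact topology is an assumption.

```python
def delta_A(arcs):
    """Total number of delays, as in (3): the sum of the delays over all arcs."""
    return sum(delay for _, _, delay in arcs)


def delta_B(arcs):
    """Quantity (4): for each node, the largest delay on an outgoing arc, summed.

    Nodes with no outgoing arc are skipped (a convention assumed for this sketch).
    """
    largest = {}
    for tail, _, delay in arcs:
        if tail not in largest or delay > largest[tail]:
            largest[tail] = delay
    return sum(largest.values())


# Hypothetical fan-out: one node with three outgoing wires carrying 1, 3 and 2 registers.
fan_out = [("a", "b", 1), ("a", "c", 3), ("a", "d", 2)]
totals = (delta_A(fan_out), delta_B(fan_out))  # (6, 3)
```

The gap between the two numbers is what sharing registers among the wires leaving a common node can save.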

Proposition 3.1 ([3]). There exists a causal graph denoted ϕ(R) (set of nodes ϕ(V )), with nonnegative delays, such that:

(i) $\Delta_A(\varphi(R)) = \Delta_B(R)$;

(ii) ϕ(R) has a larger computational power than R. Furthermore, the map θ in (2) has the following form:

$$\theta : V \times \mathbb{Z} \to \varphi(V) \times \mathbb{Z}, \qquad (v, n) \mapsto (\alpha(v), n + c_v), \tag{5}$$

where α : V → ϕ(V) is an injection and $c_v$, v ∈ V, are integer constants which do not depend on the specialization and initial condition of R.

In [3, Propositions 6.5 and 7.3], the result is proved for recycled graphs, but it is easy to see that the proof extends to the general case. The graph ϕ(R) is obtained from R by duplication first and then forward splitting.

The duplication transformation is designed to get rid of negative delays without changing $\Delta_B$. An example of the duplication transformation is given in Figure 2. By causality, all cycles have a positive delay; hence, by duplicating nodes successively, it is possible to remove all the negative delays.

Forward splitting was introduced in [7] under the name of “register sharing”. Forward splitting operates on graphs with nonnegative delays, and transforms R into R′ such that $\Delta_A(R') = \Delta_B(R)$. An example of the forward splitting transformation is given in Figure 3.

Let us recall the classical notion of retiming [7]. A retiming is a function r : V → Z. Given a graph R = (V, E, ∆), it specifies a new graph $R_r$ and a new unfolded graph $D_r$ as follows:

• $R_r = (V, E, \Delta_r)$ with, for e ∈ E, $\Delta_r(e) = \Delta(e) + r(i(e)) - r(t(e))$;

• $D_r = (V \times \mathbb{Z}, A_r)$ is the unfolding of $R_r$; that is, $((i, n), (j, m)) \in A_r \iff ((i, n + r(i)), (j, m + r(j))) \in A$.
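As a small illustration of the retiming formula, the Python sketch below applies a retiming to a graph given as triples `(initial_node, terminal_node, delay)`. The function name, the encoding and the three-gate ring are assumptions made for this example; the ring is not the graph of Figure 4.

```python
def retime(arcs, r):
    """Return the arcs of the retimed graph: new delay = delay + r(initial) - r(terminal).

    arcs: list of (initial_node, terminal_node, delay) (assumed encoding)
    r:    dict mapping each node to an integer
    """
    return [(u, w, delay + r[u] - r[w]) for u, w, delay in arcs]


# Hypothetical three-gate ring with 0, 1 and 2 registers on its wires.
ring = [("a", "b", 0), ("b", "c", 1), ("c", "a", 2)]
shift = {"a": 1, "b": 1, "c": 0}
retimed_ring = retime(ring, shift)
# retimed_ring == [("a", "b", 0), ("b", "c", 2), ("c", "a", 1)]
```

Along any cycle the terms r(i(e)) and r(t(e)) cancel, so the total delay of the ring is 3 before and after retiming; this is why retiming preserves causality.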

In the example of Figure 4, the new graphs $R_r$ and $D_r$ correspond to the retiming r defined by r = (1, 1, 0).

Figure 2: Duplication of node a

Figure 3: The forward splitting transformation. For the graph R shown, $\Delta_A(R) = 6$ and $\Delta_B(R) = 3$; for ϕ(R), $\Delta_A(\varphi(R)) = \Delta_B(\varphi(R)) = 3$.

3.3 Solution to the Min-Register problem

We want to solve the Min-Register problem which is defined as follows:

Given a digital circuit R, find another digital circuit with at least the same computational power and with as few registers as possible. This number will be denoted by Min-Reg(R).

Let R = (V, E, ∆) be a digital circuit. Using Proposition 3.1, for any retiming r, the graph $\varphi(R_r)$ has nonnegative delays and a larger computational power than R. In particular it implies that:

$$\text{Min-Reg}(R) \leq \min_r \Delta_B(R_r) = \min_r \Delta_A(\varphi(R_r)),$$

where the minimum is taken over all possible retimings. The next theorem states that there is in fact equality.

Theorem 3.2. Let R be a digital circuit and let D be its unfolding. Then the minimum number of registers Min-Reg(R) is equal to χ(D), the minimum cardinality of a cut of D. Let C be a consecutive cut of D of minimum cardinality and let S be the corresponding positive splinter. For i ∈ V, let $n_i$ be such that $(i, n_i - 1) \notin S$ and $(i, n_i) \in S$. Let r be the retiming defined by $r(i) = n_i$. Then the digital circuit $\varphi(R_r)$ is a solution to the Min-Register problem.

Proof. We first prove that the number of registers of $\varphi(R_r)$ is χ(D). Let $f_r$ be the map from V × Z into itself defined by $f_r((u, n)) = (u, n - r(u))$. Obviously, $f_r(S)$ is a positive splinter of $D_r$. Furthermore, by definition of r, we have $f_r(S) = \{(u, n),\ u \in V,\ n \in \mathbb{Z}_+\}$. Now, the size of the cut C = pred(S) in D is the same as the size of the cut $\mathrm{pred}(f_r(S))$ in $D_r$. Let us consider an arc e with i(e) = u, t(e) = v, ∆(e) = m. There is an arc in $D_r$ from (u, −m) to (v, 0) and no arc from a node (u, k), k < −m, to a node (w, ℓ), ℓ ≥ 0. Hence, by definition, $\mathrm{pred}(f_r(S))$ contains exactly the nodes (u, −m), . . . , (u, −1) on line u. The same argument repeated on each line shows that $\Delta_B(R_r) = |\mathrm{pred}(S)| = |C| = \chi(D)$. As recalled above, the graph $\varphi(R_r)$ has nonnegative delays, a larger computational power than R and satisfies $\Delta_A(\varphi(R_r)) = \Delta_B(R_r) = \chi(D)$.

Figure 4: Retimed graph and its unfolding (panels R, $R_r$, D and $D_r$, with r = (1, 1, 0)).

In the second part of the proof we show that there exist no digital circuits with at least the same computational power as R and with strictly fewer registers than χ(D). Let R′ = (V′, E′, ∆′) be a digital circuit with at least the same computational power as R and let D′ be the unfolded graph associated with R′.

According to Theorem 2.1, there exists in D a one-periodic flow of cardinality |C| which defines a bijective mapping from the nodes of C to the ones of C + L, for any nonnegative integer L. Let us choose a specialization $\sigma : u \mapsto F_u$ in the following way. Consider u ∈ V and let $e_1, \ldots, e_k$ be all the arcs in R with terminal node u (listed according to some total order on E). Let $e_l$ be the only arc which corresponds to a set of arcs in D belonging to the flow. Then we define $F_u : Q^k \to Q$ by $F_u(x_1, \ldots, x_k) = x_l$. By composing the functions $F_u$, we get a map from the nodes of C to the ones of C + L of the form $F : Q^{|C|} \to Q^{|C|}$ which is a permutation of the coordinates. In particular F is bijective.

For instance, consider Figure 1. Rank the nodes of the cut C in the order: (2, k) < (3, k) < (4, k − 1) < (4, k). For L = 1, the corresponding function is $F(x_1, x_2, x_3, x_4) = (x_1, x_3, x_4, x_2)$. For L = 2, the function is $F(x_1, x_2, x_3, x_4) = (x_1, x_4, x_2, x_3)$. For L = 3, the function F is the identity.
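The three functions quoted for Figure 1 are consistent with one another: the maps for L = 2 and L = 3 are the second and third iterates of the map for L = 1. The short Python check below makes this explicit; the helper names are chosen for this illustration only.

```python
def shift_once(x):
    """The coordinate permutation quoted for L = 1: (x1, x2, x3, x4) -> (x1, x3, x4, x2)."""
    x1, x2, x3, x4 = x
    return (x1, x3, x4, x2)


def iterate(f, x, times):
    """Apply f to x the given number of times."""
    for _ in range(times):
        x = f(x)
    return x


start = ("x1", "x2", "x3", "x4")
after_two = iterate(shift_once, start, 2)    # ("x1", "x4", "x2", "x3")
after_three = iterate(shift_once, start, 3)  # back to ("x1", "x2", "x3", "x4")
```

Since each iterate only permutes coordinates, it is a bijection of $Q^{4}$, which is the property used in the rest of the proof.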

Observe that when we let the initial condition I vary over all the possible values in $Q^{\Delta_A(R)}$, the values of the computation of (R, σ, I) in the cut C, namely $(x(u, n))_{(u,n) \in C}$, cover all the values in $Q^{|C|}$.

Since R′ has at least the same computational power as R, there exist a specialization σ′ of R′ and an injective mapping θ : V × Z → V′ × Z such that each sequence $(x(u, n))_{u \in V,\, n \in \mathbb{Z}_+}$ computed by (R, σ, I) for some initial condition I is related to a sequence $(y(u, n))_{u \in V',\, n \in \mathbb{Z}_+}$ computed by (R′, σ′, I′) for an adequate initial condition I′, by x(u, n) = y ◦ θ(u, n).

Let C′ be a minimum consecutive cut of D′. Let S′ and P′ be the positive and negative splinters associated with C′.

Let θ(C) be the image by θ of the cut C in D′. Since θ is injective, we can assume, by translating S′ and by choosing L large enough, that θ(C) ⊂ P′ and θ(C + L) ⊂ S′. This means that θ(C) is on the “left” of C′ and θ(C + L) is on the “right” of C′. Let U′ be the subset of V′ consisting of the nodes with no predecessor. By definition, given a specialization σ′ of R′, we have

$$\forall u \in U',\ \exists c_u \in Q,\ \forall I',\ \forall n \in \mathbb{Z}_+, \quad y(u, n) = c_u,$$

where $(y(u, n))_{u \in U',\, n \in \mathbb{Z}_+}$ are the values computed by (R′, σ′, I′). We consider all the paths of D′ ending in θ(C + L). Any such path intersects the cut C′ or is finite and starts in a node of the type (u, n), u ∈ U′. Hence for any node (w, n) ∈ θ(C + L), we have $y(w, n) = G([c_u]_{u \in U'}; [y(v, k)]_{(v,k) \in C'})$ for some function G depending only on σ′. We conclude that for a fixed specialization of R′, the variables $(y(u, n))_{(u,n) \in \theta(C+L)}$ can take at most $|Q|^{|C'|}$ different values when the initial condition I′ varies. Since F is bijective from $Q^{|C|}$ into itself, it follows that |C| ≤ |C′|.

Proposition 3.3. Let R be a digital circuit with n functional elements and m arcs. The circuit $\varphi(R_r)$ can be constructed in O(n(m + n log n)) units of time.

Proof. The construction can be decomposed into several steps.

1. Computing the “max-flow min-cut” can be reduced to a maximum-weight perfect matching problem in a bipartite (undirected) graph and its dual one, which can be done with time complexity O(n(m + n log n)); see, e.g., [2].

2. The retiming and forward splitting operations are local and take O(m) units of time.

3. The duplication operation is also local. The number of duplications is bounded by $\sum_{e \in E} \Delta(e) = O(nm)$.

Theorem 3.2 deserves several comments:

1. Given a digital circuit R with unfolding D, the quantity χ(D) can be seen as the intrinsic quantity of memory needed to carry out all the computations which could be wired by R.

2. In the degenerate case where R is acyclic, the result Min-Reg(R) = χ(D) = 0 clearly holds. Computing the relevant retiming is easy in this case.

3. The digital circuit R (nodes V) can be replaced by the digital circuit $\varphi(R_r)$ (nodes ϕ(V)) without loss of computational power. It is however necessary to ask if the mapping $\theta : V \times \mathbb{Z}_+ \to \varphi(V) \times \mathbb{Z}_+$, defined in (2), is simple enough. Actually, it follows from (5) that the mapping θ is elementary.


4 Solution for an example

In this section, we go through a small example to show how the construction works.

Let us consider the digital circuit displayed in Figure 4.a). Functional gates are represented by nodes with funny shapes and registers by boxes. The initial number of registers is 5. Here, retiming alone does not help to reduce the number of registers. Following our algorithm, the construction of the associated unfolded graph is given in Figure 4.b).

[Figure: a) A digital circuit with 5 registers. b) The associated unfolding, its maximal flow and the corresponding positive splinter. c) The new circuit has 4 registers and node 1 has been duplicated.]

The size of the maximal flow is 4. This means that one can find an equivalent circuit with 4 registers. The shape of the corresponding splinter gives the retiming to be applied. Duplicating node 1 finishes the construction of the circuit, displayed in Figure 4.c) which is equivalent to the initial circuit up to the bijection θ which maps the sequence of values computed in node 3 to the same sequence in the initial circuit, shifted by one.

Acknowledgement

The authors wish to thank Alexander V. Karzanov who provided us with a simple proof of Theorem 2.1 and an anonymous referee who pointed out the work of Orlin on dynamic graphs.

References

[1] V. Adlakha and V. Kulkarni. A classified bibliography of research on stochastic PERT networks: 1966-1987. INFOR, 27(3):272–296, 1989.


[3] B. Gaujal, A. Jean-Marie, and J. Mairesse. Computations of uniform recurrence equations using minimal memory size. SIAM J. on Computing, 30(5):1701–1738, 2000.

[4] Sabih H. Gerez. Algorithms for VLSI Design Automation. Wiley, 1999.

[5] Gary D. Hachtel and Fabio Somenzi. Logic Synthesis and Verification Algorithms. Kluwer Academic Publishers, Norwell, MA, USA, 2000.

[6] R. Karp, R. Miller, and S. Winograd. The organization of computations for uniform recurrence equations. Journal of the ACM, 14(3):563–590, 1967.

[7] C. Leiserson and J. Saxe. Retiming synchronous circuitry. Algorithmica, 1991.

[8] G. De Micheli. Synthesis and Optimization of Digital Circuits. McGraw-Hill, 1994.

[9] J. B. Orlin. Maximum-throughput dynamic network flows. Mathematical Programming, 27:214–231, 1983.

[10] J. B. Orlin. Minimum convex cost dynamic network flows. Mathematics of Operations Research, 9(2):190–207, 1984.
