A Characterization of One-to-One Modular Mappings

(1)

HAL Id: hal-02101781

https://hal-lara.archives-ouvertes.fr/hal-02101781 Submitted on 17 Apr 2019

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of sci-entific research documents, whether they are

pub-L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non,

A Characterization of One-to-One Modular Mappings

Alain Darte, Michèle Dion, Yves Robert

To cite this version:

Alain Darte, Michèle Dion, Yves Robert. A Characterization of One-to-One Modular Mappings. [Research Report] LIP RR-1995-09, Laboratoire de l’informatique du parallélisme. 1995, 2+13p. �hal-02101781�

(2)

Laboratoire de l’Informatique du Parallélisme

Ecole Normale Supérieure de Lyon

Unité de recherche associée au CNRS n°1398

A Characterization of One-to-One Modular Mappings Alain Darte Michele Dion Yves Robert

April 1995

Research Report No 95-09

(3)

Alain Darte

Michele Dion

Yves Robert

April 1995

Abstract

In this paper, we deal with modular mappings as introduced by Lee and Fortes 14, 13, 12], and we build upon their results. Our main contribution is a characterization of one-to-one modular mappings that is valid even when the source domain and the target domain of the transformation have the same size but not the same shape. This characterization is constructive, and a procedure to test the injectivity of a given trans-formation is presented.

Keywords:

automatic parallelization, loop nests, time-space transformation, modular mapping, injectivity, characterization

Resume

Nous etudions dans ce rapport les placements modulaires tels qu'ils ont et e introduits par Lee et Fortes 14, 13, 12], et nous d eveloppons les r esultats qu'ils ont obtenus. Notre apport principal consiste en une caract erisation des placements modulaires bijectifs qui reste valide même lorsque les domaines source et cible de la transformation contien-nent le même nombre de points mais n'ont pas la même forme. La caract erisation est constructive et nous pr esentons une proc edure qui permet de tester l'injectivit e d'une transformation.

Mots-cles:

parall elisation automatique, nids de boucles, transformation temps-espace, placement modulaire, injectivit e, caract erisation

(4)

Alain Darte, Michele Dion and Yves Robert

Laboratoire LIP, URA CNRS 1398

Ecole Normale Sup erieure de Lyon, F - 69364 LYON Cedex 07

e-mail: Alain.Darte,Michele.Dion,Yves.Robert]@lip.ens-lyon.fr

Abstract

In this paper, we deal withmodular mappingsas introduced by Lee and Fortes 14, 13, 12], and we build upon their results. Our main contribution is a characterization of one-to-one modular mappings that is valid even when the source domain and the target domain of the transformation have the same size but not the same shape. This characterization is constructive, and a procedure to test the injectivity of a given transformation is presented.

1 Introduction

Recently, Lee and Fortes 14, 13, 12] have introduced modular mappings in the context of systolic array design methodologies and parallelizing compilation. Their idea is to extend ane mapping techniques by using linear transformations modulo a constant vector. Ane mappings are time-space transformations that have been used extensively by a variety of researchers to derive ecient time-space transformations for loop nest programs (see 1, 17, 6, 5, 7, 8, 20, 11, 15, 19, 21] among others).

However, the systematic derivation of programs that can take advantage of wraparound connec-tivity in networks such as rings and 2D- or 3D-torus remains out of the scope of ane mappings. A typical example is Cannon's matrix-matrix product algorithm on a 2D-torus of processors 3]: this well-known algorithm (whose counterpart in the systolic eld is the Preparata-Vuillemin 2D-systolic array 18]) cannot be synthesized using ane transformations, whereas Lee and Fortes 14, 13] demonstrate how to synthesize it, as well as many interesting variants, using one-to-one modular mappings. We point out that many other BLAS3-like kernels have been implemented onto 2D processor meshes using wraparound connections (e.g. the scientic library of the MasPar 2, 4]). We refer to Section 2 for the automatic synthesis of Cannon's algorithm using modular mappings, thereby providing the reader with a complete example to demonstrate the usefulness of modular mappings.

This paper deals with the automatic derivation of one-to-one modular mappings. We build upon the results of Lee and Fortes, which we summarize in Section 3. In a word, Lee and Fortes give several sucient conditions for a modular mapping to be one-to-one. Injectivity plays a key role as modular mappings represent a time-space transformation from an index domain (computation points) to a target domain: clearly, the number of computation points must be preserved by the mapping. There are two major limitations in the results of Lee and Fortes:

(5)

i/j 0 1 2 3 4 0

d

0 0

=e

0 0

d

0 1

=e

1 1

d

0 2

=e

2 2

d

0 3

=e

3 3

d

0 4

=e

4 4 1

d

1 1

=e

1 0

d

1 2

=e

2 1

d

1 3

=e

3 2

d

1 4

=e

4 3

d

1 0

=e

0 4 2

d

2 2

=e

2 0

d

2 3

=e

3 1

d

2 4

=e

4 2

d

2 0

=e

0 3

d

2 1

=e

1 4 3

d

3 3

=e

3 0

d

3 4

=e

4 1

d

3 0

=e

0 2

d

3 1

=e

1 3

d

3 2

=e

2 4 4

d

4 4

=e

4 0

d

4 0

=e

0 1

d

4 1

=e

1 2

d

4 2

=e

2 3

d

4 3

=e

3 4

Figure 1: Initial data alignement

they only deal with modular transformations that map an index domain onto itself. In other words, the target domain is assumed to be the same as the index domain. Clearly, if the transformation is one-to-one, the index domain and the target domain should have the same size, but not necessarily the same shape,

they only give sucient conditions for a transformation to be one-to-one. Given an arbitrary modular mapping (possibly given by the programmer), it is not always possible to decide from their results whether the transformation is one-to-one or not. Necessary and sucient conditions would be necessary. Also, a procedure to determine whether a given transformation is one-to-one would be highly desirable.

Our paper overcomes both limitations. Our main result is a necessary and sucient condition for a modular mapping to be one-to-one. The condition is rather technical, but the proof is constructive, hence a procedure to accompany the systematic derivation of one-to-one modular mappings. The condition extends to mappings for which the index domain and the target domain have the same size but not the same shape.

The rest of the paper is organized as follows: in Section 2 we detail the use of modular mappings through the matrix-matrix product example. In Section 3 we formally dene modular mappings, and we review the results obtained by Lee and Fortes. In Section 4 we give a necessary and sucient condition for a modular mapping to be one-to-one. We discuss several extensions in Section 4.2. Section 5 is devoted to some nal remarks and conclusions.

2 Why modular mappings ?

Several basic computational kernels require other type of transformations than ane time-space mappings. A well-known example is Cannon's algorithm for matrix-matrix multiplication (see 3]): DO

i

= 0

4

DO

j

= 0

4 DO

k

= 0

4

c

(

ij

) =

c

(

ij

)+

d

(

ik

)

e

(

kj

)

CONTINUE

In Cannon's algorithm, the data arrays

d

and

e

are rst aligned and multiplied (elementwise) by each other as shown in Figure 1. The result of each mutiplication is stored in

c

(

ij

). At the next step, matrix

d

is shifted to the left and matrix

e

is shifted up. Elementwise multiplication takes

(6)

place and the result is added to the values of

c

(

ij

). The processus is repeated until all elements in a row of

d

are multiplied by all elements in a column of

e

.

Let us consider the following transformation

T

b:

T

b((

ijk

)t) = ( 0 B @ ;1 ;1 1 1 0 0 0 1 0 1 C A 0 B @

i

j

k

1 C A)mod(5 5 5)= 0 B @ ;

i

;

j

+

k

mod 5

i

mod 5

j

mod 5 1 C A

This transformation, called modular mapping in 14, 13], transforms the previously described program into an equivalent one:

DO

t

= 0

4 DOALL

p

1= 0

4 DOALL

p

2= 0

4

i

=

p

1

j

=

p

2

k

= (

t

+

p

1+

p

2)mod5

c

(

ij

) =

c

(

ij

)+

d

(

ik

)

e

(

kj

)

MOVE WEST(d) MOVE NORTH(e) CONTINUE

Cannon's algorithm (except data movement) can therefore be described by a modular transfor-mation applied to the original program. We refer the reader to the original papers of Lee and Fortes 14, 13] for several interesting variants of this standard parallelization, as well as for a method to derive data communications.

3 Review of Lee and Fortes results

3.1 Denitions

In this section, we use the same denitions and notations as in Lee and Fortes 14, 13]. Let

u

= (

u

1

:::u

n)t 2 Zn be a vector with

n

integer components, and let

m

= (

m

1

:::m

n)t 2

(N )n be a vector with

n

positive integer components. The notation

u

modm denotes the vector

(

u

1mod

m

1

:::u

nmod

m

n)t.

Denition 1

(Modular function) A modular function

T

m : Zn ;! Zn is dened as

T

m(

p

) =

(

Tp

)modm for

p

2Zn, where

T

is a

n

integer matrix (thetransformation matrix) and

m

2(N )n

is a

n

-vector (the modulus vector).

Denition 2

(Modular time-space transformation of an index domain) A modular time-space transformation,

T

m, is a modular function that is injective when its domain is restricted to the index set

J

of an algorithm, i.e,

T

m :

J

!Znis injective.

Denition 3

(Rectangular index set and boundary vector) An index set

J

is rectangular and de-noted

J

b if

J

=f

p

2Zn

0

p < b

g, where inequalities between

n

-vectors are taken componentwise.

The vector

b

is called the boundary vector of

J

b.

Denition 4

(Smith normal form) For every matrix

A

ofZn, there exist two unimodular matrices

(7)

S

=diag(

s

1

s

2

s

r

0

0)

where

r

is the rank of

A

,

s

1

s

2

s

r are non-zero elements ofZand

s

ij

s

i+1

1

i < r

.

A

=

Q

1

SQ

2

The matrix

S

is then denoted by

S

(

A

).

Denition 5

(Left Hermite form) For every non singular matrix

A

ofZn, there exist a unimodular

matrix

Q

and a lower triangular matrix

H

such that:

8(

ij

)

h

ij

>

= 0,

each non-diagonal element is lower than the diagonal element of the same row,

A

=

HQ

.

Besides, this decomposition is unique up to a permutation of the rows. In fact, the row order used to \triangularize"

A

into

H

is arbitrary, hence there are

n

! left Hermite forms.

An important remark

Consider the modular transformation

T

m with transformation matrix

T

and modulus vector

m

. It is important to point out that the coecients of

T

are dened only up to a modulus operation. More precisely, let

T

0 be the new transformation matrix dened by

T

0 = (

t

0

ij) where

t

0

ij =

t

ij mod

m

i: then

T

0

m =

T

m. The proof is immediate: 8(

x

1

x

n) 2 Zn

P

j

t

ij

x

j mod

m

i =P

j

t

0

ij

x

j mod

m

i.

Similarly, the determinant of

T

is dened modulo the product

d

=Qn

i=1

m

i. In particular, we can always assume that

T

is non singular (add a suitable multiple of

d

to each diagonal element

t

ii, say, to get an equivalent non singular transformation matrix).

3.2 Main results of Lee and Fortes

In 14, 13], Lee and Fortes restrict themselves to the study of modular mappings for which the modulus vectoris equal to the boundary vector, i.e.

m

=

b

. The case where

m

=

b

is very important in practice, as the matrix-matrix product example demonstrates.

Lee and Fortes start with the following lemma:

Lemma 1

Let

J

b =f

p

2Zn

0

j < b

g be a rectangular domain and dene ^

J

b =f

p

2Zn

;

b <

p < b

g. A modular function

T

b:

J

b ;!Znis injective if and only if

T

b(

p

)6= 0 for all

p

2

J

^b except

p

= 0.

Proof

See 13]. If

T

b is not injective, there exist two distinct points

pq

2

J

b such that

T

b(

p

) =

T

b(

q

). Then

r

=

p

;

q

2

J

^b,

r

6= 0 and

T

b(

r

) = 0.

Conversely, if there exists

r

2

J

^b,

r

6= 0 and

T

b(

r

) = 0, let

p

be dened as follows:

p

i =

r

i if

0

r

i

< b

i and

p

i = 0 if ;

b

i

< r

i

<

0. Let

q

=

p

;

r

. Then

pq

2

J

b,

p

6=

q

and

T

b(

p

) =

T

b(

q

),

hence

T

b is not injective.

Then, Lee and Fortes deal with generator matrices. They consider the set of integer points that are equivalent to zero, i.e, the equivalence class

S

0₌

(8)

They prove that

S

0 _{is a module, and that there exists a}

_n

n

integer matrix

G

that generates

S

0_{: this means that every element of}

_S

0 _{can be represented as an integer linear combination of}

the columns of

G

. Of course, there are several matrices that generate

S

0_{, but they all are right}

equivalent. Indeed, let

G

be a generator matrix, then a matrix

G

0 will generate

S

0 if and only if

there exists a

n

unimodular matrix

U

such that

G

0=

GU

.

The main contribution of 14, 13] on generators is a sucient condition on the generators of

S

0

that guarantees the injectivity of the transformation:

Lemma 2

Let

J

b be a rectangular index set with boundary vector

b

. Let

T

b be a modular mapping and let

G

be a generator of

S

0_{. Let} be an arbitrary order on the set f1

2

n

g.

T

b is injective

if

G

satises the following equations: 1.

g

ii=

b

ii,

2.

g

ij = 0 if

i

j

.

From this sucient condition on generators, Lee and Fortes investigate the relationship be-tween generator matrices

G

and transformation matrices

T

. They deduce the following sucient conditions for a modular mapping

T

b to be injective:

Theorem 1

Let

J

bbe a rectangular index set with boundary vector

b

. Let

T

b be a modular mapping. Let be an arbitrary order on the set f1

2

n

g.

T

b :

J

b ;! Zn is injective if the matrix

T

satises to the following equations: 1.

t

ii^

b

i = 11

2.

t

ij = 0 if

i

j

We restate Theorem 1 as follows: if

T

is triangular up to a permutation, and if its

i

-th diagonal entry is relatively prime with

b

i for all 1

i

n

, then

T

b is a time-space transformation of

J

b.

It turns out that Theorem 1 can be proven without making use of generators, as shown by the following direct proof.

Another proof of Theorem 1

Let

T

be upper triangular (without loss of generality). We solve the system

Tx

= 0 mod

b

for

x

2

J

^b, where

T

is upper triangular. The last equation is

t

nn

x

n= 0 mod

b

n

hence

b

n divides

t

nn

x

n. Since

b

n is relatively prime with

t

nn,

b

n divides

x

n, which implies

x

n = 0 as

x

2

J

^b. The (

n

;1)-th equation gives

t

n;1n;1

x

n

;1+

t

n;1n

0 = 0 mod

b

n ;1

hence

x

n;1= 0 just as before, and continuing the process we nd

x

= 0.

Therefore we do not need to make use of generator matrices to prove Theorem 1. However, generator matrices will enable us to characterize one-to-one modular mappings, i.e. to give a necessary and sucient condition for injectivity, as we show below in Section 4.

Finally, Lee and Fortes give a necessary and sucient condition in the case where all entries of the boundary vector

b

have the same value

: in this particular case, they show that

T

bis injective on

J

b if and only if the determinant of

T

and

are relatively prime. We extend this result in Section 4.2.1.

(9)

4 New results

4.1 Characterization when

m = b

In this section, we consider as Lee and Fortes that

m

=

b

(the modulus vector is equal to the boundary vector). For the general case, see Section 4.2.

As we have seen in Lemma 2, Lee and Fortes 14] exhibit sucient conditions on generator matrices

G

to obtain one-to-one modular mappings. We prove here that these conditions are also sucient, and we give a constructive method to check the injectivity of a given modular mapping. In the following, we denote by the matrixdiag(

b

1

b

n).

Lemma 3

If

G

is a generator matrix of

S

0_{, then} _det(

_G

_{) divides det() =}

_b

₁

_b

₂

b

n.

Proof

We are going to exhibit a generator matrix for

S

0_{. Let us consider a point}

_p

2

S

0: 9

k

2Zn

such that

Tp

=

k

. Let =diag(i) be the comatrix of and

d

= det = Qn

i=1

b

i ( is dened

such that =

d

I

nwhere

I

nis the identity matrix of order

n

, i = Q

j6=i

b

j). We have:

Tp

=

k

Tp

=

dk

Q

1

S

(

T

)

Q

2

p

=

dk

where

Q

1,

Q

2 are two unimodular matrices and

S

(

T

) is the Smith normal form of matrix

T

.

S

(

T

)

Q

2

p

=

dQ

1;1

k

S

(

T

)

Q

2

p

=

dk

0 where

k

0 2Zn(

Q

1is unimodular). Let

S

(

T

) =diag(

s

i) :

s

i(

Q

2

p

)_i =

dk

0 i

We want only integer values for the components of

p

, therefore (

Q

2

p

)_i= _gcd(

d

_ds

_i₎

k

00 i where

k

00 2Zn.

p

=

Q

2;1

S

0

k

00 where

S

0= diag(gcd(dd s i)). The matrix

Q

2 ;1

S

0generates

S

0.

Besides, if we let

S

() =diag( 0 i) and

S

() = diag( 0 i), we know that 0 i0 n;i+1 =

d

(see 16] p.40) and that 0

i divides

s

i (if

A

and

B

are two nonsingular integer

n

matrices, then the

k

-th

element

s

k(

AB

) of the Smith normal form of

AB

is divisible by

s

k(

A

) and

s

k(

B

), see 16] p.33). 0 i divides

s

i and

d

, so gcd(

ds

i) = 0 i

u

i where

u

i 2Zand

s

0 i = 0d iu i. Therefore det(

S

0) = Q d 0 iu i det(

S

0) = d n det()Q ui

(10)

det(

S

0

) = Q

d

u

i

Since all generator matrices are right equivalent, they all have the same determinant as

Q

2;1

S

0,

hence as

S

0.

Lemma 4

Let

G

be a nite abelian group. Let

g

]q be the subset f0

g:::

(

q

;1)

g

g with

g

2

G

and 1

< q

order

(

g

). Let

S

1

:::S

k be

k

subsets of

G

.

G

is said to be the direct sum of the

S

i,

which we denote as

G

=

S

1

S

k, if the mapping (

g

1

:::g

k)7!

g

1++

g

k from

S

1

S

k

to

G

is one-to-one. If

G

=

g

1]k1

g

r]k

r then at least one of the

g

i]ki is a subgroup of

G

.

Proof

This result has been proved by Haj os in its works on one of the Minkowski's conjecture, see 9].

Lemma 5

If

T

b is a one-to-one modular mapping on

J

b, then8

x

2Zn, there exists(

x

1

x

2)2

S

0

J

b

such that

x

=

x

1+

x

2, and this decomposition is unique.

Proof

We rst prove the existence of such a decomposition and then its uniqueness.

Existence

Let us consider the nite abelian group

A

=Zn

=S

0. The matrix

Q

2 ;1

S

0 generates

S

0

(see Lemma 3), so the number of elements of

A

is det(

Q

2;1

S

0) = det(

S

0). For

x

2Zn, we denote

by

x

the canonical image of

x

in

A

.

Let us consider two distinct elements of

J

b,

x

and

y

, then

x

6=

y

. Indeed,

x

;

y

2

J

^b, if

x

=

y

,

x

;

y

2

S

0and

T

bwould be not injective (see Lemma 1). All elements of

J

b have distinct canonical

images in

A

.

The number of elements in

J

b is det(), there are more elements in

J

b than in

A

(det(

S

0)

det(), see Lemma 3). So, for all

x

2

A

, there exists

y

2

J

b such that

x

is the canonical image of

y

(otherwise, two elements in

J

b would have the same canonical image in

A

, and this is impossible). Besides, this also means that if

T

b is injective, we have det

S

0= det.

Consider

x

2 Zn.

x

in

A

there exists

x

2 2

J

b such that

x

2 =

x

,

x

;

x

22

S

0. So, there exists (

x

1

x

2)2

S

0

J

b such that

x

=

x

1+

x

2.

Uniqueness

If there exists (

x

1

x

2) and (

x

0

1

x

0 2) in

S

0

J

b such that

x

=

x

1+

x

2=

x

0 1+

x

0 2 then,

x

2;

x

0 2 =

x

1;

x

0 12

S

0.

x

2;

x

0 22

J

^b and

T

b is injective, so

x

2 =

x

0 2 and

x

1=

x

0 1.

Lemma 6

If

T

b is a one-to-one modular transformation then there exists

i

, 1

i

n

, such that

O

+

b

i

~e

i2

S

0 _(where

_e

_i _{is the}

_i

_{-th vector of the canonical basis of}

Zn).

Proof

Let

f

1

f

2

:::f

nbe the canonical images of

O

+

~e

1

O

+

~e

2

:::O

+

~e

nin

A

. If

T

bis injective, for all

x

2Zn, there exist

a

2

S

0 and integers

i, 0

i

< b

i, such that

x

=

a

+

1

~e

1++

n

~e

n

and this decomposition is unique, see lemma 5. This means exactly that

A

=

f

1]b1

f

n]b n.

Lemma 4 shows that one of the

f

i satises

b

i

f

i= 0, i.e, 0 +

b

i

~e

i 2

S

0.

Theorem 2

A transformation

T

b is one-to-one if and only if there exists a left Hermite form of a generator matrix

G

of

S

0 _{with diagonal}

_b

₁

_b

₂

_:::b

_n_.

(11)

Proof

The sucient condition has already been proved in 14], see Lemma 2.

Necessary condition

Let us consider a one-to-one modular mapping

T

b. Lemma 6 gives an index

i

such that

O

+

b

i

~e

i 2

S

0.

Let us consider

f

i]bi, subgroup of

A

. Let

B

be the nite abelian group

B

=

A=

f

i]bi. We

also know that

A

=

f

1]b1

f

n]b n. Let

f

0 1

f

0 2

f

0 i;1

f

0 i+1

f

0

nbe the canonical images of

f

1

f

2

f

i

;1

f

i+1

f

n in

B

(the canonical image of

f

i in

B

is 0).

Let us consider

x

2

B

, there exists

y

2

A

such that

x

y

in

B

. Besides,

y

=

m

1

f

1++

m

n

f

nand this decomposition is unique. So,

x

=

m

1

f

0 1++

m

i ;1

f

0 i;1+

m

i+1

f

0 i+1+ +

m

n

f

0

n and the decomposition is also unique. This means that

B

=

f

0

1]b1

f

0 i;1]bi;1

f

0 i+1]bi+1

f

0 n]bn. There exists

j

6 =

i

such that

b

j

f

0

j = 0, i.e, there exists

t

, 0

t < b

i such

that 0 +

b

j

~e

j+

t~e

i 2

S

0_.

By repeating this process, we obtain

n

vectors in

S

0_{and the matrix}

_H

_{formed by these vectors}

is upper triangular with diagonal

b

1

b

2

b

n(up to a permutation of indices).

Furthermore, these vectors form a basis of the module generated by

G

. By adding suitable com-binations of the vector columns of

H

to any vector

x

2

S

0, we get

x

=

H

+

a

where8

a

i

0

a

i

< b

i.

Because there is only one element of

S

0 _in

_J

_b _{(which is 0), we have}

_a

_{= 0 and thus,}

_x

_{is a linear}

combination of the column vectors of

H

. Furthermore, this decomposition is unique because

H

is non-singular. So,

H

is the matrix of a basis of

S

0_{, this means that there exists a unimodular}

matrix

Q

such that

G

=

HQ

, and this completes the proof.

Given a transformation matrix

T

and a modulus vector

b

, Theorem 2 gives a constructive method to check whether

T

b is a time-space transformation or not. We sketch the procedure and run it on an example.

Procedure

From Theorem 2, a procedure to know whether a modular transformation is injective or not can be deduced:

1. Calculate the Smith normal form

T

and then deduce the matrix

Q

;1

2

S

0 that generates

S

0

(we use the same notations as in lemma 3).

2. Calculate the

n

! left Hermite normal forms (by permuting the rows) of

Q

;1

2

S

0.

3. If there exists a left Hermite normal form of

Q

;1

2

S

0with diagonal

b

1

b

2

b

n, the

transfor-mation

T

b is injective.

Example

Let us consider the matrix

T

=

0 B @ 1 0 3 1 1 2 3 3 1 1 C

A and the vector

b

= 0 B @ 5 4 6 1 C A.

We calculate the Smith normal form of

T

and we deduce the two matrices

S

0 and

Q

;1 2 :

S

0= 0 B @ 60 0 0 0 2 0 0 0 1 1 C Aand

Q

;1 2 = 0 B @ 3 152 ;613 0 0 1 1 51 ;204 1 C A,

Q

;1 2

S

0= 0 B @ 180 304 ;613 0 0 1 60 102 ;204 1 C A.

We calculate the 6 left Hermite forms of

Q

;1

2

S

0 by permuting the rows. The left Hermite form

of 0 B @ 60 102 ;204 180 304 ;613 0 0 1 1 C Ais 0 B @ 6 0 0 2 5 0 2 3 4 1 C A.

T

b is injective.

(12)

In this section, we start by proving a useful property that allows to restrict the search of one-to-one modular mappings to a more restricted set: we show that a transformation

T

b is injective on

J

b if and only if

T

b in injective on

J

b and

det

(

T

)^

= 1. Then, we consider the particular case where 8(

ij

)

b

i^

b

j = 1. In this particular case, we have a necessary and sucient condition directly with

the transformation matrix. Finally, we extend the results given in Section 4 to a more general case:

m

6=

b

, but Q

m

i =Q

b

i.

4.2.1 Injectivity of

T

b

In this section, let

T

be a transformation matrix and

b

a modulus vector. We still assume that the source domain and the target domain are the same. We will prove a \scalability property". Beforehand, we prove the following lemma as a prerequisite for Theorem 3.

Lemma 7

det(

T

)^

6= 1)

T

bis not injective on

J

b.

Proof

We use the notations of lemma 3. Let

d

= det = Qn

i=1

b

i,

S

(

T

) = diag(

s

i),

S

() = diag(

0

i) and

S

() = diag( 0

i). Let

Q

2 and

S

0 be the matrices such that

Q

2;1

S

0 generates the

module

S

0 _for

_T

_b_,

_Q

0

2 and

S

00 the matrices such that

Q

0

2;1

S

00 generates the module

S

0for

T

b(

Q

2,

S

0,

Q

0

2 and

S

00 are calculated as in the proof of lemma 3).

Let

S

0= diag(

s

0 i) and

S

00= diag(

s

00 i).

S

(

n;1

T

) =

n;1

S

(

T

), so we have:

s

00 i = _gcd(

n

_d

n

d

n;1

s

i)

s

00 i = _gcd(

d

_ds

_i₎ We have seen in the proof of lemma 3 that 0

i divides

s

i. So, we can write

s

i = 0

i

x

i with Q

x

i =

det

(

T

) (Q

s

i= det(

T

) =

d

n;1 det(

T

) = Q 0 iQ

x

i =

d

n;1 Q

x

i). Besides, 0 i= 0 d n;i+1. Thus,

s

00 i = 0

d

igcd(

0 n;i+1

x

i)

There exists

i

such that gcd(

x

i) 6= 1 ( Q

x

i =

det

(

T

) and det(

T

)^

6= 1). Thus, Q

s

00 i

<

Qb 0 i

=

n

_d

_{. Hence,}

_T

_b_{cannot be injective (we see in the proof of Lemma 5 that if}

_T

_b_{is injective,} we must have

det

(

S

00) = det(

) =

n

d

).

Theorem 3

Let

2 N. The modular mapping

T

b is a time-space transformation of

J

b if and

(13)

Proof

Assume that

T

b is injective. Let

p

2

J

^b such that

T

b(

p

) = 0. Equivalently,

Tp

=

k

for

some

k

2Zn. Then

Tp

=

k

and

p

2

J

^b. As

T

p is injective, Lemma 1 implies that

p

= 0.

Hence,

T

b is injective. Besides, we know from Lemma 7 that det(

T

)^

= 1.

Conversely, assume now that

T

b is injective and det(

T

)^

= 1. If det(

T

) = 0, the proof is

immediate (det(

T

) = 0 and det(

T

)^

= 1)

= 1). Consider now the case det(

T

)6= 0.

Let

p

in ^

J

b such that

Tp

=

k k

2Zn

Let

U

be the comatrix of

T

(

UT

= det(

T

)

I

n). We have: det(

T

)

p

=

U

k

det(

T

)

p

i=

(

U

k

)i

So,

divides det(

T

)

p

i. But, we also have det(

T

)^

= 1. Hence,

divides

p

i. Let us consider

q

such that

p

i=

q

i.

det(

T

)

q

=

U

k

det(

T

)

q

=

U

k

So, det(

T

)

Tq

= det(

T

)

k

and

Tq

=

k

. Besides, ;

b < p < b

implies ;

b < q < b

.

T

b is

injective, thus

q

= 0 and

p

= 0.

Remark

The previous theorem leads to another proof of the following result of Lee and Fortes: in the case where all entries of the boundary vector

b

have the same value

,

T

b is injective on

J

b if and only if the determinant of

T

and

are relatively prime. Indeed, if

b

= (1

1)t,

J

b contains

only the point (0

0) and

T

b is always injective. Therefore

T

bis injective i det(

T

)^

= 1.

4.2.2 When

8(

ij

)

b

i^

b

j = 1

We know (see Section 3.1) that for any transformation

T

m, there exists an equivalent transformation denoted as

T

0

m and such that8(

ij

)

0

t

0

ij

< b

i. We still assume

m

=

b

here. We prove that if

8(

ij

)

b

i^

b

j = 1, then

T

b is one-to-one i

T

is triangular (up to a permutation) with \good"

diagonal coecients: in this particular case, the characterization of one-to-one mappings is quite simple.

Theorem 4

If8(

ij

)

b

i^

b

j = 1, then

T

b is injective on

J

b if and only if

T

0

b is an upper triangular matrix (up to a permutation on row and column indices) with 8i

t

ii^

b

i= 1 .

Proof

The sucient condition has been proved in 13] (see Theorem 1) .

Necessary condition

Assume that the transformation

T

b is injective. The proof uses the same lemma as Theorem 2. Let us consider

t

0

j

1

j

n

, the columns of

T

0 and let

A

be the group Z

=b

1ZZ

=b

2ZZ

=b

nZ. The restriction of

T

b to

J

b denes an injective application. The two

sets have the same number of elements, so it is also bijective, i.e,8

x

2

A

9!

y

2

J

b

= x

=

T

b(

y

) =

(

Ty

)modb. This means exactly that

A

=

t

0

1]b1 b

t

0 2]b2 bb

t

0 n]bn.

(14)

Lemma 4 shows that there exists

j

such that

t

0

j]bj is a subgroup of

A

. There exists

j

such

that

t

0

j

b

j = 0modb, i.e 8

it

0

ij

b

j = 0 mod

b

i. So, we must have 8

i

6=

jt

0

ij = 0 since

b

i^

b

j = 1 and

t

0

jj^

b

j = 1 (otherwise the transformation would not be injective).

Up to a permutation on rows and columns, the matrix

T

0 is now: 0 B B B @

t

0 jj

u

0

:

T

00 0 1 C C C A (where

u

is a row vector of

n

;1 elements). Consider the new modular transformation

T

00

b0, where

b

0 =

(

b

1

b

j

;1

b

j+1

b

n)t. Let us prove that

T

00

b0 is an injective modular transformation on

J

b 0. Let us consider

x

= (

x

1

x

j ;1

x

j+1

x

n)2

J

^b 0 such that

T

00

x

= 0 modb0. The element

t

0 jj has an inverse in Z

=b

jZ(

t

0

jj^

b

j= 1). Let

2Z

=b

jZbe the value ;

t

0 jj;1

u:x

, where

t

0 jj;1 is the inverse of

t

0 jj in Z

=b

jZ. We have

t

0

jj

+

u:x

= 0 mod

b

j. So, the vector (

x

1

x

j

;1

x

j+1

x

n)2

S

0 and we have8

i

6=

jx

i= 0 (

T

b is injective).

We have just proved that

T

00

b0 is injective. So, in the same way, there exists

k

such that

T

00 = 0 B B B @

t

0 kk

: : :

0

:

T

000 0 1 C C C A with

t

0 kk^

b

k = 1, up to a permutation.

By repeating this process, we obtain that

T

0is triangular up to a permutation, and each element

t

0

ii on the diagonal satises

t

0

ii^

b

i= 1.

4.2.3 Extension to the general case:

m

6=

b

In this section, we consider a modular transformation

T

m and a rectangular index set

J

b and we prove that general results can be easily derived from the particular case

m

=

b

.

First, Lemma 1 remains satised in the general case.

Lemma 8

A modular function

T

m :

J

b ;! Znis injective if and only if

T

m(

p

) 6= 0 for all

p

2

J

^b

except

p

= 0.

Proof

The proof is immediate from the proof of Lemma 1.

Besides, if we consider the set of integer points that are equivalent to zero,

S

0 ₌ f

p

2 Zn

T

m(

p

) = 0g, we can nd in the same way a generator matrix for

S

0. Let = diag(

m

i).

As in Section 4 and with the notations of lemma 1, we obtain a matrix

Q

2;1

S

0that generates

S

0

and that satises det(

Q

;1

2

S

0)

jdet() (simply replace

b

by

m

in Lemma 3).

Lemma 9

IfQn

i=1

m

i=

Qn

i=1

b

i, then Theorem 2 remains valid: a transformation

T

m is one-to-one

if and only if there exists a left Hermite form of a generator

G

of

S

0 _{with diagonal}

_b

₁

b

n.

Proof

The proof is immediate from the proof of lemma 2. The condition thatQn

i=1

m

i =Qn

i=1

b

i is needed to prove that the sum of subsets used in Lemma 5 is a direct sum. Of course, if

T

m is one-to-one from

J

b onto

J

m, both domains must have the same number of integer points.

In 13], Lee and Fortes dealt with the particular case when the modulus vector results from a permutation of the entries of the boundary vector. Lemma 9 is an extension of this particular case.

(15)

Example

Let us consider the matrix transformation

T

=

1 1 1 2

!

and the modulus vector

m

= 2 3 ! . We have

Q

2;1

S

0 = 6 1 0 1 !

. Thus,

T

m is injective on the rectangular index set

J

(6 1)t but is not injective on

J

(2 3)t.

Lemma 9 is very useful as it enables to check injectivity for transformations that map a given rectangular domain onto a domain of dierent shape (but of the same size).

5 Conclusion

In this paper, we have considered modular mappings as introduced by Lee and Fortes 14, 13, 12]. Our main contribution is a characterization of one-to-one modular mappings that is valid even when the source domain and the target domain of the transformation have the same size but not the same shape. This characterization is constructive, and a procedure to test the injectivity of a given transformation has been presented.

We believe the study of modular mappings to be very promising in the context of automatic parallelization techniques. Indeed, mapping techniques usually proceed in two steps: rst the input domain (computation points) is mapped onto a time-space domain where a virtual processor is assigned to each computation. Then virtual processors are mapped onto physical processors, most often using a block-cyclic allocation a la HPF 10]. Characterizing valid modular mappings from input domains onto target domains of larger dimension would enable to fully automatize the mapping procedure.

References

1] Jennifer M. Anderson and Monica S. Lam. Global optimizations for parallelism and locality on scalable parallel machines. ACM Sigplan Notices, 28(6):112{125, June 1993.

2] T. Blank. The MasPar MP-1 architecture. In Compcon Spring, pages 20{24. IEEE Press, February 1990.

3] L.E. Cannon. A cellular computer to implement the Kalman lter algorithm. PhD thesis, Montana State University, 1969.

4] P. Christy. Software to support massively parallel computing on the MasPar MP-1. In Compcon Spring, pages 29{33. IEEE Press, February 1990.

5] Alain Darte and Yves Robert. Constructive methods for scheduling uniform loop nests. IEEE Trans. Parallel Distributed Systems, 5(8):814{822, 1994.

6] Alain Darte and Yves Robert. Mapping uniform loop nests onto distributed memory architec-tures. Parallel Computing, 20:679{710, 1994.

7] Paul Feautrier. Some ecient solutions to the ane scheduling problem, part I, one-dimensional time. Int. J. Parallel Programming, 21(5):313{348, October 1992. Available as Technical Report 92-28, Laboratoire MASI, Universit e Pierre et Marie Curie, Paris, May 1992.

(16)

8] Paul Feautrier. Towards automatic distribution. Parallel Processing Letters, 4(3):233{244, 1994.

9] G. Hajos. Uber einfache und mehrfache bedeckung des n-dimensionalen raumes mit einem wurfelgitter. Math. Zeitschrift, 47:427{467, 1942.

10] Charles H. Koelbel, David B. Loveman, Robert S. Schreiber, Guy L. Steele Jr., and Mary E. Zosel. The High Performance Fortran Handbook. The MIT Press, 1994.

11] S.Y. Kung. VLSI array processors. Prentice-Hall, 1988.

12] Hyuk J. Lee and Jos e A.B. Fortes. Data distribution independent parallel programs for ma-trix multiplication. Technical Report TR-EE 94-32, School of Electrical Engineering, Purdue University, October 1994.

13] Hyuk J. Lee and Jos e A.B. Fortes. Modular mappings of rectangular algorithms. Technical Report TR-EE 94-22, School of Electrical Engineering, Purdue University, June 1994.

14] Hyuk J. Lee and Jos e A.B. Fortes. On the injectivity of modular mappings. In Peter Cappello, Robert M. Owens, Jr Earl E. Swartzlander, and Benjamin W. Wah, editors, Application Spe-cic Array Processors, pages 237{247, San Francisco, California, August 1994. IEEE Computer Society Press.

15] D.I. Moldovan and J.A.B. Fortes. Partitioning and mapping algorithms into xed-size systolic arrays. IEEE Transactions on Computers, 35(1):1{12, January 1986.

16] Morris Newman. Integral Matrices. Academic Press, 1972.

17] Michael O'Boyle and G.A. Hedayat. Data alignment: Transformations to reduce communica-tions on distributed memory architectures. In Scalable High-performance Computing Confer-ence SHPCC-92, pages 366{371. IEEE Computer Society Press, 1992.

18] F.P. Preparata and J.E. Vuillemin. Area-time optimal VLSI networks for multiplying matrices. Information Processing Letters, 11(2):77{80, 1980.

19] Patrice Quinton and Yves Robert. Systolic Algorithms and Architectures. Prentice Hall, 1991. Translated from French, Masson (1989).

20] Weijia Shang and A.B. Fortes. Time optimal linear schedules for algorithms with uniform dependencies. IEEE Transactions on Computers, 40(6):723{742, June 1991.

21] Michael E. Wolf and Monica S. Lam. A loop transformation theory and an algorithm to maximize parallelism. IEEE Trans. Parallel Distributed Systems, 2(4):452{471, October 1991.

A Characterization of One-to-One Modular Mappings

A Characterization of One-to-One Modular Mappings

Laboratoire de l’Informatique du Parallélisme

April 1995

Research Report No 95-09

Alain Darte

Michele Dion

Yves Robert

April 1995

Abstract

Keywords:

Resume

Mots-cles:

Alain Darte, Michele Dion and Yves Robert

Laboratoire LIP, URA CNRS 1398

Ecole Normale Sup erieure de Lyon, F - 69364 LYON Cedex 07

1 Introduction

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

d

=e

2 Why modular mappings ?

i

j

k

c

ij

c

ij

d

ik

e

kj

d

Michele Dion

Alain Darte, Michele Dion and Yves Robert