
Inductive & Injective Partitioned Functions

Encoding & validating the model

5.4 Encoding AADTs — Σ DD


[Figure 5.8: Example of a ΣDD rewriting]

This section presented an informal overview of SDDs and of how to encode a set of terms as a ΣDD. It also explained why rewriting such a set of terms can be efficient. The next section gives a detailed, formalized description of a typed version of SDDs, called Injective Partitioned Functions (IPFs), which serves as the basis for the formalization of ΣDDs proposed in Section 5.4.3.

5.4.2 Inductive & Injective Partitioned Functions

As stated previously, ΣDDs extend SDDs [CTM05]. Because the SDD formalization is untyped, using it directly would make the ΣDD formalization rather cumbersome. Indeed, order-sorting calls for a clear definition of the compatibility between the domains of the variables. Therefore, this section presents the Inductive Injective Partitioned Functions (IIPFs), a strongly typed version of the Hierarchical Set Decision Diagrams (SDDs). IIPFs are essentially injective mappings on a partitioned set, the domain of which is inductively defined.

The objective of this section is to represent finite sets of terms with the minimum amount of information. A term can be represented as a pair $\langle operator, subterms \rangle$, where $operator$ is the function name of the term and $subterms$ is a vector of terms representing its arguments. Following this approach, a set of terms is a set of such pairs, and can thus be seen as a function from the operators to vectors of subterms. In other words, we want to encode, in an efficient way, unary functions that are inductively defined.

To optimize the representation, the goal is to reduce the number of pairs by grouping together all the values that have the same image. Doing so defines an equivalence relation based on the image of the function. Fig. 5.9a represents a standard function in which some pairs (i.e., $\langle$pre-image, image$\rangle$) are “unnecessary”, in the sense that they bring no additional information about the behavior of the function, up to the partitions.

Let us start by recalling some basic mathematical definitions. More specifically, the objects we want to encode are bounded lattices of total functions. A bounded lattice is an algebraic structure that admits a minimum and a maximum element with respect to its internal operations.

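Definition 5.4.1 (Bounded lattices). An algebraic structure $\langle E, \cup_L, \cap_L, \top_L, \bot_L \rangle$ is a bounded lattice if the following axioms hold:

commutative: $a \cup b = b \cup a$ and $a \cap b = b \cap a$;

associative: $a \cup (b \cup c) = (a \cup b) \cup c$ and $a \cap (b \cap c) = (a \cap b) \cap c$;

absorption: $a \cup (a \cap b) = a$ and $a \cap (a \cup b) = a$;

idempotent: $a \cup a = a$ and $a \cap a = a$;

identities: $a \cup \bot = a$ and $a \cap \top = a$.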

Example 5.4.2. Let $E = \{1,2,3,4\}$ be a set. The powerset $\mathcal{P}(E)$ of $E$, along with set union and set intersection, is a bounded lattice: $\langle \mathcal{P}(E), \cup_E, \cap_E, E, \emptyset \rangle$.
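The claim of Example 5.4.2 can be checked by brute force. The following is a minimal sketch, not part of the original formalization: the helper names `powerset` and `is_bounded_lattice` are hypothetical, and the lattice elements are modelled as Python `frozenset` values.

```python
from itertools import chain, combinations

E = frozenset({1, 2, 3, 4})

def powerset(s):
    """All subsets of s, as frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def is_bounded_lattice(elems, join, meet, top, bottom):
    """Brute-force check of the axioms of Def. 5.4.1 on a finite carrier."""
    for a in elems:
        if join(a, a) != a or meet(a, a) != a:              # idempotent
            return False
        if join(a, bottom) != a or meet(a, top) != a:       # identities
            return False
        for b in elems:
            if join(a, b) != join(b, a) or meet(a, b) != meet(b, a):  # commutative
                return False
            if join(a, meet(a, b)) != a or meet(a, join(a, b)) != a:  # absorption
                return False
            for c in elems:                                  # associative
                if join(a, join(b, c)) != join(join(a, b), c):
                    return False
                if meet(a, meet(b, c)) != meet(meet(a, b), c):
                    return False
    return True

P = powerset(E)
ok = is_bounded_lattice(P, frozenset.union, frozenset.intersection,
                        top=E, bottom=frozenset())
```

Swapping the roles of the top and bottom elements makes the identity axioms fail, which is a quick sanity check that the test is not vacuous.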

Definition 5.4.3 (Partial functions to a lattice). Let $L$ be a lattice. A partial function $f : D \rightharpoonup L$ from the domain $D$ to the lattice $L$ is a function $f : D' \to L$, where $D' \subseteq D$ is the subset on which $f$ is defined.

As we only encode total functions, we extend partial functions to a lattice into total functions by mapping every element on which they are undefined to the lattice’s infimum.

Definition 5.4.4 (Extension of partial functions to total functions). Let $L$ be a lattice, and let $f : D \rightharpoonup L$ be a partial function from the domain $D$ to the lattice $L$, where $D' \subseteq D$ is the domain of $f$. The extension of $f$ to a total function is defined by:

$$f : D \to L = \begin{cases} f : x \mapsto f(x), & \forall x \in D' \\ f : x \mapsto 0_L, & \forall x \in D \setminus D' \end{cases}$$

Remark 5.4.1. From now on, we only consider total functions. If necessary, partial functions are made total by Def. 5.4.4. For the sake of presentation, in subscripts we write $L^D$ instead of $D \to L$ for the set of total functions from $D$ to $L$.

Remark 5.4.2. In the sequel, we may consider functions as sets of pairs that associate an element of the domain with its image in the co-domain:

$$\bigcup_{x \in D} \{x \mapsto f(x)\}$$

This leads us to the definition of the lattice of the total functions from a set $D$ to a lattice $L$.

Definition 5.4.7 (Lattice of total functions). Given a bounded lattice $L$, the lattice of the total functions $D \to L$ is $\langle D \to L, \cup_{L^D}, \cap_{L^D}, 0_{L^D}, \top_{L^D} \rangle$ where:

$$(f \cup_{L^D} g)(x) = f(x) \cup_L g(x) \qquad (f \cap_{L^D} g)(x) = f(x) \cap_L g(x)$$

Furthermore, $0_{L^D}$ is the empty function, and $\top_{L^D}$ is the upper function such that $\forall x \in D : \top_{L^D}(x) = \top_L$.
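Def. 5.4.4 and Def. 5.4.7 can be illustrated on a small finite domain. The sketch below is only an illustration under stated assumptions: functions are modelled as Python dictionaries, the target lattice is the powerset lattice of Example 5.4.2 (so $0_L$ is the empty set), and the names `extend`, `union` and `intersection` are hypothetical.

```python
BOTTOM = frozenset()  # 0_L for the powerset lattice of Example 5.4.2

def extend(partial, domain, bottom=BOTTOM):
    """Def. 5.4.4: make a partial function (a dict) total over `domain`."""
    return {x: partial.get(x, bottom) for x in domain}

def union(f, g):
    """Pointwise union of two total functions over the same domain."""
    return {x: f[x] | g[x] for x in f}

def intersection(f, g):
    """Pointwise intersection of two total functions over the same domain."""
    return {x: f[x] & g[x] for x in f}

D = {"a", "b", "c"}
f = extend({"a": frozenset({1, 2})}, D)
g = extend({"a": frozenset({2, 3}), "b": frozenset({4})}, D)
zero = extend({}, D)            # the empty function 0_{L^D}

f_or_g = union(f, g)
f_and_g = intersection(f, g)
```

In this model the empty function is the neutral element of the pointwise union, as expected from the identity axiom of Def. 5.4.1.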

The most natural way to ensure that the minimal number of pairs is used is to enforce injectivity up to an equivalence relation. Fig. 5.9b presents the function of Fig. 5.9a in which the values that lead to the same image (arcs from $D$ to $L$) are grouped together. This partitions the domain $D$ into equivalence classes and thus reduces the number of arcs and the overall number of objects.

[Figure 5.9: A function and its injective partitioned version. (a) A partial function from $D$ to $L$. (b) A partitioned-injective version of Fig. 5.9a.]

These functions are called injective partitioned functions because they are built upon standard functions and they are injective up to the partitions of the domain of the original function. Let us now define a structure that is equivalent to $L$ on the injective partitioned functions.

An injective partitioned function $h$ is built by partitioning the domain $D$ of the base function $g \in D \to L$ such that it becomes injective with respect to the partitions. Let $\pi$ be the partition of the domain $D$ of $g$ induced by the equivalence relation $\sim$ such that $\forall a,b \in D,\ a \sim b \Leftrightarrow g(a) = g(b)$.

Definition 5.4.8 (Set of the injective partitioned functions). Given a set of total functions $D \to L$ from $D$ to $L$, let the set of injective partitioned functions $\Delta(D,L) \subseteq \mathcal{P}(D) \to L$ be the set of functions such that:

$$\Delta(D,L) = \{ f \mid f \in \pi \to L \;\wedge\; \forall X,Y \in \pi,\ f(X) = f(Y) \implies X = Y \}$$

where $\pi$ is a partition of $D$ such that $\forall X \in \pi, \forall a,b \in X : g(a) = g(b)$. The definition assumes that the domain $D$ can be partitioned, and therefore that it supports some kind of union $\cup_D$ and intersection $\cap_D$ with which to build a partition of itself.
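To make the construction concrete, the following sketch groups the pre-images of a finite function by image and then checks the conditions of Def. 5.4.8. It is an illustration only: the function is modelled as a dictionary with hashable images, and the names `partition_by_image` and `is_injective_partitioned` are hypothetical.

```python
from collections import defaultdict

def partition_by_image(g):
    """Group the domain of g (a dict) into blocks of elements sharing an image.

    Returns a dict {frozenset(block): image}; the blocks are the equivalence
    classes of the relation "a and b have the same image under g".
    """
    blocks = defaultdict(set)
    for x, y in g.items():
        blocks[y].add(x)
    return {frozenset(xs): y for y, xs in blocks.items()}

def is_injective_partitioned(h, domain):
    """Check that the keys of h partition `domain` and that h is injective."""
    blocks = list(h)
    images = list(h.values())
    covered = set().union(*blocks)
    injective = len(set(images)) == len(images)
    disjoint = sum(len(b) for b in blocks) == len(covered)
    return injective and disjoint and covered == set(domain)

g = {"+": 1, "-": 1, "*": 2, "/": 0}
h = partition_by_image(g)
```

Here the four pairs of `g` collapse into three arcs, because `+` and `-` share the image 1.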

Internal operations such as union and intersection on the set of the injective partitioned functions must preserve both injectivity and the functional aspect. The definitions of union and intersection are based on the definition of the square union given in Def. 5.4.9.

Def. 5.4.9 to 5.4.11 originate from Mieg’s thesis [TM04] and enforce both properties. Formally, the definitions of union and intersection are split in two parts:

1. Def. 5.4.10 and Def. 5.4.11 enforce the functional aspect by splitting elements that lead to different images, while relying on the square union to group those that lead to the same image,

2. Def. 5.4.9 enforces the injectivity of the function by grouping the pre-images that lead to the same image.

Although Def. 5.4.9 ensures that no two pre-images lead to the same image (i.e., injectivity), it does not preserve the partition of the domain of the result (i.e., the pre-images are not necessarily disjoint). Indeed, a given value $x \in D$ may belong to several parts $X$ of the partition of $D$. This is solved by Def. 5.4.10.

Definition 5.4.9 (Square Union of injective partitioned functions). Given two domains $D$ and $L$ and two injective partitioned functions $f,g \in \Delta(D,L)$, the square union $f \sqcup g \in \Delta(D,L)$ is defined by: [equation missing from the source]

Based on Def. 5.4.9, Def. 5.4.10 defines the union of two injective partitioned functions. Note that Def. 5.4.9 requires the existence of a union law in the target domain.

Definition 5.4.10 (Union of injective partitioned functions). Given two injective partitioned functions $f,g \in \Delta(D,L)$, the union $f \cup g$ is defined by: [equation missing from the source]

Similarly to Def. 5.4.10, Def. 5.4.11 defines the intersection of two injective partitioned functions. Again, Def. 5.4.11 partitions the domain $D$.

Definition 5.4.11 (Intersection of injective partitioned functions). Given two injective partitioned functions $f,g \in \Delta(D,L)$, the intersection $f \cap g$ is:

$$f \cap g = \bigcup_{X \in Dom(f)} \{X \mapsto f(X)\} \;\cap \bigcup_{Y \in Dom(g)} \{Y \mapsto g(Y)\} = \bigsqcup_{X \in Dom(f)} \; \bigsqcup_{Y \in Dom(g)} \{X \cap_D Y \mapsto f(X) \cap_L g(Y)\}$$
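The double square union of Def. 5.4.11 can be sketched as follows. This is an illustration under explicit assumptions rather than the definition of [TM04]: the square union is read here as "merge the blocks that share an image", empty blocks are dropped, images are `frozenset` values of a powerset lattice, and the names `square_union` and `ipf_intersection` are hypothetical.

```python
def square_union(pairs):
    """Assumed reading of the square union: merge blocks sharing an image."""
    merged = {}
    for block, image in pairs:
        if block:  # assumption: empty blocks carry no information
            merged[image] = merged.get(image, frozenset()) | block
    return {block: image for image, block in merged.items()}

def ipf_intersection(f, g):
    """Intersect every block of f with every block of g, then regroup."""
    return square_union((X & Y, fx & gy)
                        for X, fx in f.items()
                        for Y, gy in g.items())

fs = frozenset
f = {fs("ab"): fs({1, 2}), fs("c"): fs({3})}
g = {fs("a"): fs({1}), fs("bc"): fs({2, 3})}
h = ipf_intersection(f, g)

# A case where two split blocks end up with the same image and are regrouped.
f2 = {fs("ab"): fs({1}), fs("c"): fs({2})}
g2 = {fs("a"): fs({1, 5}), fs("bc"): fs({1, 2})}
h2 = ipf_intersection(f2, g2)
```

In the second case, `a` and `b` are first split by the blocks of `g2`, then merged again because both lead to the image `{1}`; this is the regrouping step that keeps the result injective.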

It is of vital importance, at least from an implementation point of view, that the union and intersection operations do preserve the injectivity and therefore the canonicity.

Proposition 5.4.12 (Union and intersection of injective partitioned functions preserve injectivity and canonicity). Given two functions $f,g \in \Delta(D,L)$, $f \cup g$ and $f \cap g$ are injective partitioned functions, and are therefore injective and canonical.

Proof. The proof is rather tedious, as it is based on the cases covered by the union (resp. intersection) and the square union. It is detailed in [TM04].

We add the union and the intersection to the set of injective partitioned functions to form the lattice of the injective partitioned functions.

Definition 5.4.13 (Lattice of the injective partitioned functions). Given a set of injective partitioned functions $\Delta(D,L)$, the lattice built over that set is $\langle \Delta(D,L), \cup, \cap, 0, \top \rangle$ where:

$\cup$ is the union as defined in Def. 5.4.10;

$\cap$ is the intersection as defined in Def. 5.4.11;

$0$ is the constant function that maps every element of the partition $\pi$ to $0_L$;

$\top$ is the constant function that maps every element of the partition $\pi$ to $\top_L$.

Def. 5.4.14 defines $\blacklozenge$ (resp. $\lozenge$) to encode functions (resp. decode injective partitioned functions). Note that $\blacklozenge$ (resp. $\lozenge$) is a homomorphism w.r.t. $\cup_{L^D}$ (resp. $\cup$) and that they are inverses of each other. When the parameter applied to the encoding function is clear from the context, we may write $\blacklozenge g$ instead of $\blacklozenge(g)$.

Definition 5.4.14 (Encoding/Decoding of the partitioned injective functions). Let $f,g \in D \to L$ be functions, and $h \in \Delta(D,L)$ be an injective partitioned function. The encoding function $\blacklozenge : (D \to L) \to \Delta(D,L)$, which encodes a function into an injective partitioned function, and the decoding function $\lozenge : \Delta(D,L) \to (D \to L)$ are defined such that:

$\blacklozenge f =$ [equations missing from the source]

The encoding function $\blacklozenge$ relies on the union of injective partitioned functions to ensure the partitioning of the domain as well as injectivity. It follows from the previous definition and Def. 5.4.10 that $\blacklozenge$ and $\lozenge$ are homomorphisms on $D \to L$ and $\Delta(D,L)$ with respect to the union on their respective lattices.
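The interplay between encoding, decoding and union can be tried out on a small example. The sketch below is one plausible reading, not the definition given in the thesis: `encode` groups pre-images by image, `decode` expands the blocks again, and `ipf_union` is written by analogy with the intersection of Def. 5.4.11. All function names are hypothetical, and functions are modelled as dictionaries whose images are `frozenset` values.

```python
def encode(f):
    """Assumed encoding: group the pre-images of f (a dict) by image."""
    blocks = {}
    for x, y in f.items():
        blocks[y] = blocks.get(y, frozenset()) | {x}
    return {block: y for y, block in blocks.items()}

def decode(h):
    """Assumed decoding: expand every block back into individual pairs."""
    return {x: y for block, y in h.items() for x in block}

def fun_union(f, g):
    """Pointwise union on total functions over the same domain."""
    return {x: f[x] | g[x] for x in f}

def ipf_union(h, j):
    """Union on encoded functions, by analogy with Def. 5.4.11 (assumption)."""
    merged = {}
    for X, hx in h.items():
        for Y, jy in j.items():
            if X & Y:
                image = hx | jy
                merged[image] = merged.get(image, frozenset()) | (X & Y)
    return {block: image for image, block in merged.items()}

fs = frozenset
f = {"a": fs({1}), "b": fs({1}), "c": fs({2})}
g = {"a": fs({2}), "b": fs({2}), "c": fs({1})}
h, j = encode(f), encode(g)
```

On this example, encoding the pointwise union of `f` and `g` gives the same single-block structure as taking the union of the two encodings, and a round trip through `encode` and `decode` returns the original function.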

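Proposition 5.4.15 ($\blacklozenge$ and $\lozenge$ are homomorphisms). Let $f,g \in D \to L$ be functions, and $h,j \in \Delta(D,L)$ be injective partitioned functions. The following properties hold:

$$\blacklozenge 0_{L^D} = 0 \quad \text{and} \quad \lozenge 0 = 0_{L^D} \qquad (5.4.3)$$

$$\blacklozenge (f \cup_{L^D} g) = \blacklozenge f \cup \blacklozenge g \qquad (5.4.4)$$

$$\lozenge (h \cup j) = \lozenge h \cup_{L^D} \lozenge j \qquad (5.4.5)$$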

Proof. The proof of Eq. (5.4.3) is direct by Def. 5.4.14. Using Def. 5.4.14 and Def. 5.4.10, the proof of Eq. (5.4.4) and Eq. (5.4.5) is straightforward and detailed in Appendix A.2.1.

For the encoding/decoding of terms to remain consistent, we must prove that whenever we extract a previously encoded set of terms, it remains identical. To prove the bijection, we first prove that $\lozenge \circ \blacklozenge$ and $\blacklozenge \circ \lozenge$ are identity morphisms, and thus that $\blacklozenge$ and $\lozenge$ are isomorphisms.

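Proposition 5.4.16 ($\lozenge \circ \blacklozenge$ is the identity morphism on $(D \to L) \times \Delta(D,L)$). $\forall f \in D \to L,\ \lozenge(\blacklozenge(f)) = f$, and hence $\lozenge \circ \blacklozenge$ is the identity morphism on $(D \to L) \times \Delta(D,L)$.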

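Proof of Proposition 5.4.16. Using the homomorphy of $\blacklozenge$ and $\lozenge$, Appendix A.2.2 proves that $\forall f \in D \to L,\ \lozenge(\blacklozenge(f)) = f$.

Proposition 5.4.17 ($\blacklozenge \circ \lozenge$ is the identity morphism on $\Delta(D,L) \times (D \to L)$). $\forall g \in \Delta(D,L),\ \blacklozenge(\lozenge(g)) = g$, and hence $\blacklozenge \circ \lozenge$ is the identity morphism on $\Delta(D,L) \times (D \to L)$.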

Proof of Proposition 5.4.17. The proof is similar to that of Proposition 5.4.16.

Based on the previous propositions, it is now possible to prove the isomorphy and therefore that the presented approach preserves the canonicity of the function encoding.

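Theorem 5.4.18 ($\blacklozenge$ and $\lozenge$ are isomorphisms). Let $f \in D \to L$ be a function, and $h \in \Delta(D,L)$ be an injective partitioned function. The following properties hold:

$$\lozenge \blacklozenge f = f \qquad (5.4.6)$$

$$\blacklozenge \lozenge h = h \qquad (5.4.7)$$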

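Proof of Theorem 5.4.18. A morphism $f : X \to Y$ is called an isomorphism if there exists a morphism $g : Y \to X$ such that $f \circ g = id_Y$ and $g \circ f = id_X$. Based on Proposition 5.4.16 and Proposition 5.4.17, and given $\blacklozenge : (D \to L) \to \Delta(D,L)$ and $\lozenge : \Delta(D,L) \to (D \to L)$, we have $id_{D \to L} = \lozenge \circ \blacklozenge$ and $id_{\Delta(D,L)} = \blacklozenge \circ \lozenge$. Thus $\blacklozenge$ (resp. $\lozenge$) is an isomorphism from $D \to L$ (resp. $\Delta(D,L)$) to $\Delta(D,L)$ (resp. $D \to L$).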

Corollary 5.4.19 (Unicity of the injective partitioned function). Given a function $f \in D \to L$, there exists a unique function $\blacklozenge f \in \Delta(D,L)$ that satisfies the properties of Def. 5.4.8.

The remainder of this section deals with a special kind of injective partitioned function, the co-domain of which is inductively defined.

Inductive domains. Applied to terms, injective partitioned functions only make it possible to reduce the number of pairs $\langle operator, subterms \rangle$ by grouping the vectors of subterms that have the same operator. One can go a step further and apply this principle inductively to the vector of subterms. Based on the injective partitioned functions, we can formalize and strongly type the Hierarchical Set Decision Diagrams [CTM05]. The main idea is that the domain of such a function is defined upon a Cartesian product of domains $\{D_1, D_2, \dots, D_n\}$ by inductively applying the principle of currying. Let us first define the lattice of the Inductive Injective Partitioned Functions (IIPFs). This helps to support the typing induced by the operators defined in a signature.

Hence, in Section 5.4.3 we are able to define a ΣDD as an IIPF that ranges over the domains (i.e., the sorts) of the signature.

The Inductive Injective Partitioned Functions are inductively defined using a lattice with two elements as the base case. The two elements $\{0, 1\}$ correspond respectively to the terminal 0 and the terminal 1 in the Decision Diagrams world.

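Definition 5.4.19 (Lattice of the IIPFs). Let $\{D_1, D_2, \dots, D_n\}$ be a set of domains. The lattice of the IIPFs is $\langle \Delta_{D_1,\dots,D_n}, \cup, \cap, 0, \top \rangle$; it is composed of the set of the IIPFs $\Delta_{D_1,\dots,D_n}$ with the union ($\cup$) and the identity element $0$, similarly to the injective partitioned functions:

$$\Delta_{D_1,\dots,D_n} = \begin{cases} \Delta(D_1, \{0, 1\}) & \text{if } n = 1 \\ \Delta(D_1, \Delta_{D_2,\dots,D_n}) & \text{if } n > 1 \end{cases} \qquad (5.4.8)$$

where the lattice $\langle \{0, 1\}, \cup, \cap, 0, 1 \rangle$ is the base case.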
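The recursion of Eq. (5.4.8) can be mirrored by a recursive encoder over sets of tuples. The following is a sketch under assumptions, not the encoding used by the thesis: a node is modelled as a `frozenset` of `(block, child)` pairs, the terminal 1 is the string `"1"`, paths leading to the terminal 0 are simply not stored, and the name `encode_tuples` is hypothetical.

```python
ONE = "1"  # terminal 1; paths to terminal 0 are not stored

def encode_tuples(tuples):
    """Encode a set of equal-length tuples as nested (block, child) pairs.

    Heads are curried away one domain at a time; heads whose remaining
    components encode to the same child are grouped into one block.
    """
    tuples = set(tuples)
    if not tuples:
        return frozenset()
    if all(len(t) == 0 for t in tuples):
        return ONE
    by_head = {}
    for head, *tail in tuples:
        by_head.setdefault(head, set()).add(tuple(tail))
    by_child = {}
    for head, tails in by_head.items():
        child = encode_tuples(tails)
        by_child.setdefault(child, set()).add(head)
    return frozenset((frozenset(heads), child)
                     for child, heads in by_child.items())

terms = {("+", 1, 1), ("+", 0, 0), ("-", 1, 1), ("-", 0, 0), ("*", 2, 0)}
iipf = encode_tuples(terms)
```

Because children are compared structurally, `+` and `-` end up in the same block, while equal sub-structures reached from different blocks are equal values and could be shared by an implementation.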

Following the approach of many DDs, from now on we only consider (textual or graphical) representations of functions that lead to 1.

Fig. 5.10 exhibits an inductive injective partitioned function of type $\Delta_{D_1,D_2,D_3}$ that has an inductively defined co-domain ($\Delta_{D_2,D_3}$), where $D_1 = \{+,-,*,/\}$, $D_2 = \{0,1,2\}$, and $D_3 = \{0,1,2\}$. One can remark that although 0 in $D_3$ is the image of both 0 and 2 in $D_2$, they are not grouped together. This is because 0 and 2 belong to two different images (and therefore two different functions) of the partition $\pi$ of $D_1$, which indicates the relation to the previous inductive domains. The partitions $\{+,-\}$ and $\{*\}$ are disjoint and have different images, and therefore 0 and 2 cannot be merged.

As shown in Fig. 5.10, an IIPF is naturally encoded as a DAG and not as a tree. Fig. 5.10b shows the graph representation of the function of Fig. 5.10a. To see the potential sharing, let us imagine that the domain $D_3$ is itself a domain of functions.

[Figure 5.10: IIPF and its graph representation. (b) Graph representation of the IIPF of Fig. 5.10a.]

Remark 5.4.3. Given an IIPF $\blacklozenge f$, we propose the following notation, which enumerates the partitioned sets and links them to their image: $\{a_i \to \langle a_{im}, \dots, a_{in} \rangle, \dots, a_j \to \langle a_{jo}, \dots, a_{jp} \rangle\}$ where $a_i, \dots, a_j \in Dom(\blacklozenge f)$, $a_{im}, \dots, a_{in} \in Dom(\blacklozenge f(a_i))$ and $a_{jo}, \dots, a_{jp} \in Dom(\blacklozenge f(a_j))$. Again, we only consider the part of functions that lead to $\{1\}$, and we therefore omit it in the notation.

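Example 5.4.18. Using the previous notation, Fig. 5.10a is written:

$$\{\{+,-\} \to \langle \{1\} \to \langle \{1\} \rangle, \{0\} \to \langle \{0\} \rangle \rangle,\ \{*\} \to \langle \{2\} \to \langle \{0\} \rangle \rangle\}$$

which represents the following set of tuples: $\{\langle +,1,1 \rangle, \langle +,0,0 \rangle, \langle -,1,1 \rangle, \langle -,0,0 \rangle, \langle *,2,0 \rangle\}$.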
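The correspondence between the nested notation and the set of tuples it denotes can be checked mechanically. The sketch below is an illustration only: it assumes the same modelling as a nested `frozenset` of `(block, child)` pairs with the string `"1"` as terminal, and the name `decode_tuples` is hypothetical.

```python
ONE = "1"  # terminal 1
fs = frozenset

def decode_tuples(node):
    """Expand nested (block, child) pairs into the set of tuples they denote."""
    if node == ONE:
        return {()}
    result = set()
    for block, child in node:
        tails = decode_tuples(child)
        for value in block:          # every value of a block yields its own tuples
            for tail in tails:
                result.add((value,) + tail)
    return result

# The structure of Example 5.4.18: the block {+,-} leads to two arcs,
# the block {*} to a single one.
example = fs({
    (fs({"+", "-"}), fs({(fs({1}), fs({(fs({1}), ONE)})),
                         (fs({0}), fs({(fs({0}), ONE)}))})),
    (fs({"*"}), fs({(fs({2}), fs({(fs({0}), ONE)}))})),
})
terms = decode_tuples(example)
```

Each value of a block contributes its own tuples, so the block `{+,-}` expands to tuples for both operators.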

Composition of two IIPFs is written $\cdot$ and is defined by $\cdot : \Delta_{D_1,\dots,D_n} \times \Delta_{D'_1,\dots,D'_m} \to \Delta_{D_1,\dots,D_n,D'_1,\dots,D'_m}$.

So far we have seen operations on the IIPFs such as union and composition (concatenation). This is not enough to enable the user to handle complex problems such as rewriting. For that purpose, we propose to adapt the SDD inductive homomorphisms to the IIPFs. Again, this redefinition enables the user to leverage strong typing in the formalization of user-defined functions. The behavior of the function follows the inductive pattern of the IIPF domains, and it handles the set aspect transparently (thanks to its homomorphic property).

The following definitions transpose the notion of inductive homomorphism, originally developed in [CTM05, CEPA+02] for the DDD and SDD frameworks, to the IIPF world. Usually, we want to define the behavior of a function $\Phi$ according to the structure of its parameter. The structure is inductively defined, and so should be the homomorphism.

Let us first recall that if $\Phi$ and $\Phi'$ are homomorphisms, so are the union $\Phi \cup_\Phi \Phi'$ and the composition $\Phi \circ_\Phi \Phi'$ of homomorphisms. Let $Id$ be the identity homomorphism on IIPFs such that $\forall \blacklozenge f \in \Delta(D_1,D_2),\ Id(\blacklozenge f) = \blacklozenge f$, and let $\cap_\Phi$ (resp. $\setminus_\Phi$) be the intersection (resp. the difference) of two homomorphisms.

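Definition 5.4.22 (Inductive Homomorphisms on IIPF). Let us define a family of morphisms, written $\Phi_{D_i,\dots,D_n}$, from $\Delta_{D_i,\dots,D_n}$ to $\Delta_{D'_j,\dots,D'_m}$ with $i,j,m,n \in \mathbb{N}$: [equations missing from the source]

This enforces that $\phi_{D_i,a}$ is itself a homomorphism.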

To define an inductive homomorphism on IIPFs, one has to define the behaviors of $\theta_{D_n}$ and $\phi_{D_i}$. By induction on the structure, it is straightforward to prove that $\Phi_{D_i,\dots,D_n}$ is a homomorphism, since it is built only upon compositions and unions of homomorphisms.

Example 5.4.20. The following homomorphism matches tuples of the form $\langle \{+\}, x, y \rangle$, where $x$ stands for any value of the domain $D_2$ and $y$ for any value of the domain $D_3$. Furthermore, it returns an IIPF that maps the encountered values to $x$ and $y$. This homomorphism provides a trivial pattern matcher for terms.

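$\Phi_{D_1,\dots,D_3}$ : [definition missing from the source]

Applied to the IIPF of Fig. 5.10a, it would produce the following result:

$\Phi_{D_1,D_2,D_3}(\blacklozenge f) = \Phi_{D_1,D_2,D_3}(\{+,-\} \to \langle \{1\} \to \langle$ [...]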

Implementation details. Since IIPFs are inductively defined, the worst case is to walk through the complete structure (for instance, for comparison). To overcome this, a solution is to enforce unicity and canonicity. That is, whenever a new structure is created, the engine first checks whether a clone of this structure already exists in a cache. If it does, it discards the newly created structure and uses the cached structure instead. If done recursively, this operation is not very costly and provides unicity, and thus comparison in constant time (hash consing, flyweight pattern). The application of user-defined operations can also be cached. Indeed, as they are inductively defined and their operands are unique and canonical, each step can be cached by saving a pair of pointers $\langle \blacklozenge f, \Phi(\blacklozenge f) \rangle$ (memoization pattern). If the same operation is performed again on the same parameters, the result can be extracted from the cache very efficiently (in constant time). Another very interesting side effect is that the computation of a fixed point becomes very efficient, as it reduces to testing whether a given computation already exists in the cache.