
9.9 Partial Evaluation of an Interpreter

As an extended example of partial evaluation, we apply the techniques described so far to a GDC interpreter for a simple imperative language. The effect is to translate programs written in that language to GDC, and also to partially evaluate them should they be supplied with partial input. As we have mentioned, some early work in partial evaluation, such as that of Ershov [1980], did consider the direct partial evaluation of imperative languages. Later work has tended to go through a declarative language as a meta-language. Ross [1988], for example, uses Prolog.

It is possible, and useful, to define an imperative language using predicate logic, in terms of the values held by variables (the environment) before and after the execution of statements in the language [Hehner et al., 1986]. If the logic used is in the form of a logic program, the definition serves a dual purpose. Firstly, it is a specification of the semantics of the imperative language. Secondly, as it is executable, it also serves as an interpreter for the imperative language.

Here an interpreter for an imperative language written in an actor language is partially evaluated with respect to some program in that imperative language. This gives us an actor program that has the same semantics as the original imperative program. If we partially supply the input data to the imperative program, this data will further specialize the residual program. This is a different approach from that of Ross, who uses a separate translator from an imperative language to a logic program and then uses the full partial evaluation algorithm on the resulting program.

Adding a back-end which converts the final logic program back to the imperative language would complete the process whereby a logic language is used to partially evaluate an imperative language, although this last stage is not covered here (it has been considered in [Huntbach, 1990]).

The present use of an actor language rather than Ross’s Prolog gains several advantages. Firstly, where the sequentiality of a standard imperative language does not affect its semantics, the sequentiality disappears in the concurrent specification. As a result, the conversion process automatically introduces parallelism into the program. This has the potential of reverse engineering [Chikofsky and Cross, 1990] “dusty deck” imperative programs into versions that can exploit the new generation of parallel architectures. Secondly, it extends the techniques to imperative languages that already embody parallel processing.

The idea of using a meta-language as the basis for a program manipulation system is also a part of the Programmer’s Apprentice project [Waters, 1985]. In [Huntbach, 1990] an equivalence was shown between the plan notation of the Programmer’s Apprentice and logic programming. The present method of language translation through partial evaluation is similar to the abstraction and reimplementation method of Waters [1988]. However, whereas the Programmer’s Apprentice uses an ad hoc library of programming clichés to transform its meta-language version of the program, a rather more systematic transformation method is used here.

Statements in an imperative language are specified in terms of a relation between an input environment and an output environment, where an environment is simply a list of name-value pairs. The figure below gives a BNF for a simple imperative language involving just assignments, loops and conditionals. Actual symbols of the language are given in bold type, non-terminal symbols within angled brackets. The BNF is such that programs in the language are also valid terms in an actor language, so no parser is required (it is assumed that, as in Prolog, certain operators, such as the arithmetic operators, are predefined as infix). For simplicity, the BNF suggests that arithmetic expressions may only take a simple <Variable> <ArithOp> <Variable> form, though it would not be difficult to extend it to cover full arithmetic expressions.

Variables are represented by strings beginning with lower case letters and are thus distinct from the channels of the meta-language. This is, of course, essential, since imperative variables may be reassigned values, whereas channels have a single-assignment property. It is also assumed that the language provides simple list handling primitives. There is no type checking.

<Program> ::= <Block>
<Block> ::= []
<Block> ::= [ <Statement> {, <Statement> } ]
<Statement> ::= if(<Condition>,<Block>,<Block>)
<Statement> ::= while(<Condition>,<Block>)
<Statement> ::= <Variable> := <Expression>
<Expression> ::= <Variable> <ArithOp> <Variable>
<Expression> ::= hd(<Variable>)
<Expression> ::= tl(<Variable>)
<Expression> ::= cons(<Variable>,<Variable>)
<Expression> ::= []
<Condition> ::= empty(<Variable>)
<Condition> ::= <Variable> <CompOp> <Variable>

The semantics for the language are given by the GDC interpreter for it. The semantics for an assignment statement give an output environment in which the variable on the left-hand side of the assignment is linked with the result obtained from evaluating the expression on the right-hand side of the assignment in the input environment, with all other elements of the environment unchanged. A conditional statement has the semantics of either of its two substatements, depending on how its condition evaluates in the input environment. The semantics for a while statement involve the standard logic programming conversion of iteration to recursion. Sequencing is achieved by taking the output environment of the first statement as the input for the rest. In the GDC interpreter below, code for arithmetic operators and comparison operators other than plus and greater than is not given, but will follow the same pattern as plus and greater than. It would be possible to extend the interpreter considerably to give a more realistic imperative language, or to add other desired features. In [Huntbach, 1991], for example, a version of it is given which includes Occam-style guarded commands.

block(InEnv,[],OutEnv) :- OutEnv=InEnv.
block(InEnv,[Stat|Statements],OutEnv)
    :- statement(InEnv,Stat,MidEnv),
       block(MidEnv,Statements,OutEnv).

statement(InEnv,Var:=Expr,OutEnv)
    :- evalinenv(InEnv,Expr,Val),
       replace(Var,Val,InEnv,OutEnv).
statement(InEnv,while(Cond,Block),OutEnv)
    :- evalinenv(InEnv,Cond,TruthVal),
       loop(TruthVal,InEnv,Cond,Block,OutEnv).
statement(InEnv,if(Cond,Block1,Block2),OutEnv)
    :- evalinenv(InEnv,Cond,TruthVal),
       switch(TruthVal,InEnv,Block1,Block2,OutEnv).

switch(true,InEnv,Block1,Block2,OutEnv) :- block(InEnv,Block1,OutEnv).
switch(false,InEnv,Block1,Block2,OutEnv) :- block(InEnv,Block2,OutEnv).

loop(false,InEnv,Cond,Block,OutEnv) :- OutEnv=InEnv.
loop(true,InEnv,Cond,Block,OutEnv)
    :- block(InEnv,Block,MidEnv),
       evalinenv(MidEnv,Cond,TruthVal),
       loop(TruthVal,MidEnv,Cond,Block,OutEnv).

evalinenv(Env,N,V) :- integer(N) | V:=N.
evalinenv(Env,N,V) :- string(N) | lookup(N,Env,V).
evalinenv(Env,E1+E2,V)
    :- evalinenv(Env,E1,V1), evalinenv(Env,E2,V2), V:=V1+V2.
evalinenv(Env,E1>E2,V)
    :- evalinenv(Env,E1,V1), evalinenv(Env,E2,V2), gt(V1,V2,V).
evalinenv(Env,not(E),V)
    :- evalinenv(Env,E,V1), negate(V1,V).
evalinenv(Env,hd(E),V)
    :- evalinenv(Env,E,V1), head(V1,V).
evalinenv(Env,tl(E),V)
    :- evalinenv(Env,E,V1), tail(V1,V).
evalinenv(Env,cons(E1,E2),V)
    :- evalinenv(Env,E1,V1), evalinenv(Env,E2,V2), cons(V1,V2,V).
evalinenv(Env,empty(E),V)
    :- evalinenv(Env,E,V1), null(V1,V).

gt(V1,V2,V) :- V1>V2 | V=true.
gt(V1,V2,V) :- V1=<V2 | V=false.

negate(true,V) :- V=false.
negate(false,V) :- V=true.

cons(H,T,L) :- L=[H|T].

head([H|T],X) :- X=H.

tail([H|T],X) :- X=T.

null([],V) :- V=true.
null([H|T],V) :- V=false.

lookup(Var,[(Var1,Val1)|Env],Val) :- Var=Var1
    | Val=Val1.
lookup(Var,[(Var1,Val1)|Env],Val) :- Var=\=Var1
    | lookup(Var,Env,Val).

replace(Var,Val,[(Var1,Val1)|Env1],Env) :- Var=\=Var1
    | replace(Var,Val,Env1,Env2), Env=[(Var1,Val1)|Env2].
replace(Var,Val,[(Var1,Val1)|Env1],Env) :- Var==Var1
    | Env=[(Var,Val)|Env1].
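For readers who wish to experiment with the semantics just given, the following is a minimal sequential sketch of the interpreter in Python. It is not part of the GDC development: the tuple encoding of statements (`("assign", …)`, `("while", …)`, `("if", …)`), the prefix encoding of expressions and the function names are all assumptions made for illustration, Python lists stand in for GDC lists, and the recursion in `loop` is written as an ordinary `while`. There is no suspension, so every variable must already have a value.

```python
# Hypothetical Python analogue of the GDC interpreter (sequential, no suspension).

def lookup(var, env):
    """Return the value paired with var in a list of (name, value) pairs."""
    for name, val in env:
        if name == var:
            return val
    raise KeyError(var)

def replace(var, val, env):
    """Return a new environment in which var is paired with val."""
    if not env:
        raise KeyError(var)  # the GDC version has no behavior for this case
    (name, old), rest = env[0], env[1:]
    if name == var:
        return [(var, val)] + rest
    return [(name, old)] + replace(var, val, rest)

def eval_in_env(env, expr):
    """Evaluate an integer, a variable name or an (op, arg, ...) tuple."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return lookup(expr, env)
    op, *args = expr
    vals = [eval_in_env(env, a) for a in args]
    if op == "+":
        return vals[0] + vals[1]
    if op == ">":
        return vals[0] > vals[1]
    if op == "not":
        return not vals[0]
    if op == "hd":
        return vals[0][0]
    if op == "tl":
        return vals[0][1:]
    if op == "cons":
        return [vals[0]] + vals[1]
    if op == "empty":
        return vals[0] == []
    raise ValueError(op)

def statement(env, stat):
    """Map an input environment to an output environment for one statement."""
    kind = stat[0]
    if kind == "assign":
        _, var, expr = stat
        return replace(var, eval_in_env(env, expr), env)
    if kind == "if":
        _, cond, then_block, else_block = stat
        return block(env, then_block if eval_in_env(env, cond) else else_block)
    if kind == "while":
        _, cond, body = stat
        while eval_in_env(env, cond):
            env = block(env, body)
        return env
    raise ValueError(kind)

def block(env, stats):
    """Thread the environment through the statements in order."""
    for stat in stats:
        env = statement(env, stat)
    return env

def reverse(lst):
    """Run the list reversal program used as the example in this section."""
    program = [("while", ("not", ("empty", "in")),
                [("assign", "acc", ("cons", ("hd", "in"), "acc")),
                 ("assign", "in", ("tl", "in"))])]
    return lookup("acc", block([("in", lst), ("acc", [])], program))
```

Unlike the GDC version, this sketch fixes the order of evaluation, so it illustrates only the input/output relation on environments and none of the concurrency.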

As an example of partial evaluation of this interpreter, consider a simple list reversal program whose loop builds up its result in an accumulator. The imperative code is:

while(not(empty(in)), [acc:=cons(hd(in),acc), in:=tl(in)])

with in the input list, acc initially set to the empty list and the final value of acc returned as output. This can be achieved by adding the behavior:

reverse(L,R)
    :- block([(in,L),(acc,[])],
             [while(not(empty(in)), [acc:=cons(hd(in),acc), in:=tl(in)])],
             Env),
       lookup(acc,Env,R).

to the interpreter and then partially evaluating the actor reverse(L, R).

The actor will unfold to the set of actors:

:- null(L,V), negate(V,V1),

loop(V1,[(in,L),(acc,[])],not(empty(in)), [acc:=cons(hd(in),acc),in:=tl(in)],Env), lookup(acc,Env,R).

The null actor is suspended because it cannot rewrite further while L is unbound. The negate actor is suspended as V is unbound. The loop actor is suspended as V1 is unbound and the lookup actor is suspended as Env is unbound. The loop actor can be specialized, but as it is dependent on the negate actor we leave it for fusion. At this stage the null actor and negate actor are fused to remove the variable V. This gives the new actor null_negate(L,V1), with behaviors (before the partial evaluation of their bodies):

null_negate([],V1) :- V=true, negate(V,V1).

null_negate([H|T],V1) :- V=false, negate(V,V1).

In both cases, as the channel V does not occur in the head of the behavior, we can go ahead with sending the message and then unfold the negate actor, giving:

null_negate([],V1) :- V1=false.
null_negate([H|T],V1) :- V1=true.
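The effect of this fusion can be checked on ground data with a small Python sketch. The function names mirror the actors, but the encoding is an assumption: Python lists stand in for GDC lists and Booleans for the atoms true and false. The fused function performs a single case analysis on the list and never constructs the intermediate truth value that V carried.

```python
# Hypothetical ground-data analogue of fusing null and negate.

def null(lst):
    """true for the empty list, false otherwise."""
    return lst == []

def negate(truth):
    """Swap true and false."""
    return not truth

def null_negate(lst):
    """Fused version: the intermediate channel V has disappeared."""
    if lst == []:
        return False
    return True
```

On any list, `null_negate(lst)` agrees with `negate(null(lst))`.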

The next stage fuses null_negate(L,V1) with the loop actor. This will give the actor null_negate_loop(L,Env) with behaviors, before partial evaluation of their bodies:

null_negate_loop([],Env) :- V1=false,
    loop(V1,[(in,[]),(acc,[])],not(empty(in)),
         [acc:=cons(hd(in),acc),in:=tl(in)],Env).
null_negate_loop([H|T],Env) :- V1=true,
    loop(V1,[(in,[H|T]),(acc,[])],not(empty(in)),
         [acc:=cons(hd(in),acc),in:=tl(in)],Env).

Note how the L in the loop actor has been bound in the first behavior to [] and in the second to [H|T], as this channel was shared with the null_negate actor and the arguments with which it matches in the behaviors for null_negate are so bound. As with the formation of null_negate, the assignments within the bodies may take place, and this enables the loop actors within the bodies to be unfolded until the point is reached where they have unfolded to actors which are all suspended:

null_negate_loop([],Env) :- Env=[(in,[]),(acc,[])].
null_negate_loop([H|T],Env)
    :- null(T,V),
       negate(V,V1),
       loop(V1,[(in,T),(acc,[H])],not(empty(in)),
            [acc:=cons(hd(in),acc),in:=tl(in)],Env).

Note in the second behavior that the lookup and replace actors altering the environments are completely evaluated in the unfolding. This gives an environment in which in is paired with the tail of the initial list and acc is paired with a list of one element, the head of the initial list.
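The way one pass through the loop body is evaluated away, even though H and T are unknown, can be imitated in Python by computing with symbolic values. In this sketch, which is an illustration only, unbound channels are represented by the strings `"H"` and `"T"`, a list cell by a tuple `("cons", head, tail)` and the empty list by the marker `NIL`; all of these encodings are assumptions. Taking the head or tail of a value whose outer constructor is not yet known raises an error, which plays the role of suspension.

```python
# Hypothetical symbolic evaluation of one pass through the loop body.

NIL = "[]"  # marker for the empty list

def cons(h, t):
    return ("cons", h, t)

def head(v):
    if not (isinstance(v, tuple) and v[0] == "cons"):
        raise ValueError("suspended: list structure not yet known")
    return v[1]

def tail(v):
    if not (isinstance(v, tuple) and v[0] == "cons"):
        raise ValueError("suspended: list structure not yet known")
    return v[2]

def lookup(var, env):
    for name, val in env:
        if name == var:
            return val
    raise KeyError(var)

def replace(var, val, env):
    if not env:
        raise KeyError(var)
    (name, old), rest = env[0], env[1:]
    if name == var:
        return [(var, val)] + rest
    return [(name, old)] + replace(var, val, rest)

# Environment on entry to the loop body when the input list is [H|T].
env0 = [("in", cons("H", "T")), ("acc", NIL)]

# acc := cons(hd(in), acc)
env1 = replace("acc", cons(head(lookup("in", env0)), lookup("acc", env0)), env0)

# in := tl(in)
env2 = replace("in", tail(lookup("in", env1)), env1)
```

After both assignments, `env2` pairs `in` with `"T"` and `acc` with the one-element list `("cons", "H", "[]")`. No lookup or replace remains to be done at run time; only the test on `"T"` cannot proceed.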

At this point, null(T,V) and negate(V,V1) are fused in the second behavior and this is found to be equivalent in all but channel names to the previous fusion of a null and negate actor, so it becomes null_negate(T,V1) with no further definition of behaviors. When null_negate(T,V1) is fused with the loop actor, however, a divergence is found. It is equivalent to the previous fusion of a null_negate with a loop actor, except that the channel T occurs in the place of the previous L and in the place of [] there is now [H]. This latter condition means that the previous fusion is abandoned as an over-specialization. In place of null_negate_loop(L,Env), we put:

null_negate_loop1(L,[],Env)

where the behaviors for null_negate_loop1(L,A,Env) are obtained from the fusion of null_negate(L,V1) and

loop(V1, [(in,L),(acc,A)], not(empty(in)), [acc:=cons(hd(in),acc), in:=tl(in)], Env)

That is, we have abstracted out the specific value of the accumulator.
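The text does not spell out here how the generalized actor is computed. One standard technique that yields the same result on this example is anti-unification (computing a most specific generalization of two terms), sketched below in Python. The term encoding, with compound terms as tuples `(functor, arg, ...)` and everything else as strings, and the generated names `_G0`, `_G1` are assumptions for illustration, and the sketch should not be read as the algorithm used in this chapter.

```python
# Hypothetical anti-unification of the two loop environments.

def generalize(s, t, table):
    """Return a most specific generalization of terms s and t.

    table maps each pair of differing subterms to the fresh variable
    that replaces it, so a repeated difference reuses the same variable.
    """
    if s == t:
        return s
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and len(s) == len(t) and s[0] == t[0]):
        args = tuple(generalize(a, b, table) for a, b in zip(s[1:], t[1:]))
        return (s[0],) + args
    if (s, t) not in table:
        table[(s, t)] = "_G%d" % len(table)
    return table[(s, t)]

# [(in,L),(acc,[])] from the first fusion
first = ("env", ("pair", "in", "L"), ("pair", "acc", "nil"))
# [(in,T),(acc,[H])] from the second fusion
second = ("env", ("pair", "in", "T"), ("pair", "acc", ("cons", "H", "nil")))

general = generalize(first, second, {})
```

Here `general` has a fresh variable both where L and T differed and where [] and [H] differed, which corresponds to the environment [(in,L),(acc,A)] up to renaming.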

The behaviors for null_negate_loop1, following unfolding of the actors in their bodies but not actor fusion in the bodies are:

null_negate_loop1([],A,Env) :- Env=[(in,[]),(acc,A)].
null_negate_loop1([H|T],A,Env)
    :- null(T,V),
       negate(V,V1),
       loop(V1,[(in,T),(acc,[H|A])],not(empty(in)),
            [acc:=cons(hd(in),acc),in:=tl(in)],Env).

As previously, null(T,V) and negate(V,V1) will fuse to give null_negate(T,V1) but this time when this is fused with the loop actor, it will be detected as a version of the previous fusion to form null_negate_loop1 with [H|A] matching against A. A channel matching against a tuple in this way does not require further generalization so the fusion becomes the recursive call:

:- null_negate_loop1(T,[H|A],Env)

with no further need to define behaviors.

So the position we are now in is that we have top level actors:

:- null_negate_loop1(L,[],Env), lookup(acc, Env, R)

with the behaviors for null_negate_loop1:

null_negate_loop1([],A,Env) :- Env=[(in,[]),(acc,A)].

null_negate_loop1([H|T],A,Env)
    :- null_negate_loop1(T,[H|A],Env).

At the top level we can now fuse the two actors, removing the variable Env. This leaves us with the single actor null_negate_loop1_lookup(L,[],R) with behaviors, initially:

null_negate_loop1_lookup([],A,R)
    :- Env=[(in,[]),(acc,A)], lookup(acc,Env,R).
null_negate_loop1_lookup([H|T],A,R)
    :- null_negate_loop1(T,[H|A],Env),
       lookup(acc,Env,R).

In the first of these behaviors, the assignment can be executed and:

lookup(acc,[(in,[]),(acc,A)],R)

unfolds completely to R=A. In the second behavior, the fusion of null_negate_loop1(T,[H|A],Env) and lookup(acc,Env,R) matches the previous fusion and so becomes the actor:

null_negate_loop1_lookup(T,[H|A],R).

This leaves null_negate_loop1_lookup(L,[],R) as the residual actor with behaviors:

null_negate_loop1_lookup([],A,R) :- R=A.

null_negate_loop1_lookup([H|T],A,R)

:- null_negate_loop1_lookup(T,[H|A],R).

It can be seen that we have the standard logic program for reverse with an accumulator. The interesting point is that the handling of the environment took place entirely within the partial evaluation. Partial evaluation has had the effect of compiling away the overhead associated with managing an environment.
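For comparison, the two residual behaviors transcribe directly into the familiar accumulator-passing function. The Python below is such a transcription, given only as an illustration: Python lists stand in for GDC lists and the result is returned rather than sent on the channel R. No environment appears anywhere in it.

```python
# Hypothetical Python transcription of the residual program.

def null_negate_loop1_lookup(lst, acc):
    """Reverse lst onto acc, one behavior per branch."""
    if not lst:                      # first behavior: R = A
        return acc
    h, t = lst[0], lst[1:]           # second behavior: recurse on T with [H|A]
    return null_negate_loop1_lookup(t, [h] + acc)

def reverse(lst):
    """Top-level actor: start with an empty accumulator."""
    return null_negate_loop1_lookup(lst, [])
```

Because Python limits recursion depth, a production version would use a loop; the recursive form is kept here to match the behaviors one for one.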

Also note that the interpreter for the imperative language detects any implicit potential parallelism and converts it to the real parallelism of GDC. This implicit parallelism respects the data dependency of the sequencing of statements giving parallelism only where it has no effect on the result of the execution. Consider the execution of the assignments x:=exp1, y:=exp2, z:=exp3 where exp1, exp2 and exp3 are arbitrary expressions. The call

:- block(Env1,[x:=exp1,y:=exp2,z:=exp3],OutEnv)

will unfold to:

:- V1 = evaluation of exp1 in Env1,
   V2 = evaluation of exp2 in Env2,
   V3 = evaluation of exp3 in Env3,
   replace(x,V1,Env1,Env2),
   replace(y,V2,Env2,Env3),
   replace(z,V3,Env3,OutEnv)

Suppose the initial environment (Env1) is:

[(x,1),(y,2),(z,3),(a,4),(b,5),(c,6)]

then the calls to replace will unfold giving:

:- V1 = evaluation of exp1 in Env1,
   V2 = evaluation of exp2 in Env2,
   V3 = evaluation of exp3 in Env3,
   Env2 = [(x,V1),(y,2),(z,3),(a,4),(b,5),(c,6)],
   Env3 = [(x,V1),(y,V2),(z,3),(a,4),(b,5),(c,6)],
   OutEnv = [(x,V1),(y,V2),(z,V3),(a,4),(b,5),(c,6)]

If exp1, exp2 and exp3 contain references to a, b and c only, they may be evaluated in parallel. If, however, exp2 contains references to x, it will use V1 for x and its evaluation will be suspended until V1 is bound by the evaluation of exp1. Similarly, evaluation of exp3 will be suspended if it contains references to x or y. In contrast, a purely parallel execution of the assignments which does not respect data dependency could have an indeterminate effect if exp2 contains references to x, or exp3 contains references to x or y. The effect is as if the assignments were carried out in any order.

If the assignments were [x:=y,y:=9,z:=2*x+y], the result would be that x is linked with either 2 or 9 and z with any of 4, 6, 11, 13 or 27.
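The outcomes of such an unconstrained execution can be explored with a short Python sketch. It assumes that each assignment is atomic, so that "any order" means any permutation of the three assignments, and it starts from the initial environment given above. The dictionary representation and the names `INITIAL`, `ASSIGNMENTS` and `run` are introduced here for illustration only.

```python
# Hypothetical enumeration of all orders of three atomic assignments.
from itertools import permutations

INITIAL = {"x": 1, "y": 2, "z": 3, "a": 4, "b": 5, "c": 6}

# Each assignment is a target variable and a function of the environment.
ASSIGNMENTS = [
    ("x", lambda env: env["y"]),                 # x := y
    ("y", lambda env: 9),                        # y := 9
    ("z", lambda env: 2 * env["x"] + env["y"]),  # z := 2*x+y
]

def run(order):
    """Execute the assignments in the given order; return final (x, y, z)."""
    env = dict(INITIAL)
    for var, expr in order:
        env[var] = expr(env)
    return env["x"], env["y"], env["z"]

# The order written in the program, which the interpreter's data
# dependencies preserve.
sequential = run(ASSIGNMENTS)

# Every outcome reachable when the order is left unconstrained.
outcomes = {run(order) for order in permutations(ASSIGNMENTS)}
```

Here `sequential` is `(2, 9, 13)`, the single result that the data-dependency-respecting unfolding allows, whereas `outcomes` contains several different triples.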

The interpreter presented retains those elements of the sequential execution which are a necessary part of the semantics, but not those which are not, hence the process of partial evaluation into actors may also be used as an automated paralleliser. We have separated out what has been termed the kernel from the semantically unnecessary control [Pratt, 1987].

M.M. Huntbach, G.A. Ringwood: Agent-Oriented Programming, LNAI 1630, pp. 247–278, 1999.

© Springer-Verlag Berlin Heidelberg 1999