

9.2 Futamura Projections

The Futamura projections are best explained diagrammatically. In the transformational view, a program is a black box (Figure 9.2.1) into which input enters and from which output exits:

Fig. 9.2.1 Simple black-box view of a program

This can be seen as an abstraction of a system in which a program runs on a machine.

If the machine, MachP, is made explicit, the program becomes an input to the machine, which executes code written in the language P. In Figure 9.2.2, dotted lines indicate how the view above is obtained by seeing the program and machine as a single system.

This is still an abstraction. In practice, a machine directly executes a different language from the one a program is written in. Call the machine language M and the machine that executes it MachM. A compiler is a program that takes a program written in P as input and returns a program in M as output. For any input, the compiled program running on MachM gives the same output as the original program would if run directly on MachP. Alternatively, an interpreter is a program, written in M, that takes two inputs, a program written in P and an input to that program, and returns the same output that would be given if the program were run on MachP. These two approaches are shown in Figure 9.2.3. The dotted lines show a way of grouping the machine and the interpreter or compiler together so that the programmer need not be aware whether the program is interpreted or compiled.

The compiler itself could be broken down into a compilation program running on a machine, but this step is not needed here. The advantage of using a compiler over using an interpreter is that it makes explicit the intermediate stage of the compiled program. Once made explicit, the compiled program can be stored and reused with other inputs. Some of the work that the interpreter would have to do each time the program is run is done once and for all when the compiled version is produced. This is illustrated in Figure 9.2.4.
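To make the two routes concrete, the following sketch (in Python, purely for illustration; the toy instruction language and all the names in it are assumptions of this sketch, not part of the original text) gives both an interpreter and a compiler for the same tiny language P. The compiler does its translation work once; the compiled program can then be reused on any number of inputs.

# A toy "P" program: a list of instructions applied to one numeric input.
prog = [("add", 4), ("mul", 3)]            # computes (In + 4) * 3

def interpreter(p, inp):
    # interpretation: walk the P program every time it is run
    acc = inp
    for op, k in p:
        acc = acc + k if op == "add" else acc * k
    return acc

def compiler(p):
    # compilation: translate the P program into "M" (here Python source) once
    lines = ["def compiled(inp):", "    acc = inp"]
    for op, k in p:
        lines.append("    acc = acc %s %d" % ("+" if op == "add" else "*", k))
    lines.append("    return acc")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled"]

compiled_prog = compiler(prog)             # the explicit intermediate stage
for inp in (5, 6, 7):                      # repeated runs reuse the compiled version
    assert compiled_prog(inp) == interpreter(prog, inp)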

In diagrammatic notation, partial evaluation may be defined as the equivalence between the two diagrams in Figure 9.2.5, where Peval is the partial evaluator program (which again may be broken down into program and machine).

Fig. 9.2.2 Program and machine as a single system

Fig. 9.2.3 Compilation and interpretation

Fig. 9.2.4 Repeated runs of a compiled program

Fig. 9.2.5 Definition of partial evaluation

This holds for any value of Prog, the n inputs In1 to Inn and the appropriate output Out for Prog with the given inputs. The advantage is that the partially evaluated program output from Peval may be saved and used again. In particular, if we set the inputs In1 to Ink to static values a1 to ak, partial evaluation separates out an intermediate form of the input program, described as “Prog partially evaluated with respect to In1=a1, …, Ink=ak”, in which those elements of the computation specific to these static values are executed once and for all. That is, the intermediate form is a program written in P which requires n-k inputs and runs on MachP, giving the same results with inputs bk+1, …, bn as Prog would give if it were run on MachP with inputs a1, …, ak, bk+1, …, bn.

Fig. 9.2.6 The use of partial evaluation
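As a concrete illustration (invented for this sketch, not from the original text), the Python fragment below takes a two-input program power(x, n) and specializes it with respect to a static exponent n. The work that depends only on the static input, unrolling the loop, is done once when the residual program is produced; the residual program then needs only the remaining dynamic input.

def power(x, n):
    # the original program: n plays the role of the static input, x the dynamic one
    result = 1
    for _ in range(n):
        result *= x
    return result

def specialize_power(n):
    # "power partially evaluated with respect to n": the loop over the static
    # input is unrolled here, once and for all, leaving a residual program
    # that needs only the dynamic input x
    body = " * ".join(["x"] * n) if n > 0 else "1"
    source = "def power_n(x):\n    return " + body + "\n"
    namespace = {}
    exec(source, namespace)                # build the residual program from its source
    return namespace["power_n"], source

power_3, residual = specialize_power(3)
assert power_3(5) == power(5, 3) == 125    # same result, given the remaining input
print(residual)                            # shows the residual source: return x * x * x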

Kleene [1952] shows, in his s-m-n theorem, that such a partial function can always be found. It is assumed here that Peval is a P to P transformer and that we have a MachP machine to run P programs on. We can always substitute a compiler or interpreter and a MachM as above for MachP where necessary.
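In a language with higher-order functions, the s-m-n construction can be mimicked directly by partial application. The short Python sketch below (with invented names) fixes the first of three inputs, leaving a two-input program.

from functools import partial

def prog(a, b, c):                  # a program with n = 3 inputs
    return a * 100 + b * 10 + c

prog_a7 = partial(prog, 7)          # "prog partially evaluated with respect to a = 7"
assert prog_a7(8, 9) == prog(7, 8, 9) == 789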

The first Futamura projection refers to partially evaluating an interpreter with respect to a particular program. Note that in this case the partial evaluator must be an M to M transformer. Otherwise this is just a special case of the more general partial evaluation above with n=2: the interpreter is the program being partially evaluated and In1 is set to the program being interpreted, Prog. The input to Prog is left dynamic. The result of the partial evaluation is a program written in M which, when given input In, runs on machine MachM and gives the same output as Prog gives when run on MachP with input In. In other words, the combination of partial evaluator and interpreter can be regarded as a compiler, taking the program Prog written in P and returning a program which has the same behavior but is written in M; Figure 9.2.7 illustrates this. On the left-hand side of the figure the original use of the interpreter to run a P program on MachM is shown. The middle of the figure shows the use of partial evaluation to give a version of the interpreter specialized to the particular P program. The right-hand side of the figure shows the use of a compiler, with the dotted lines indicating the equivalence.

Fig. 9.2.7 The First Futamura Projection
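The following Python sketch (again not from the original text) illustrates the first projection using the toy instruction language from the earlier compiler/interpreter sketch. Here functools.partial stands in for Peval: it is a deliberately trivial partial evaluator that merely captures the static argument in a closure, so it demonstrates the input-output behaviour of the projection without performing any of the optimization a real partial evaluator would.

from functools import partial

prog = [("add", 4), ("mul", 3)]            # a toy "P" program: (In + 4) * 3

def interpreter(p, inp):                   # an interpreter for P, written in "M"
    acc = inp
    for op, k in p:
        acc = acc + k if op == "add" else acc * k
    return acc

mix = partial                              # trivial stand-in for Peval

# First projection: specializing the interpreter with respect to prog gives a
# one-input program playing the role of "Prog compiled from P to M".
target = mix(interpreter, prog)
assert target(5) == interpreter(prog, 5) == 27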

The second Futamura projection recognizes that Peval itself may be regarded as a program running on MachM, rather than as the combined machine-program system it has been regarded as up till now. Viewed as a program, Peval takes a program and some partial input and produces a specialized program. When specializing an interpreter with respect to a program, the two inputs are the interpreter and the program, and the output is the compiled program (Figure 9.2.8).

Fig. 9.2.8 Inputs of the Second Futamura Projection

The second Futamura projection is the use of the partial evaluator on itself, to give a version of the partial evaluator specialized with respect to a particular interpreter, one that runs P programs on M. In Figure 9.2.9, a distinction is made between the angled-box Peval, which is the partial evaluator program itself, and the rounded-angled-box Peval, which can be seen as a simplification standing for the combination of the program Peval and the machine MachM.

Fig. 9.2.9 The Second Futamura Projection

The combination of partial evaluator applied to itself, enclosed within dotted lines in Figure 9.2.9, can be seen to take an interpreter for P written in M and produce a program which, when applied to some program Prog written in P, runs on MachM and produces a version of Prog compiled from P to M. In other words, a partial evaluator applied to itself gives a program which takes an interpreter for P written in M and produces a P to M compiler. Since this will work for any interpreter, P can be considered a variable. The self-application of a partial evaluator may therefore be considered a compiler generator, generating an equivalent compiler given an interpreter.

The third Futamura projection takes the second projection one stage further, expanding the program-machine Peval combination and then bringing in a further Peval, so that partially evaluating a partial evaluator with respect to itself gives an explicit compiler generator program. The complete system incorporating the three Futamura projections is shown in Figure 9.2.10.
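Continuing the same illustrative sketch (again with functools.partial as a trivial stand-in for Peval, so the "compiled" programs are simply closures), the second and third projections become one-line definitions whose defining equations can be checked directly.

from functools import partial

def interpreter(p, inp):                   # same toy interpreter as before
    acc = inp
    for op, k in p:
        acc = acc + k if op == "add" else acc * k
    return acc

prog = [("add", 4), ("mul", 3)]
mix = partial                              # trivial stand-in for Peval

compiler = mix(mix, interpreter)           # second projection: Peval specialized to the interpreter
assert compiler(prog)(5) == interpreter(prog, 5)    # compiler(prog) acts as the compiled Prog

cogen = mix(mix, mix)                      # third projection: a compiler generator
assert cogen(interpreter)(prog)(5) == interpreter(prog, 5)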

In order to build such a system, a partial evaluator is required which is sufficiently powerful to be able to be applied to itself. The first such self-applicable partial evaluator was constructed by Jones, Sestoft and Søndergaard [1985]. The problem with self-applicable partial evaluators is that every aspect of the language that is used to program the partial evaluator must also be capable of being effectively treated in the partial evaluation. This is a particular difficulty in languages like Lisp and Prolog, which add to their basic declarative framework a lot of non-declarative features which require special handling during partial evaluation. One solution is to build a self-applicable partial evaluator in the declarative subset of the full language and then use a meta-interpreter to mix in the features of the full language.

Fig. 9.2.10 The Third Futamura Projection

9.3 Supercompilation

The informal method of partial evaluation described above was formalized under the name supercompilation by another Russian researcher, Turchin [1986]. Turchin’s idea was that a program execution could be described as a graph of machine configurations. An arc between nodes represents each possible step in a program’s execution. In normal execution of a program, the configurations are ground and the graph is simply a chain from the initial configuration to the final one. In a partial execution, at any point where a program could transform to a number of states depending on some dynamic information, child states for each possibility are created.

The arcs leading to these child states are labeled with the values of the dynamic data required for execution to follow each arc.

The above process is described as driving and would result in forming a decision tree.

If the decision tree were finite, a program could be reconstructed from it. The program would amount to a partial evaluation of the original program, specialized for those values of data that were initially provided. In most cases, however, the decision tree would be infinite. Supercompilation is described as a specialized form of driving in which, to prevent the attempted construction of infinite trees, loops in the graph are allowed. In particular, an arc to an existing configuration may be constructed if a configuration is found to be identical to another one in all but variable names. It can also be a specialization of it, that is, identical to the other one in all but variable names or in having particular values at points that are variable in the other. This means that arcs have to be labeled not only with the conditions necessary for execution to take a particular arc, but also with the values to which any variables in the destination configuration would be bound if execution went down that route. To ensure against the construction of infinite graphs, supercompilation also involves the concept of generalization.
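The renaming and specialization tests just described can be made precise with a small sketch. The representation below is an assumption made for illustration (configurations as nested tuples, with strings beginning with an upper-case letter standing for variables); it is not Turchin's notation. match answers whether one configuration covers another and, if so, returns the bindings that would label the looping arc; variant answers whether two configurations are identical in all but variable names.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def match(general, specific, bindings=None):
    # return bindings for general's variables making it equal to specific, or None
    if bindings is None:
        bindings = {}
    if is_var(general):
        if general in bindings:
            return bindings if bindings[general] == specific else None
        bindings[general] = specific
        return bindings
    if isinstance(general, tuple) and isinstance(specific, tuple) \
            and len(general) == len(specific):
        for g, s in zip(general, specific):
            if match(g, s, bindings) is None:
                return None
        return bindings
    return bindings if general == specific else None

def variant(a, b):
    # identical in all but variable names: each matches the other by a renaming
    ab, ba = match(a, b), match(b, a)
    return (ab is not None and ba is not None
            and all(is_var(v) for v in ab.values())
            and len(set(ab.values())) == len(ab))

old = ("state2", "In", "Out", "X", "Stack")
new = ("state2", "In1", "Out1", "X1", "Stack1")             # renaming only
spec = ("state2", "In1", ("cons", 3, "Out1"), 5, "Stack1")  # a specialization of old
assert variant(old, new)
assert match(old, spec) is not None and not variant(old, spec)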

Generalization occurs when two configurations are found to be “nearly identical”. When this occurs, both are combined into one more general configuration, with the values where they differ replaced by a variable and the arcs leading into it carrying bindings for this variable depending on which original configuration they led to.

For example, the graph in Figure 9.3.1 represents a version of insertion sort.

Fig. 9.3.1 Graph of insertion sort

Here, the machine configurations are labeled with the sets of variables that represent the dynamic data of those configurations. Each configuration would also have static data representing the machine state, but this data can be discarded in the final diagram, leaving something similar to a finite state machine. The conditions on the arcs are labeled by comparison operators and equalities, the binding of variables by the := operator. The initial configuration has the single variable In, the final configuration the single variable Out. From this graph, a simple imperative program could be reconstructed:

0: read(In); Out:=[]; goto 1;
1: if In=[] then goto 5
   else X:=hd(In); In:=tl(In); Stack:=[]; goto 2 endif
2: if Out=[] then goto 3
   else if X<hd(Out) then goto 3
   else Stack:=cons(hd(Out),Stack); Out:=tl(Out); goto 2 endif
3: Out:=cons(X,Out); goto 4;
4: if Stack=[] then goto 1
   else Out:=cons(hd(Stack),Out); Stack:=tl(Stack); goto 4 endif
5: write(Out)

However, it would also be possible to construct a GDC program, having a separate actor for each state:

state0(In) :- state1(In,[]).
state1([],Out) :- state5(Out).
state1([X|In],Out) :- state2(In,Out,X,[]).
state2(In,[],X,Stack) :- state3(In,[],X,Stack).
state2(In,[H|Out],X,Stack) :- X<H | state3(In,[H|Out],X,Stack).
state2(In,[H|Out],X,Stack) :- X>=H | state2(In,Out,X,[H|Stack]).
state3(In,Out,X,Stack) :- state4(In,[X|Out],Stack).
state4(In,Out,[]) :- state1(In,Out).
state4(In,Out,[H|Stack]) :- state4(In,[H|Out],Stack).
state5(Out) :- write(Out).

Note that the bodies of the clauses all have just one actor, making the program what is called a binary program. However, it can be shown that non-binary logic programs can be transformed to binary logic programs [Demoen, 1992].

Turchin’s supercompilation can be explained as manipulations of the graphical notation. Firstly, if the static information of two states is identical, they can be merged into a single state, with changes in variable names on the incoming arcs as necessary:

Figure 9.3.2 represents the case where the first state, with dynamic information X, Y and Z, has already been produced, with appropriate outgoing arcs, and the second state, with dynamic information A, B and C, is being produced and so has no outgoing arcs. The second state is identified as equivalent to the first in all its static information and is merged with the first state, leaving the original outgoing arcs.

Secondly, if the static information in one state is a specialization of another, it can be merged with the other, with the addition of arc labeling setting the variable value. For example, suppose the middle state below represents a situation where the static information is the same as in the left state, except that A and B are used for variables in place of X and Z, and that where the left state has dynamic Y it has the constant n. Then the right state represents the merger:

In the case where the more specialized state had been encountered first and outgoing arcs from it already explored, these would have to be replaced by developing new outgoing arcs for the more general state. So a decision has to be made: either to go ahead with the merger and abandon the previous work as “overspecialization”, or to keep it at the cost of a more complex graph.

A similar problem occurs when two states are found which differ only in one part of their static information. Generalization refers to making dynamic the static information in which they differ. New outgoing arcs for the more general state must be developed.

Deciding when this generalization and merging of states is the preferable step to take is a key issue in partial evaluation. Figure 9.3.4 represents a situation where two states are found to represent identical static states, except that one uses the variable names X and Y, the other A and B, and that at one point in the static information the first has the value m while the second has n. The merger makes this static information difference dynamic, representing it by a new variable K. No outgoing arcs are given for the merged state, because new ones have to be developed to take account of this generalization.

Fig. 9.3.4 Generalization
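A simplified sketch of this merging step is given below (the representation and names are assumptions for illustration; a full treatment would also rename the already-dynamic variables of the two states, so that only genuinely differing static information is generalized). Where the two configurations differ, a fresh variable is introduced, and the bindings that each incoming arc must carry are collected.

from itertools import count

def generalize(a, b, fresh=None, to_a=None, to_b=None):
    # return a configuration covering both a and b, plus the bindings that
    # recover a and b respectively (a simple form of anti-unification)
    if fresh is None:
        fresh, to_a, to_b = count(1), {}, {}
    if a == b:
        return a, to_a, to_b
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        gen = tuple(generalize(x, y, fresh, to_a, to_b)[0] for x, y in zip(a, b))
        return gen, to_a, to_b
    var = "K%d" % next(fresh)              # fresh variable covering the difference
    to_a[var], to_b[var] = a, b
    return var, to_a, to_b

s1 = ("state", "X", "Y", "m")              # cf. Fig. 9.3.4: the two states differ
s2 = ("state", "X", "Y", "n")              # only at one static position (m versus n)
gen, bind1, bind2 = generalize(s1, s2)
assert gen == ("state", "X", "Y", "K1")
assert bind1 == {"K1": "m"} and bind2 == {"K1": "n"}
# the arcs into the merged state carry K1:=m and K1:=n respectively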

Specialization is the opposite process to the above merging of states in the graph. In specialization, if an arc leading into a state assigns a value to a dynamic value in that state, a separate node making that value static is created. Figure 9.3.5 shows the effect of specializing the top state by instantiating its dynamic value V to static n. Arcs to the descendant states of the original state are copied, with the assignment passed down into these arcs:

Fig. 9.3.5 Specialization

This specialization may be repeated by pushing it down to the next level. Specialized versions of the descendant states are created in the same way. The specialization can continue until all specializations are recognized as variants of previous states and ended in loops, in the way described above. Figure 9.3.6 shows the result of pushing specialization down one level.

Fig. 9.3.6 Specialization pushed down one level

A further transformation is possible when specializing in this way. When a state is specialized and the assignment making the specialization is passed to its outgoing arcs, any arc with an assignment setting a variable to one value and a condition requiring it to have another may be removed as contradictory. Figure 9.3.7 represents the specialization of a state with dynamic values W, X, Y and Z. By setting Y to n, the specialized state has no arc leading to the middle descendant state of the original state, as that arc requires Y to have the value m.

Fig. 9.3.7 Specialization evaluating satisfied conditions
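A sketch of this step is given below (the representation, a list of arcs each holding equality conditions, assignments and a destination, is an assumption for illustration; the graphs in this section also use conditions other than simple equalities, which the sketch ignores). The binding that specializes the state is passed down onto its outgoing arcs, as in Figure 9.3.5, and any arc whose condition contradicts the binding is dropped, as in Figure 9.3.7.

def specialize_arcs(arcs, var, value):
    # arcs: list of (conditions, assignments, destination), where conditions and
    # assignments are dicts from variable names to required or assigned values
    specialized = []
    for conditions, assignments, dest in arcs:
        if var in conditions:
            if conditions[var] != value:
                continue                   # contradictory condition: arc removed
            conditions = {v: c for v, c in conditions.items() if v != var}
        assignments = {**assignments, var: value}   # pass the binding down the arc
        specialized.append((conditions, assignments, dest))
    return specialized

arcs = [({"X": "a"}, {"U": "b"}, "s1"),
        ({"Y": "m"}, {"U": "c"}, "s2"),    # requires Y = m
        ({"X": "c"}, {"U": "d"}, "s3")]
print(specialize_arcs(arcs, "Y", "n"))     # the middle arc disappears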

Given a program represented in the graphical form used here, a partial evaluation can be performed by adding a link leading to the state representing the initial state of the program, setting some of the variables in it to static values. The specializations can then be passed down as indicated above, generalizing to form loops where necessary to ensure that the process of specialization terminates. Having done this, any state in the transformed graph that is not reachable from the initial specialized state is removed. This leaves a graph representing the specialized program alone, which can be converted back to standard program form.

Although explained graphically here, the techniques described may be found, under different terminology, in most partial evaluation systems and, indeed, in program transformation systems generally. The combining of nodes may be regarded as a form of folding and the specialization, and hence multiplication, of nodes as a form of unfolding [Burstall and Darlington, 1977]. Good explanations of partial evaluation in terms of specialization and generalization may be found for functional languages in the work of Weise et al. [1991] and for Prolog in the work of Sahlin [1993].