
item and represents the unit derived clause X(i, j). If γ = Y₁ · · · Yₘ and β = Yₖ · · · Yₘ, item I represents the derived clause

X(i, p) ← Yₖ(j, pₖ) ∧ · · · ∧ Yₘ(pₘ₋₁, p).

6.6.8 Limitations of Tabular Parsers

We have just seen how loops and redundant analyses are avoided by using a tabular parser rather than a depth-first backtracking one. However, these advantages are bought at the cost of having to store dotted items explicitly to represent intermediate states of the analysis. For context-free grammars, tabular parsers have an overwhelming advantage because dotted items can be efficiently encoded as triples of a rule number, the position of the dot within the rule, and the item's start position. In contrast, storing a lemma (a derived clause) requires storing the bindings for the variables in the clause or clauses from which the lemma was derived. The Prolog proof procedure avoids these costs by considering only one alternative analysis at a time, but the whole point of tabular parsing and tabular proof procedures is to be able to use lemmas from one alternative proof path in other proof paths. In our example interpreters above, we use the implicit copying provided by assert to store lemmas with their corresponding bindings. More sophisticated schemes are available that may reduce the overhead, but in the worst case tabular parsing for DCGs is asymptotically as bad as top-down backtrack parsing, and substantially worse if one considers constant factors. On the whole, the decision between tabular algorithms and Prolog for DCG parsing can only be made empirically with particular classes of grammars in mind.
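To make the point about assert concrete, the following fragment (ours, with a hypothetical predicate name, not part of the interpreters in the text) shows the kind of lemma storage meant here. Since assert copies its argument, the derived clause is recorded together with whatever variable bindings it carries at the time of the call, and so remains available to other proof paths.

:- dynamic stored_lemma/1.

% store_lemma(Lemma): record a derived clause in the database.
% assert/1 copies the term, so the lemma is stored with its
% current bindings.
store_lemma(Lemma) :-
    assert(stored_lemma(Lemma)).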

Similar observations apply to the question of termination. Even though Earley deduction terminates for a larger class of programs than Prolog, it is easy to construct programs for which Earley deduction loops, such as the following DCG:

P(succ(x)) → P(x) a

P(0) → b

Our Earley deduction procedure applied to the definite-clause representation of this grammar will loop in the predictor for any start symbol matching P(succⁿ(0)), for infinitely many values of n. This is because the subsumption test on derived clauses stops loops in which the clause or a more general one already exists in the table, but this grammar predicts ever more specific instances of the first rule.
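For reference, the definite-clause representation alluded to here looks roughly as follows, with terminal symbols handled by the connects predicate used for DCG translation elsewhere in this chapter; the argument order of connects/3 below, and the variable names, are only illustrative.

p(succ(X), P0, P) :-
    p(X, P0, P1),
    connects(P1, a, P).

p(0, P0, P) :-
    connects(P0, b, P).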

Exercise 6.11 Check that Earley deduction loops on this grammar.

6.7 Problem Section: DCG Interpreters and Compilers

Problem 6.12 Extend the DCG interpreter of Section 6.3.1 to handle the intersection operator as defined in Problem 2.13.

Problem 6.13 Write a compiler for the extended language of the previous problem.

Problem 6.14 The DCG compiler given by the combination of compile, partially_execute, and parse (the DCG interpreter) compiles a grammar by using the partial executor to interpret the DCG interpreter. This process could be made more efficient by partially executing the partial executor with respect to the parse predicate, akin to compiling the DCG interpreter. Perform this compilation to yield a more efficient DCG compiler. What is lost in this process?

FG-DCG Analyses

Topicalization is a construction in English in which a filler constituent is prefixed to a sentence with a gap of the appropriate type. For the purposes of this problem, we will assume that the filler is always an NP. The following sentences exemplify the topicalization construction:

This book, Bertrand gave to Gottlob.

The professor that wrote this book, Alfred met.

The English left dislocation construction is a similar construction, except that instead of the empty string in the position associated with the filler, there is a pronoun (called a resumptive pronoun) filling that position, e.g.,

This book, Bertrand gave it to Gottlob.

The professor that wrote this book, Alfred met her.

Problem 6.15 Add FG-DCG rules to Program 6.5 to handle the English topicalization and left dislocation construction. Be careful to avoid the ungrammatical

*Bill read the book that Alfred wrote it.

Extending Earley Deduction with Restriction

In Section 6.6.8, we mentioned that even with the advantages of tabular parsing in being able to parse left-recursive and other grammars not parsable by other means, there are still problematic grammars for the methods outlined. The subsumption test for stopping prediction loops requires that eventually a new entry will be no more specific than an existing one. But the sample grammar given in that section predicts rules with ever larger terms. One method for solving this problem is to limit the amount of structure that can be passed in the prediction process, using a technique called restriction.

When a literal G is to be predicted, we look for rules that might be useful in resolving against G. But instead of performing this test by unifying G itself with the head of the rule, we first restrict G to G′ by eliminating all but a finite amount of structure from G. The restricted version G′ is then matched against possible rules. Since the amount of information in G′ can be bounded, the nontermination problem disappears for the problematic cases discussed in Section 6.6.8.

There are many possible ways of restricting a term to only a finite amount of structure. We might replace all subterms below a given depth (say 2) by variables. Then the term f(g(h(a), s(b)), c) would be restricted to f(g(X, Y), c). Another alternative is to define restriction templates that eliminate certain information. For instance, the unit clause

restrict(f(g(A,B),c), f(g(X,Y),c)).

can be used to state the relationship between the sample term (and terms like it) and the restricted form.
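As an illustration of the first method (replacing subterms below a given depth by variables), here is a minimal sketch; the predicate names restrict_depth and restrict_args are ours, not from the text.

% restrict_depth(Depth, Term, Restricted)
%   Restricted <== a copy of Term in which every subterm occurring
%   below the given Depth has been replaced by a fresh variable.
restrict_depth(_Depth, Term, Restricted) :-
    var(Term), !,
    Restricted = Term.
restrict_depth(0, _Term, _Restricted) :- !.    % leave a fresh variable here
restrict_depth(Depth, Term, Restricted) :-
    Term =.. [Functor|Args],
    Depth1 is Depth - 1,
    restrict_args(Depth1, Args, RestrictedArgs),
    Restricted =.. [Functor|RestrictedArgs].

restrict_args(_Depth, [], []).
restrict_args(Depth, [Arg|Args], [RArg|RArgs]) :-
    restrict_depth(Depth, Arg, RArg),
    restrict_args(Depth, Args, RArgs).

With this definition, the goal restrict_depth(2, f(g(h(a), s(b)), c), R) binds R to f(g(X, Y), c), with X and Y fresh variables, as in the example above.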

Problem 6.16 Modify the Earley deduction program to perform restriction before predicting, using either of the methods of restricting terms. Test it on the problematic grammar of Section 6.6.8 to demonstrate that the algorithm now terminates.

6.8 Bibliographic Notes

Interpreters for Prolog-like languages in Prolog have been used since the early days of Prolog as a means to explore different execution regimes for definite clauses and for trying out extensions to the language (Gallaire and Lasserre, 1982; Porto, 1982; L. M. Pereira, 1982; Mukai, 1985). One of the advantages of this approach is that it is not necessary to construct an interpreter for all the features of the new language because the aspects not being investigated are just absorbed by the underlying Prolog implementation.

The technique we suggest for a Prolog-in-Prolog with cut (Section 6.2.2) seems to have been first used in a version of the interpreter in the DEC-10 Prolog system due to David H. D. Warren, although it might have been in the folklore before then. A version is given by O’Keefe (1985).

Consecutively bounded depth-first search (Section 6.2.1) has been described and analysed by Stickel and Tyson (1985) and, under the name “depth-first iterative deepening,” by Korf (1985).

Compilation by partial execution (Section 6.4) has been discussed in a logic programming context by Kahn (1982) and by Takeuchi and Furukawa (1985). However, much of what is done in this area by logic programming researchers is still unpublished, so our particular approach to the problem is to a great extent independently derived.

Left-corner parsers for context-free languages were discussed in a form close to the one used here (Section 6.5) by Rosenkrantz and Lewis (1970), although the basic idea seems to be earlier. The subject is also extensively covered in the exercise sections of Aho and Ullman's textbook (1972). Rosenkrantz and Lewis introduce an algorithm that transforms a context-free grammar to an equivalent one in which nonterminals are pairs of nonterminals of the original grammar. Left-corner derivations for the initial grammar correspond to top-down derivations for the new grammar. The BUP parser for definite-clause grammars (Matsumoto et al., 1983) uses a similar technique, except that the nonterminal pairs are instantiated at run time rather than at grammar compilation time. The link relation (Section 6.5.1) gives a finite approximation of the in general infinite set of DCG nonterminals that would be the result of applying the Rosenkrantz and Lewis process to a DCG. Pratt (1985) developed a tabular parser based on similar notions.

Tabular parsers for context-free languages are the result of the application of “divide-and-conquer”, dynamic-programming methods to the context-free parsing problem to avoid the exponential costs of backtracking (Aho and Ullman, 1972). The Cocke-Kasami-Younger (CKY) algorithm is the first of these, but it does not use any top-down predictions so it will generate many useless subphrases. Earley's algorithm (1970; Aho and Ullman, 1972) uses left-context to its full extent so that any recognized subphrase is guaranteed to fit into an analysis of a sentence having as a prefix all the input symbols seen so far. The algorithm of Graham, Harrison, and Ruzzo (1980; Harrison, 1978) combines a generalization of the CKY algorithm with preconstructed top-down prediction tables to achieve the best practical performance known so far for a general context-free parsing algorithm.

The Earley deduction proof procedure is due to D. H. D. Warren (1975), but the first published discussion of the procedure and its application in natural-language parsing is given by Pereira and Warren (1983). The trade-offs between termination and detail of top-down prediction are discussed by Shieber (1985c) for a class of formalisms with similar properties to definite-clause grammars. A further difficulty with the extended Earley's algorithm is the cost of maintaining rule instantiations, which does not occur in the original algorithm because grammar symbols are atomic. Boyer and Moore invented an instantiation-sharing method for clausal theorem provers (1972). The special constraints of parsing allow some further optimizations for their method (Pereira, 1985).

The idea of parsing from the heads of phrases outwards has often attracted attention, even though its computational merits are still to be proven. Instances of this idea are McCord's slot grammars (1980) and head-driven phrase-structure grammar (Sag and Pollard, 1986), and the use of a head-selection rule for DCGs (Pereira and Warren, 1983).

Topicalization and left dislocation are discussed by Ross (1967).


Appendix A

Listing of Sample Programs


This appendix includes commented listings of the talk program developed in Chapter 5 and the DCG compiler of Chapter 6. Besides combining all of the bits of code that were distributed throughout that and other chapters, this listing provides an example of one commenting style for Prolog.

A.1 A Note on Programming Style

We have adopted the following stylistic conventions in the programs in this appendix and elsewhere in the book. Although these particular conventions are not sacrosanct, adherence to some set of uniform conventions in Prolog programming (and, indeed, for programming in any language) is desirable.

We attempted to use variable names that are long enough to provide some mnemonic power. Predicate names were chosen to be as “declarative” in tone as possible (without sacrificing appropriateness). Thus, we used the name conc (for “concatenation”) rather than the more common, procedural term append. Of course, certain predicates which rely on side-effects are more appropriately named with procedural terms such as read_word or print_reply.

Conventionally, the words in multi-word variable names are demarcated by capitalization of the first letter, e.g., VariableName. Multiple words in functor symbols, on the other hand, are separated with underbar, e.g., multiple_word. These conventions are relatively widespread in Prolog culture.

As mentioned in Section 3.4, we use the Prolog notational convention of giving a name beginning with an underbar to variables whose role is not to pass a value but merely to be a place holder. Anonymous variables (notated with a single underbar) are used for place-holder variables for those rare occasions in which naming the variable would detract from program readability. Such occasions occurred only in two areas: in specifying the tables for lexical entries and in listing generic forms for auxiliary literals.

Despite statements to the contrary, no programming language is self-documenting. Since the sample programs presented in the text have been surrounded by a discussion of their operation, no comments were interspersed. However, the commenting of programs is an important part of programming.

The commenting style used here includes an introductory description of each predicate defined, including a description of its arguments. The normal mode of execution of the predicate is conveyed by the direction of arrows (==> or <==) for each argument. In addition, the individual clauses are commented when appropriate.

It is usually preferable to place comments pertaining to a particular literal on the same line as the literal, as is done, for instance, in the commented version of main_loop below. Unfortunately, page width limitations necessitated interleaving these comments in many cases.
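By way of illustration only (this clause is not part of the listings that follow), the conventions just described might combine as in the following version of the conc predicate mentioned earlier:

%%% conc(Front, Back, Whole)
%%% ========================
%%%
%%%     Front ==> a list
%%%     Back  ==> a list
%%%     Whole <== the concatenation of Front and Back

conc([], List, List).
conc([Element|Front], Back, [Element|Whole]) :-
    conc(Front, Back, Whole).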

In general, cuts, asserts, and similar metalogical operations are suspect in Prolog code. In the programs that follow, cuts are used only to encode conditionals. The conditional construct, though typically preferred, was not used in several cases because it was deemed less readable than the version using the cut. Asserts in these programs are not used as part of the program’s control strategy, but rather, as the output of meta-programs.
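For instance (an illustration of ours, not drawn from the listings), a cut used only to encode a two-way conditional looks like this:

% max_of(X, Y, Max):  Max <== the larger of the numbers X and Y.
% The cut merely commits to the first clause once its test has
% succeeded, encoding an if-then-else rather than pruning genuine
% alternatives.
max_of(X, Y, X) :- X >= Y, !.
max_of(_X, Y, Y).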
